Sr. Hadoop/Spark Developer Resume
Kaiser Permanente, CA
SUMMARY:
- 10.5 years of experience in the IT industry, including 4+ years in Big Data analytics and development.
- Good experience with Big Data technologies such as the Hadoop framework, Spark Core, Spark Streaming, Hive, Sqoop, Kafka, Flume, and Oozie
- Excellent knowledge of the Hadoop ecosystem, including HDFS, YARN, NameNode, DataNode, utility and edge/gateway nodes, and the MapReduce programming paradigm
- Experience writing queries to move data from HDFS to Hive and analyzing data using HiveQL
- Excellent knowledge of partitioning, bucketing, join optimization, and query optimization concepts in Hive
- Experience importing and exporting data with Sqoop between relational databases (RDBMS), Hive, and HDFS
- Experience optimizing and tuning Hive, Spark, and MapReduce jobs to meet performance requirements
- Experience with Spark Streaming on live data streams, using Flume and Kafka to ingest data into Spark Streaming
- Experience with Spark Core, Spark SQL, Spark Streaming, DataFrames, RDDs, and Scala for Spark
- Experience using streams, accumulator variables, broadcast variables, and RDD caching in Spark Streaming (see the sketch after this summary)
- Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase
- Experience with the Oozie workflow scheduler to manage Hadoop jobs
- Familiarity with Hadoop architecture, data ingestion pipeline design, data mining and modeling, and machine learning
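A minimal Scala sketch, with hypothetical data and names, of the broadcast-variable, accumulator, and RDD-caching pattern noted above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastCacheSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("broadcast-cache-sketch").setMaster("local[*]"))

    // Hypothetical lookup table, broadcast once so every executor reuses a read-only copy
    val countryNames = sc.broadcast(Map("US" -> "United States", "IN" -> "India"))

    // Accumulator counting records with an unknown country code
    // (updates inside transformations are best-effort, which is fine for monitoring)
    val unknownCodes = sc.longAccumulator("unknownCountryCodes")

    // Hypothetical event data: (countryCode, amount); cached because it is reused twice below
    val events = sc.parallelize(Seq(("US", 10.0), ("IN", 4.5), ("BR", 7.2))).cache()

    val revenueByCountry = events.map { case (code, amount) =>
      val name = countryNames.value.getOrElse(code, { unknownCodes.add(1); "Unknown" })
      (name, amount)
    }.reduceByKey(_ + _)

    revenueByCountry.collect().foreach(println)
    println(s"records with unknown country codes: ${unknownCodes.value}")
    println(s"total events (second pass over the cached RDD): ${events.count()}")

    sc.stop()
  }
}
```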
TECHNICAL SKILLS:
Big Data/Hadoop: HDFS, YARN, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Kafka, Spark, Spark Streaming, Oozie, HBase
Languages: Scala, Python, Java, SQL, PL/SQL, HiveQL, Pig Latin, Shell Scripting
Databases: MySQL, Oracle
BI Tools: Tableau
Development Tools: Eclipse, PyCharm, Toad, SQL Developer
PROFESSIONAL EXPERIENCE:
Confidential, CA
Sr. Hadoop/Spark Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop
- Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build a common learner data model that consumes data from Kafka in near real time and persists it into HBase (a sketch of this flow follows this list).
- Involved in performance tuning and porting the system from Hive to Spark
- Developed Scala scripts using both DataFrames/Spark SQL and RDDs for data aggregation and queries, writing data back to the OLTP system through Sqoop.
- Tuned Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
- Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark.
- Handled large datasets using partitioning, Spark's in-memory capabilities, broadcasts, and effective, efficient joins and transformations during the ingestion process itself.
- Scheduled jobs using Oozie workflows.
- Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment, working with both traditional and non-traditional source systems and with RDBMS and NoSQL data stores for data access and analysis
- Worked extensively with Sqoop for importing/exporting data from/to Oracle
- Involved in creating Hive tables and loading and analyzing data using Hive queries
- Performance tuning of Hive queries and jobs.
- Implemented partitioning, dynamic partitions, and bucketing in Hive
- Developed Hive queries to process the data and generate data cubes for visualization
- Implemented schema extraction for Avro file formats in Hive.
- Used Talend Open Studio to design ETL jobs for data processing
- Used reporting tools such as Tableau, connected to Hive, to generate daily reports of the data
- Collaborated with the infrastructure, network, database, application and BI teams to ensure the data quality and availability
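A condensed Scala sketch of the Kafka-to-Spark-Streaming-to-HBase flow described earlier in this list; it is illustrative only, and the broker address, topic, table, and column family names are hypothetical:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaToHBaseSketch {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("learner-stream"), Seconds(10)) // 10s batch interval

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "learner-model",
      "auto.offset.reset"  -> "latest"
    )

    // Direct stream from a hypothetical "learner-events" topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("learner-events"), kafkaParams))

    // On-the-fly transformation: keep non-empty (key, value) pairs to persist
    val records = stream.map(r => (r.key, r.value))
      .filter { case (k, v) => k != null && v != null }

    // Persist each micro-batch into HBase, one connection per partition
    records.foreachRDD { rdd =>
      rdd.foreachPartition { part =>
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("learner_data_model"))
        part.foreach { case (key, value) =>
          val put = new Put(Bytes.toBytes(key))
          put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("payload"), Bytes.toBytes(value))
          table.put(put)
        }
        table.close()
        conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```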
Hadoop Developer
Responsibilities:
- Prepared technical design documents based on business requirements and prepared data flow diagrams.
- Integrated Hadoop with Oracle to load and then cleanse raw structured data in the Hadoop ecosystem, making it suitable for processing in Oracle using stored procedures and functions.
- Used Sqoop to import data into HDFS and to export data from HDFS to the Oracle database
- Developed Oozie workflows for daily incremental loads that pull data from Oracle and import it into Hive tables.
- Implemented Data Integrity and Data Quality checks in Hadoop using Hive and Linux scripts
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed Hive UDFs for functionality not available out of the box in Apache Hive.
- Optimized Hive tables using techniques such as partitioning, dynamic partitions, and bucketing to improve HiveQL query performance (see the sketch after this list).
- Developed Pig scripts for ETL-style operations on captured data and for delta processing between newly arrived data and data already in HDFS.
- Used Hive to perform transformations, event joins, bot-traffic filtering, and some pre-aggregations before storing the data in HDFS.
- Created HBase tables to load large sets of semi-structured data coming from various sources.
- Executed Hive queries on Parquet tables to perform data analysis that met the business requirements.
- Supported and troubleshot Hive programs running on the cluster and fixed issues arising from duration testing
- Handled data manipulation using Python scripts.
- Worked extensively on performance optimization, arriving at appropriate design patterns for MapReduce and Hive jobs by analyzing I/O latency, map time, combiner time, reduce time, etc.
- Actively involved in code review and bug fixing for improving the performance.
- Provided daily production support to monitor and troubleshoot Hadoop/Hive jobs
- Created HBase tables to store various data formats of incoming data from different portfolios.
- Troubleshooting: used Hadoop logs to debug job execution
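A minimal Scala/Spark SQL sketch (assuming Hive metastore access; table and column names are hypothetical) of the partitioned-table and dynamic-partition load pattern referenced in this list; a bucketed table would additionally declare a CLUSTERED BY ... INTO n BUCKETS clause in the DDL:

```scala
import org.apache.spark.sql.SparkSession

object HivePartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-partition-sketch")
      .enableHiveSupport() // requires a Hive metastore on the cluster
      .getOrCreate()

    // Allow non-strict dynamic partitioning for the insert below
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Hypothetical daily-partitioned ORC table
    spark.sql("""
      CREATE TABLE IF NOT EXISTS sales_part (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
      )
      PARTITIONED BY (load_date STRING)
      STORED AS ORC
    """)

    // Dynamic-partition insert from a hypothetical staging table
    spark.sql("""
      INSERT OVERWRITE TABLE sales_part PARTITION (load_date)
      SELECT order_id, customer_id, amount, load_date FROM sales_staging
    """)

    // Reporting metric computed per partition
    spark.sql("""
      SELECT load_date, COUNT(*) AS orders, SUM(amount) AS revenue
      FROM sales_part
      GROUP BY load_date
    """).show()

    spark.stop()
  }
}
```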
Sr. Database Engineer
Responsibilities:
- Coordinated the offshore team
- Analyzed business requirements
- Prepared technical design documents based on business requirements.
- Implemented new designs per technical specifications.
- Wrote advanced PL/SQL ETL scripts for a one-time migration of multiple billions of records, i.e., wrote complex SQL queries and PL/SQL procedures to extract data from various source tables and created database objects such as tables, indexes, views, sequences, synonyms, partitions, global temporary tables, and external tables
- Played a key role in optimizing the migration scripts, which included creating indexes and providing hints using DBMS_STATS, EXPLAIN PLAN, trace, and the TKPROF utility so that the data migration tasks could complete within the deployment window
- Wrote audit scripts for auditing the migrated data.
- Development: created stored procedures, functions, packages, database triggers, constraints, indexes, grants, and sequences based on business requirements
- Developed back-end interfaces using PL/SQL packages, stored procedures, functions, collections, object types, and Oracle queues.
- Created PL/SQL scripts to extract data from the operational database into simple flat text files using the UTL_FILE package
Sr. Database Engineer
Responsibilities:
- Analyzed requirements
- Prepared technical design documents
- Development: wrote procedures, functions, packages, and triggers using SQL and PL/SQL
- Performance tuning, including creating indexes and providing hints using EXPLAIN PLAN, trace, and the TKPROF utility
- Front-end development using Oracle Forms 10g and Reports 10g
- Interacted with the client to gather change requests and analyzed the effect of new changes.
- Performed unit testing (UT) and system integration testing (SIT)
- Analyzed and fixed defects at various stages of the testing cycle.
- Provided production support for the deployed project until it stabilized
- Received client appreciation for creating a multilevel approval workflow applied across the overall application.
Oracle Developer
Responsibilities:
- Analyzed requirements
- Prepared technical design documents
- Development: wrote procedures, functions, packages, and triggers using SQL and PL/SQL
- Performance tuning, including creating indexes and providing hints using EXPLAIN PLAN, trace, and the TKPROF utility
- Developed many new Oracle Forms and Reports and enhanced and customized existing ones