Sr. Hadoop Developer Resume
Atlanta, GA
SUMMARY:
- 5 years of professional experience in the IT industry, including 3 years implementing and maintaining the Hadoop ecosystem and performing Big Data analysis.
- Excellent understanding of Hadoop architecture and underlying framework including storage management.
- Excellent understanding of Hadoop 1.x (Gen 1) and Hadoop 2.x (Gen 2/YARN) concepts.
- Experience using various Hadoop ecosystem components such as MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, and Oozie for data storage and analysis.
- Experience with the Oozie scheduler, setting up workflows that combine MapReduce and Pig jobs.
- Knowledge of the architecture and functionality of NoSQL databases such as HBase and Cassandra.
- Experience troubleshooting errors in HBase Shell/API, Pig, Hive, and MapReduce.
- Experience importing and exporting data between HDFS and relational database management systems using Sqoop.
- Knowledge of AutoSys as a scheduler for running jobs on the Hadoop cluster.
- Knowledge of streaming data to HDFS using Kafka.
- Experience working with Hadoop clusters using the Cloudera (CDH 5.9) distribution.
- Hands-on experience using the MapReduce programming model for batch processing of data stored in HDFS.
- Expertise in writing ETL jobs for analyzing data using Pig.
- Experience with NoSQL column-oriented databases such as HBase and their integration with Hadoop clusters.
- Experience with build and release tools for deployment to HDFS.
- Oracle PL/SQL development experience across the full application life cycle: system study, requirements gathering, analysis, design, development, testing, troubleshooting, and implementation.
- Excellent time management skills, with a proven ability to work quickly and accurately and to prioritize, coordinate, and consolidate tasks while managing a diverse range of functions.
- Diligent team builder with the ability to objectively evaluate results; excellent written, oral, and interpersonal communication skills.
- Strong logical and analytical skills, leading to problem resolution within timelines.
- Committed to continuous improvement; a quick learner and an excellent team player.
TECHNICAL SKILLS:
Languages: Pig Latin, HiveQL, NoSQL, shell scripting, MapReduce (Core Java), Oracle PL/SQL
Big Data Tools: Sqoop, Kafka, ZooKeeper, Apache Oozie, HDFS, Hue, Spark and its components
Tools: Pig, Hive query tool, AutoSys, Jira
Frameworks: Hadoop HDFS (Apache, Cloudera)
Databases: Oracle 10g, Hive, HBase, and Cassandra
Operating Systems: Windows and Unix
Build Tools: Maven
CI Tool: Jenkins
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, GA
Sr. Hadoop Developer
Responsibilities:
- Developed and executed custom Hive queries. Used Hadoop scripts for HDFS (Hadoop Distributed File System) data loading and manipulation.
- Performed Hive test queries on local sample files and HDFS files.
- Wrote Hive queries for data analysis to meet business requirements. Created Hive tables and views and worked on them using HiveQL.
- Created Oozie workflows and super-workflows to automate the data loading process (a submission sketch appears after the Environment line below).
- Used AutoSys extensively as a scheduler to run Hive jobs for data loading and data manipulation.
- Assisted in loading large sets of data (structured, semi-structured, and unstructured) to HDFS using Sqoop.
- Developed the application in Eclipse.
- Ingested data from various RDBMS sources into HDFS using tools such as Sqoop (see the sketch after this list).
- Extensively used Pig for data cleaning and optimization. Developed Hive queries to analyze data and generate results.
- Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
- Analyzed user request patterns and implemented various performance optimizations, including partitioning and bucketing in Hive.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Worked with data from various regions, with a complete understanding of PII fields and how to mask them.
- Wrote MapReduce jobs to generate reports on the number of activities created on a particular day during a dump from multiple sources; the output was written back to HDFS. Reviewed HDFS usage and system design for future scalability and fault tolerance.
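A minimal sketch of the Sqoop ingestion and partitioned, bucketed Hive loading described above; the connection string, credentials, table names, and columns are placeholders, not actual project values.

# Pull a table from the source RDBMS into HDFS (hypothetical names).
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table ORDERS \
  --target-dir /data/raw/orders \
  --fields-terminated-by '\t' \
  -m 4

# Load the raw files into a partitioned, bucketed Hive table.
hive -e "
  CREATE TABLE IF NOT EXISTS orders (
    order_id BIGINT, customer_id BIGINT, amount DOUBLE)
  PARTITIONED BY (order_date STRING)
  CLUSTERED BY (customer_id) INTO 32 BUCKETS
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
  LOAD DATA INPATH '/data/raw/orders'
  INTO TABLE orders PARTITION (order_date = '2016-01-01');
"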
Environment: Linux, Apache Hadoop framework, HDFS, MapReduce v1, Hue, YARN (MRv2), Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, Flume, AutoSys, Java.
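A minimal sketch of submitting one of the Oozie data-loading workflows mentioned above; the Oozie URL, host names, and HDFS paths are assumptions, not actual cluster values.

# Hypothetical job properties for the workflow.
cat > job.properties <<'EOF'
nameNode=hdfs://nn-host:8020
jobTracker=rm-host:8032
oozie.wf.application.path=${nameNode}/user/etl/workflows/load_orders
EOF

# Submit and start the workflow, then check its status by job id.
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
oozie job -oozie http://oozie-host:11000/oozie -info <job-id>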
Confidential, Brookfield, WI
Hadoop Developer
Responsibilities:
- Developed and executed custom MapReduce programs, Pig Latin scripts, and HiveQL queries.
- Used Hadoop scripts for HDFS (Hadoop Distributed File System) data loading and manipulation. Performed Hive test queries on local sample files and HDFS files.
- Wrote Hive queries for data analysis to meet business requirements. Created Hive tables and worked on them using HiveQL.
- Assisted in loading large sets of data (structured, semi-structured, and unstructured) to HDFS.
- Developed the application in Eclipse.
- Extensively used Pig for data cleaning and optimization. Developed Hive queries to analyze data and generate results.
- Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
- Worked in an onsite-offshore environment.
- Responsible for technical design based on the data dictionary (business requirements). Responsible for providing technical solutions and workarounds.
- Migrated the required data from the data warehouse and product processors into HDFS using Sqoop, and imported flat files of various formats into HDFS.
- Used Spark Streaming to bring all credit card transactions into the Hadoop environment (a submission sketch appears after the Environment line below).
- Implemented partitioning, dynamic partitions, indexing, and bucketing in Hive (see the sketch after this list).
- Analyzed user request patterns and implemented various performance optimizations, including partitioning and bucketing in Hive.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Wrote MapReduce jobs to generate reports on the number of activities created on a particular day during a dump from multiple sources; the output was written back to HDFS. Reviewed HDFS usage and system design for future scalability and fault tolerance.
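A minimal sketch of the dynamic-partition loading mentioned above; table and column names are placeholders.

# Enable dynamic partitions and load staged transactions, letting
# Hive derive the txn_date partition from the data (hypothetical names).
hive -e "
  SET hive.exec.dynamic.partition = true;
  SET hive.exec.dynamic.partition.mode = nonstrict;
  INSERT OVERWRITE TABLE txns PARTITION (txn_date)
  SELECT card_id, merchant_id, amount, txn_date
  FROM txns_staging;
"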
Environment: Java, Linux, Apache Hadoop framework, HDFS, MapReduce v1, YARN (MRv2), Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, Flume.
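A minimal sketch of submitting the Spark Streaming ingestion job mentioned above; the class name, jar, resource sizes, and output path are assumptions.

# Launch the streaming job on YARN (hypothetical class and jar).
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.CardTxnStreamingJob \
  --num-executors 4 \
  --executor-memory 2g \
  card-txn-streaming.jar \
  /data/streaming/txns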
Confidential, San Ramon, CA
Hadoop Developer
Responsibilities:
- Coordinated with business customers to gather business requirements. Imported and exported data between HDFS and databases using Sqoop.
- Responsible for managing data coming from different sources.
- Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including Pig, the HBase database, and Sqoop.
- Loaded and transformed large sets of structured and semi-structured data.
- Collected and aggregated large amounts of log data using Apache Flume, staging the data in HDFS for further analysis (see the Flume sketch after this list).
- Analyzed data using Hadoop components Hive and Pig.
- Involved in running Hadoop streaming jobs to process terabytes of data. Gained experience in managing and reviewing Hadoop log files.
- Involved in writing Hive queries for data analysis to meet the business requirements.
- Worked on exporting the analyzed data to the existing relational databases using Sqoop, making it available for visualization and report generation by the BI team (an export sketch appears after the Environment line below).
- Involved in creating workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Developed Pig Latin scripts for the analysis of semi-structured data. Imported data from MySQL to HDFS on a regular basis using Sqoop.
- Worked hands-on with the ETL process.
- Prepared project maintenance, test summary, test result, test case, and go-live plan documents for the project release.
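A minimal sketch of the Flume log aggregation described above; the agent name, log path, and HDFS staging directory are placeholders.

# Hypothetical Flume agent: tail an application log into HDFS.
cat > log-agent.conf <<'EOF'
agent.sources = tail-src
agent.channels = mem-ch
agent.sinks = hdfs-sink

agent.sources.tail-src.type = exec
agent.sources.tail-src.command = tail -F /var/log/app/app.log
agent.sources.tail-src.channels = mem-ch

agent.channels.mem-ch.type = memory
agent.channels.mem-ch.capacity = 10000

agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = mem-ch
agent.sinks.hdfs-sink.hdfs.path = /data/staging/logs/%Y-%m-%d
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
EOF

flume-ng agent --name agent --conf-file log-agent.conf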
Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Java, Eclipse.
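A minimal sketch of the Sqoop export back to a relational database mentioned above; the connection string, credentials, and table names are assumptions.

# Push analyzed results from HDFS to MySQL for the BI team
# (hypothetical database and directory names).
sqoop export \
  --connect jdbc:mysql://dbhost:3306/reports \
  --username bi_user -P \
  --table daily_summary \
  --export-dir /data/output/daily_summary \
  --input-fields-terminated-by '\t' \
  -m 4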
Confidential
Pl/Sql Developer
Responsibilities:
- Understood the existing functionality of the Manavata.org system.
- Analyzed enhancement requirements and change requests related to the social service, online blood donor list collection, and blood seeker modules.
- Design & coding using Oracle PL/SQL and Java
- Testing & Debugging
- Documentation