Hadoop Developer Resume
NY
PROFESSIONAL SUMMARY:
- More than 5 years of experience in the IT industry.
- Around 2.5 years of experience in Big Data/Hadoop technologies.
- Hands-on experience with Spark (RDDs, DataFrames, Spark SQL) and with Hadoop ecosystem components including Hive, HBase, Pig, and Sqoop.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Imported and exported data between Cassandra, AWS, and HDFS using the Spark API, and used Spark SQL to analyze the data.
- Worked with AWS services such as S3, Redshift, and EMR.
- Experienced in managing the Hadoop infrastructure with Cloudera Manager.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries and Pig scripts to analyze large datasets.
- Hands-on design and development of an application using UDFs in Hive (a minimal sketch follows this list).
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Clear understanding of MapReduce, YARN, and Spark.
- Good understanding of messaging systems such as Kafka and of the Dataset API in Spark.
- Working knowledge of Spark Streaming.
- Ability to understand and capture technical as well as business requirements.
- Excellent interpersonal and communication skills; research-minded and technically competent, with problem-solving and leadership skills.
- Quick learner, easily adaptable to new environments, with good presentation skills and the ability to work well both in a team and individually.
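A minimal sketch of the kind of Hive UDF referenced above, written in Scala; the class, function, and column names are hypothetical, and in practice the UDF would be packaged into a JAR and registered in Hive.

    // Hypothetical Hive UDF sketch: masks all but the last four characters of a value.
    // Assumes hive-exec and hadoop-common on the classpath; names are illustrative only.
    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    class MaskUdf extends UDF {
      // Hive resolves evaluate() by reflection on the argument types.
      def evaluate(input: Text): Text = {
        if (input == null) return null
        val s = input.toString
        val masked = if (s.length <= 4) s else ("*" * (s.length - 4)) + s.takeRight(4)
        new Text(masked)
      }
    }

    // In Hive, after building a JAR (paths and names hypothetical):
    //   ADD JAR /path/to/udfs.jar;
    //   CREATE TEMPORARY FUNCTION mask_value AS 'MaskUdf';
    //   SELECT mask_value(account_number) FROM accounts;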
TECHNICAL SKILLS:
HADOOP/BIG DATA: Spark, HDFS, Hive, Pig, HBase, Sqoop, MapReduce, Oozie, Kafka
AWS: S3, RDS, EC2, EMR, Redshift
PROGRAMMING LANGUAGES: Java, Scala
RDBMS: Oracle, MySQL
NOSQL DATABASES: HBase, Cassandra
OPERATING SYSTEMS: Windows, Linux, Ubuntu
IDE: IntelliJ IDEA, Eclipse
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Developer
Responsibilities:
- Worked on Sqoop jobs for ingesting data from MySQL to HDFS and created Hive external tables for querying the data.
- Used the Spark DataFrame API to ingest Oracle data into S3, loaded it into Redshift, and wrote a script to move RDBMS data into Redshift (a minimal sketch follows this list).
- Hands-on experience creating RDDs and applying transformations and actions while implementing Spark applications.
- Developed Scala scripts and UDFs using both DataFrames and RDDs in Spark 1.6 for data aggregation and queries, and for writing data back into the OLTP system through Sqoop.
- Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and RDDs.
- Experienced in handling large datasets during the ingestion process itself, using partitioning, Spark's in-memory capabilities, broadcast variables, and effective and efficient joins and transformations.
- Involved in loading data into the Cassandra NoSQL database.
- Processed complex/nested JSON and CSV data using the DataFrame API (a minimal sketch follows this list).
- Automatically scaled up the EMR instances based on the data, and scheduled and executed Spark scripts in EMR pipelines.
- Validated the source and final output data, and tested the data using the Dataset API instead of RDDs.
- Involved in weekly Scrums and monthly Sprints.
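A minimal Scala sketch of the Oracle-to-S3 ingestion described above, assuming Spark 1.6 with the DataFrame API, an Oracle JDBC driver on the classpath, and the S3A connector configured; the connection details, table name, and bucket path are hypothetical placeholders, and the subsequent Redshift load is not shown.

    // Hypothetical sketch: ingest an Oracle table with the Spark 1.6 DataFrame API
    // and land it on S3 as Parquet.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object OracleToS3 {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("OracleToS3"))
        val sqlContext = new SQLContext(sc)

        // JDBC read; url, dbtable, and credentials are illustrative placeholders.
        val orders = sqlContext.read.format("jdbc").options(Map(
          "url"      -> "jdbc:oracle:thin:@//db-host:1521/ORCL",
          "dbtable"  -> "SALES.ORDERS",
          "user"     -> "etl_user",
          "password" -> "etl_password",
          "driver"   -> "oracle.jdbc.OracleDriver"
        )).load()

        // Write to S3 as Parquet, partitioned by a date column (assumed to exist).
        orders.write
          .mode("overwrite")
          .partitionBy("ORDER_DATE")
          .parquet("s3a://my-bucket/landing/orders/")

        sc.stop()
      }
    }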
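A minimal Scala sketch of processing nested JSON with the DataFrame API and loading the result into Cassandra, assuming Spark 1.6 and the DataStax spark-cassandra-connector; the JSON layout, paths, keyspace, and table names are hypothetical.

    // Hypothetical sketch: flatten nested JSON with the DataFrame API and save to Cassandra.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.explode

    object JsonToCassandra {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("JsonToCassandra")
          .set("spark.cassandra.connection.host", "cassandra-host") // placeholder host
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)

        // Assumed JSON shape: {"id": 1, "customer": {"name": "..."}, "items": [{"sku": "...", "qty": 2}]}
        val raw = sqlContext.read.json("hdfs:///data/raw/orders/*.json")

        // Flatten the nested struct and explode the items array into one row per item.
        val flat = raw
          .select(raw("id"), raw("customer.name").as("customer_name"), explode(raw("items")).as("item"))
          .select("id", "customer_name", "item.sku", "item.qty")

        // Write to a Cassandra table via the DataStax connector (keyspace/table are placeholders).
        flat.write
          .format("org.apache.spark.sql.cassandra")
          .options(Map("keyspace" -> "sales", "table" -> "order_items"))
          .mode("append")
          .save()

        sc.stop()
      }
    }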
Confidential, NY
Hadoop Developer
Responsibilities:
- Worked on Sqoop jobs for ingesting data from MySQL to HDFS.
- Created Hive external tables for querying the data.
- Participated in the development and implementation of the Cloudera Hadoop environment.
- Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
- Implemented partitioning, dynamic partitions, and bucketing in Hive (a minimal sketch follows this list).
- Validated the source and final output data, and debugged and tested the process as per the business requirements.
- Loaded the aggregated data from the Hadoop environment into Oracle using Sqoop for reporting.
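A minimal sketch of the Hive external-table and dynamic-partitioning work described above. The statements are shown here driven from a Spark HiveContext in Scala so the example is self-contained; in practice such HiveQL would run as Hive jobs scheduled through Oozie, and all table, column, and path names are hypothetical.

    // Hypothetical sketch: external table over Sqoop-landed data plus a dynamic-partition insert.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HivePartitioning {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("HivePartitioning"))
        val hiveContext = new HiveContext(sc)

        // External table pointing at the HDFS directory where Sqoop lands the MySQL data.
        hiveContext.sql(
          """CREATE EXTERNAL TABLE IF NOT EXISTS staging_orders (
            |  order_id BIGINT, customer_id BIGINT, amount DOUBLE, order_date STRING)
            |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
            |LOCATION 'hdfs:///data/sqoop/orders'""".stripMargin)

        // Managed table partitioned by date, populated with dynamic partitioning.
        hiveContext.sql(
          """CREATE TABLE IF NOT EXISTS orders_part (
            |  order_id BIGINT, customer_id BIGINT, amount DOUBLE)
            |PARTITIONED BY (order_date STRING)""".stripMargin)

        hiveContext.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        hiveContext.sql(
          """INSERT OVERWRITE TABLE orders_part PARTITION (order_date)
            |SELECT order_id, customer_id, amount, order_date FROM staging_orders""".stripMargin)

        sc.stop()
      }
    }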
Confidential
Software Developer
Responsibilities:
- Gathered business requirements from the business analysts and subject matter experts (SMEs).
- Responsible for managing multiple data sources.
- Involved in HDFS maintenance and loading of structured data.
- Involved in pushing delimited data into HDFS.
- Imported data from MySQL into HDFS on a regular basis using Sqoop.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Successfully implemented Hive queries for data analysis to meet the business requirements.
- Created Hive tables and worked on them using HiveQL.
- Involved in running MapReduce jobs to process millions of records.
- Developed SQL scripts to import and export data to staging tables.
- Attended weekly meetings with technical collaborators and actively participated in code review sessions with senior developers.
- Used JUnit for unit testing (a minimal sketch follows this list).
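A minimal sketch of the JUnit unit testing mentioned above, written in Scala (JUnit 4 style) against a hypothetical helper that parses the delimited records loaded into HDFS; the record layout and names are illustrative only, not from the original project.

    // Hypothetical delimited-record parser plus a JUnit 4 test for it.
    import org.junit.Test
    import org.junit.Assert.assertEquals

    case class CustomerRecord(id: Long, name: String, city: String)

    object RecordParser {
      // Parses one pipe-delimited line, e.g. "101|Alice|New York".
      def parse(line: String): CustomerRecord = {
        val fields = line.split("\\|", -1)
        CustomerRecord(fields(0).trim.toLong, fields(1).trim, fields(2).trim)
      }
    }

    class RecordParserTest {
      @Test
      def parsesAPipeDelimitedLine(): Unit = {
        val rec = RecordParser.parse("101|Alice|New York")
        assertEquals(101L, rec.id)
        assertEquals("Alice", rec.name)
        assertEquals("New York", rec.city)
      }
    }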
Confidential
Developer
Responsibilities:
- Interacted with customers and end users.
- Understood the existing architecture and processes, and developed code.
- Performed testing and bug fixing.
- Environment: Core Java, Oracle and Eclipse.