Jr Hadoop Developer Resume
PROFESSIONAL SUMMARY:
- One year of experience in the IT industry working with the Hadoop ecosystem and a good understanding of Big Data technologies.
- Experience with the Hadoop environment, including MapReduce, HDFS, HBase, Oozie, Pig, Hive, Kafka, and Spark.
- Good knowledge of Spark Core, Spark Streaming, DataFrames, Spark SQL, and MLlib.
- In-depth knowledge of Hadoop architecture and its various components, such as JobTracker, TaskTracker, NameNode, DataNode, and ResourceManager. These components run as the core daemons of a Hadoop cluster.
- Worked on Hive for ETL transformations and optimized Hive queries.
- Developed Pig Latin scripts.
- Used Flume and Sqoop to channel data from different sources into HDFS and between RDBMS and HDFS.
- Scheduled and monitored job workflows using tools such as Oozie.
- Used Spark Streaming for real-time analysis of continuously arriving data.
- Integrated Kafka with Spark Streaming (a minimal sketch follows this list).
- Worked with relational database systems (RDBMS) such as MySQL and Oracle, and with NoSQL databases such as HBase.
- Good knowledge of Hadoop cluster architecture.
- Good knowledge of data modelling and data mining to model data per business requirements.
- Programming experience in Java and Scala.
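A minimal sketch of the Kafka and Spark Streaming integration noted above, assuming the spark-streaming-kafka-0-10 connector; the broker address, topic, consumer group, and batch interval are placeholder values, not details from an actual engagement:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaStreamingSketch {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("KafkaStreamingSketch"),
      Seconds(10)) // placeholder micro-batch interval

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",            // placeholder broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "streaming-sketch")          // placeholder consumer group

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

    // Count words in each micro-batch of Kafka messages
    stream.map(_.value)
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```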
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, YARN, Pig, Hive, Sqoop, Impala, Oozie, Spark, and Kafka
Hadoop Distributions: Cloudera (CDH3) and MapR
Languages: Java, Scala, SQL, HTML, and C/C++
DB Languages: MySQL
Tools: Eclipse, Maven, and ETL tools
Operating Systems: Linux, macOS, and Windows
PROFESSIONAL EXPERIENCE:
Jr Hadoop Developer
Confidential
Responsibilities:
- Handled importing of data from various sources, performed transformations using Pig, and loaded the data into HDFS; extracted data from MySQL into HDFS using Sqoop.
- Experienced in implementing different kinds of joins, such as map-side and reduce-side joins, to integrate data from different data sets (a sketch of the map-side join idea appears after this list).
- Involved in loading data from edge nodes to HDFS using shell scripting.
- Involved in loading data from the UNIX file system to HDFS.
- Managed and scheduled jobs on a Cloudera Hadoop cluster using Oozie workflows and Java schedulers.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
- Used Impala to read, write, and query data in HDFS.
- Developed predictive analytics using the Apache Spark Scala API; knowledge of Spark Core, Spark Streaming, DataFrames, Spark SQL, MLlib, and GraphX.
- Used Spark stream processing to bring data into memory and implemented RDD transformations and actions to process it (see the RDD sketch after this list).
- Created and ran Sqoop jobs with incremental load to populate Hive external tables.
- Used Oozie to automate data processing and data loading into the Hadoop Distributed File System.
- Involved in agile methodologies, daily scrum meetings, and sprint planning sessions.
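A minimal Spark Core sketch of the RDD transformations and actions described above, assuming a hypothetical HDFS path and tab-separated log format rather than any actual project data:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddProcessingSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("RddProcessingSketch"))

    // Transformations are lazy: they only build a lineage graph
    val errorCounts = sc.textFile("hdfs:///data/events/*.log") // hypothetical path
      .filter(_.contains("ERROR"))
      .map(line => (line.split("\t")(0), 1)) // key on the first tab-separated field
      .reduceByKey(_ + _)
      .cache() // keep in memory for the two actions below

    // Actions trigger execution of the lineage
    println(s"Distinct error keys: ${errorCounts.count()}")
    errorCounts.take(10).foreach(println)

    sc.stop()
  }
}
```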
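The map-side join mentioned in the first bullet, sketched here as a Spark broadcast join (a shuffle-free, map-side technique) rather than a raw MapReduce job, since the other examples in this resume use the Spark Scala API; the lookup table and order records are made up:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MapSideJoinSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MapSideJoinSketch"))

    // The small data set is broadcast to every executor,
    // so the join happens map-side with no shuffle
    val countryLookup = sc.broadcast(Map("US" -> "United States", "IN" -> "India"))

    // Made-up (countryCode, orderAmount) records standing in for the large data set
    val orders = sc.parallelize(Seq(("US", 120.0), ("IN", 75.5), ("US", 33.0)))

    val joined = orders.map { case (code, amount) =>
      (countryLookup.value.getOrElse(code, "unknown"), amount)
    }

    joined.collect().foreach(println)
    sc.stop()
  }
}
```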