Hadoop Developer Resume
NY
PROFESSIONAL SUMMARY:
- More than 5 years of experience in the IT industry.
- Around 2.5 years of experience in Big Data/Hadoop technologies.
- Hands-on experience with Spark (RDDs, DataFrames, Spark SQL) and with Hadoop ecosystem components including Hive, HBase, Pig, and Sqoop.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Imported and exported data between Cassandra, AWS, and HDFS using the Spark API, and used Spark SQL to analyze the data.
- Worked with AWS services such as S3, Redshift, and EMR.
- Experienced in managing the Hadoop infrastructure with Cloudera Manager.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries and Pig scripts to analyze large datasets.
- Hands-on design and development of an application using UDFs in Hive (a minimal sketch follows this list).
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Clear understanding of MapReduce, YARN, and Spark.
- Good understanding of messaging systems such as Kafka and of the Dataset API in Spark.
- Working knowledge of Spark Streaming.
- Ability to understand and capture technical as well as business requirements.
- Excellent interpersonal and communication skills; research-minded and technically competent, with problem-solving and leadership skills.
- Quick learner, easily adaptable to new environments, with good presentation skills and the ability to work well both in a team and individually.
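A minimal sketch of the kind of Hive UDF referenced above, written in Scala; the class, function, and column names are hypothetical, and in practice the UDF would be packaged into a JAR and registered in Hive.

    // Hypothetical Hive UDF sketch: masks all but the last four characters of a value.
    // Assumes hive-exec and hadoop-common on the classpath; names are illustrative only.
    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    class MaskUdf extends UDF {
      // Hive resolves evaluate() by reflection on the argument types.
      def evaluate(input: Text): Text = {
        if (input == null) return null
        val s = input.toString
        val masked = if (s.length <= 4) s else ("*" * (s.length - 4)) + s.takeRight(4)
        new Text(masked)
      }
    }

    // In Hive, after building a JAR (paths and names hypothetical):
    //   ADD JAR /path/to/udfs.jar;
    //   CREATE TEMPORARY FUNCTION mask_value AS 'MaskUdf';
    //   SELECT mask_value(account_number) FROM accounts;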
TECHNICAL SKILLS:
HADOOP/BIG DATA: Spark, HDFS, Hive, Pig, HBase, Sqoop, MapReduce, Oozie, Kafka
AWS: S3, RDS, EC2, EMR, Redshift
PROGRAMMING LANGUAGES: Java, Scala
RDBMS: Oracle, MySQL
NOSQL DATABASES: HBase, Cassandra
OPERATING SYSTEMS: Windows, Linux, Ubuntu
IDE: IntelliJ IDEA, Eclipse
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Developer
Responsibilities:
- Worked on Sqoop jobs for ingesting data from MySQL to HDFS and created Hive external tables for querying the data.
- Used the Spark DataFrame API to ingest Oracle data into S3, loaded it into Redshift, and wrote a script to move RDBMS data into Redshift (a minimal sketch follows this list).
- Hands-on experience creating RDDs and applying transformations and actions while implementing Spark applications.
- Developed Scala scripts and UDFs using both DataFrames and RDDs in Spark 1.6 for data aggregation and queries, and for writing data back into the OLTP system through Sqoop.
- Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and RDDs.
- Experienced in handling large datasets during the ingestion process itself, using partitioning, Spark's in-memory capabilities, broadcast variables, and effective and efficient joins and transformations.
- Involved in loading data into the Cassandra NoSQL database.
- Processed complex/nested JSON and CSV data using the DataFrame API (a minimal sketch follows this list).
- Automatically scaled up the EMR instances based on the data, and scheduled and executed Spark scripts in EMR pipelines.
- Validated the source and final output data, and tested the data using the Dataset API instead of RDDs.
- Involved in weekly Scrums and monthly Sprints.
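A minimal Scala sketch of the Oracle-to-S3 ingestion described above, assuming Spark 1.6 with the DataFrame API, an Oracle JDBC driver on the classpath, and the S3A connector configured; the connection details, table name, and bucket path are hypothetical placeholders, and the subsequent Redshift load is not shown.

    // Hypothetical sketch: ingest an Oracle table with the Spark 1.6 DataFrame API
    // and land it on S3 as Parquet.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object OracleToS3 {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("OracleToS3"))
        val sqlContext = new SQLContext(sc)

        // JDBC read; url, dbtable, and credentials are illustrative placeholders.
        val orders = sqlContext.read.format("jdbc").options(Map(
          "url"      -> "jdbc:oracle:thin:@//db-host:1521/ORCL",
          "dbtable"  -> "SALES.ORDERS",
          "user"     -> "etl_user",
          "password" -> "etl_password",
          "driver"   -> "oracle.jdbc.OracleDriver"
        )).load()

        // Write to S3 as Parquet, partitioned by a date column (assumed to exist).
        orders.write
          .mode("overwrite")
          .partitionBy("ORDER_DATE")
          .parquet("s3a://my-bucket/landing/orders/")

        sc.stop()
      }
    }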
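A minimal Scala sketch of processing nested JSON with the DataFrame API and loading the result into Cassandra, assuming Spark 1.6 and the DataStax spark-cassandra-connector; the JSON layout, paths, keyspace, and table names are hypothetical.

    // Hypothetical sketch: flatten nested JSON with the DataFrame API and save to Cassandra.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.explode

    object JsonToCassandra {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("JsonToCassandra")
          .set("spark.cassandra.connection.host", "cassandra-host") // placeholder host
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)

        // Assumed JSON shape: {"id": 1, "customer": {"name": "..."}, "items": [{"sku": "...", "qty": 2}]}
        val raw = sqlContext.read.json("hdfs:///data/raw/orders/*.json")

        // Flatten the nested struct and explode the items array into one row per item.
        val flat = raw
          .select(raw("id"), raw("customer.name").as("customer_name"), explode(raw("items")).as("item"))
          .select("id", "customer_name", "item.sku", "item.qty")

        // Write to a Cassandra table via the DataStax connector (keyspace/table are placeholders).
        flat.write
          .format("org.apache.spark.sql.cassandra")
          .options(Map("keyspace" -> "sales", "table" -> "order_items"))
          .mode("append")
          .save()

        sc.stop()
      }
    }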
Confidential, NY
Hadoop Developer
Responsibilities:
- Worked on Sqoop jobs for ingesting data from MySQL to HDFS.
- Created Hive external tables for querying the data.
- Participated in the development and implementation of the Cloudera Hadoop environment.
- Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
- Implemented partitioning, dynamic partitions, and bucketing in Hive (a minimal sketch follows this list).
- Validated the source and final output data, and debugged and tested the process as per the business requirements.
- Loaded the aggregated data from the Hadoop environment into Oracle using Sqoop for reporting.
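A minimal sketch of the Hive external-table and dynamic-partitioning work described above. The statements are shown here driven from a Spark HiveContext in Scala so the example is self-contained; in practice such HiveQL would run as Hive jobs scheduled through Oozie, and all table, column, and path names are hypothetical.

    // Hypothetical sketch: external table over Sqoop-landed data plus a dynamic-partition insert.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HivePartitioning {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("HivePartitioning"))
        val hiveContext = new HiveContext(sc)

        // External table pointing at the HDFS directory where Sqoop lands the MySQL data.
        hiveContext.sql(
          """CREATE EXTERNAL TABLE IF NOT EXISTS staging_orders (
            |  order_id BIGINT, customer_id BIGINT, amount DOUBLE, order_date STRING)
            |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
            |LOCATION 'hdfs:///data/sqoop/orders'""".stripMargin)

        // Managed table partitioned by date, populated with dynamic partitioning.
        hiveContext.sql(
          """CREATE TABLE IF NOT EXISTS orders_part (
            |  order_id BIGINT, customer_id BIGINT, amount DOUBLE)
            |PARTITIONED BY (order_date STRING)""".stripMargin)

        hiveContext.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        hiveContext.sql(
          """INSERT OVERWRITE TABLE orders_part PARTITION (order_date)
            |SELECT order_id, customer_id, amount, order_date FROM staging_orders""".stripMargin)

        sc.stop()
      }
    }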
Confidential
Software Developer
Responsibilities:
- Gathered business requirements from the business analysts and subject matter experts (SMEs).
- Responsible for managing multiple data sources.
- Involved in HDFS maintenance and loading of structured data.
- Involved in pushing delimited data into HDFS.
- Imported data from MySQL into HDFS on a regular basis using Sqoop.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Successfully implemented Hive queries for data analysis to meet the business requirements.
- Created Hive tables and worked on them using HiveQL.
- Involved in running MapReduce jobs to process millions of records.
- Developed SQL scripts to import and export data to staging tables.
- Attended weekly meetings with technical collaborators and actively participated in code review sessions with senior developers.
- Used JUnit for unit testing (a minimal sketch follows this list).
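A minimal sketch of the JUnit unit testing mentioned above, written in Scala (JUnit 4 style) against a hypothetical helper that parses the delimited records loaded into HDFS; the record layout and names are illustrative only, not from the original project.

    // Hypothetical delimited-record parser plus a JUnit 4 test for it.
    import org.junit.Test
    import org.junit.Assert.assertEquals

    case class CustomerRecord(id: Long, name: String, city: String)

    object RecordParser {
      // Parses one pipe-delimited line, e.g. "101|Alice|New York".
      def parse(line: String): CustomerRecord = {
        val fields = line.split("\\|", -1)
        CustomerRecord(fields(0).trim.toLong, fields(1).trim, fields(2).trim)
      }
    }

    class RecordParserTest {
      @Test
      def parsesAPipeDelimitedLine(): Unit = {
        val rec = RecordParser.parse("101|Alice|New York")
        assertEquals(101L, rec.id)
        assertEquals("Alice", rec.name)
        assertEquals("New York", rec.city)
      }
    }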
Confidential
Developer
Responsibilities:
- Interacted with customers and end users.
- Understood the existing architecture and processes, and developed code.
- Performed testing and bug fixing.
- Environment: Core Java, Oracle and Eclipse.