Hadoop/Spark Developer Resume
San Francisco, CA
SUMMARY:
- 6 years of professional IT experience, including 3 years in the Big Data ecosystem and related technologies and 3 years as a Core Java developer.
- Experience providing and implementing solutions for Big Data applications across the Hadoop ecosystem.
- Good understanding of Hadoop architecture and hands-on experience with Hadoop components such as JobTracker, TaskTracker, NameNode, DataNode, MapReduce concepts, and the HDFS framework.
- Good understanding of MapReduce2 on the YARN framework, including the ResourceManager, NodeManager, and ApplicationMaster.
- Experience importing and exporting data between RDBMS and HDFS in different file formats using Sqoop.
- Ingested real-time and near-real-time (NRT) streaming data into HDFS using Flume.
- Experience implementing Mappers, Reducers, Combiners, and Partitioners to process large datasets efficiently.
- Experience analyzing data using HiveQL.
- Experience writing Pig Latin scripts and working with the Grunt shell.
- Experience working with Spark RDDs, Spark SQL, and DataFrames using Scala.
- Good understanding of file formats such as Avro, Parquet, JSON, ORC, CSV, and SequenceFile.
- Worked extensively with Cloudera Distribution of Hadoop (CDH) 5.x.
- Good hands-on experience with core Java concepts.
- Strong working experience with Eclipse, MS SQL Server 2008, and SQL queries.
- Well-developed skills in testing, debugging, and troubleshooting a wide range of technical issues.
TECHNICAL SKILLS:
Big Data: Hadoop HDFS, MapReduce2, YARN, Hive, Pig, Flume, Scala, Apache Spark Core, Spark SQL, Sqoop, Impala, Oozie.
File Formats: Text, SequenceFile, JSON, ORC, Avro, and Parquet
Database: MySQL, SQL Server 2008 R2
Tools: SSMS, Maven
Hadoop Distribution: CDH 5.x
IDE: Eclipse
Programming Language: Java, Scala
Operating Systems: Windows 7/8, CentOS
PROFESSIONAL EXPERIENCE:
Confidential, San Francisco, CA
Hadoop/Spark Developer
Technologies: MapReduce, HDFS, Sqoop, Flume, Linux, Pig, Hive, Spark Core, Spark SQL, Oozie, Impala
Responsibilities:
- Created Sqoop jobs to import data from the transaction data mart (Oracle) into HDFS and Hive in text file format for further processing.
- Collected and aggregated large volumes of log data into HDFS using Flume.
- Created Sqoop jobs to export the analyzed data into relational databases for visualization and report generation by the BI team.
- Developed and optimized MapReduce programs in core Java on data imported into HDFS.
- Designed Hive internal and external tables in ORC format per business requirements; implemented Hive partitioning and bucketing to improve query performance (see the partitioned-table sketch after this list).
- Tuned Hive queries for efficient execution using techniques such as map joins, compressed map/reduce output, and parallel query execution.
- Developed Pig scripts to cleanse transaction and web log data and used HiveQL for web log analysis.
- Developed custom UDFs in Java to extend Hive and Pig functionality.
- Used Scala for Spark programming.
- Worked on converting existing Spark applications from the RDD API to DataFrames (see the RDD-to-DataFrame sketch after this list).
- Worked on converting Hive queries into Spark transformations using Spark SQL (see the Spark SQL sketch after this list).
- Experienced with the RDD architecture; implemented Spark operations on RDDs and optimized transformations and actions in Spark.
- Worked on Spark job optimizations (see the tuning sketch after this list).
- Used Oozie workflows and coordinators to schedule Sqoop, MapReduce, Pig, and Hive actions.
- Used Cloudera Manager to monitor jobs running on the CDH cluster.
- Worked extensively with Cloudera Distribution of Hadoop (CDH) 5.x.
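The partitioned-table sketch referenced in the Hive table bullet above. The role created these tables with HiveQL DDL; this is only a minimal Spark-side analogue using the DataFrame writer, with a hypothetical sales.transactions_staging source and illustrative partition/bucket columns.

```scala
import org.apache.spark.sql.SparkSession

object PartitionedOrcTableSketch {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so the resulting table is registered in the shared metastore.
    val spark = SparkSession.builder()
      .appName("partitioned-orc-table-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Write an ORC table partitioned by date and bucketed by store id,
    // mirroring the partition/bucket layout used to speed up Hive queries.
    // Table and column names are illustrative, not from the actual project.
    spark.table("sales.transactions_staging")
      .write
      .format("orc")
      .partitionBy("txn_date")
      .bucketBy(16, "store_id")
      .sortBy("store_id")
      .mode("overwrite")
      .saveAsTable("sales.transactions_orc")

    spark.stop()
  }
}
```

Note that Spark's bucketBy produces Spark-style bucket metadata rather than Hive's CLUSTERED BY layout, so the HiveQL DDL route remains the faithful version of what the role describes.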
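A minimal sketch of the RDD-to-DataFrame conversion mentioned above, assuming a hypothetical delimited transaction file layout (the Txn case class, its fields, and the HDFS path are placeholders). The idea is to turn opaque RDD parsing logic into DataFrame operators that Catalyst can optimize.

```scala
import org.apache.spark.sql.SparkSession

object RddToDataFrameSketch {
  // Hypothetical record layout for the delimited transaction files.
  case class Txn(id: Long, amount: Double, status: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-to-dataframe-sketch")
      .getOrCreate()
    import spark.implicits._

    // Legacy RDD-style logic: parse delimited lines into case class instances.
    val txnRdd = spark.sparkContext
      .textFile("hdfs:///data/txns/")                 // illustrative path
      .map(_.split(","))
      .map(f => Txn(f(0).toLong, f(1).toDouble, f(2)))

    // Same data as a DataFrame: declarative operators instead of opaque functions.
    txnRdd.toDF()
      .filter($"status" === "APPROVED")
      .groupBy($"status")
      .count()
      .show()

    spark.stop()
  }
}
```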
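A hedged sketch of converting a Hive query into Spark SQL, as described above: the same aggregation run once through spark.sql and once as DataFrame transformations against a Hive-enabled session. The sales.transactions table and its columns are illustrative, not taken from the actual project.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object HiveToSparkSqlSketch {
  def main(args: Array[String]): Unit = {
    // Hive support lets Spark SQL read the existing Hive metastore tables.
    val spark = SparkSession.builder()
      .appName("hive-to-spark-sql-sketch")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // The original HiveQL aggregation, executed as-is through Spark SQL.
    spark.sql(
      """SELECT txn_date, SUM(amount) AS total_amount
        |FROM sales.transactions
        |WHERE txn_date >= '2016-01-01'
        |GROUP BY txn_date""".stripMargin).show()

    // The same logic rewritten as DataFrame transformations.
    spark.table("sales.transactions")
      .where($"txn_date" >= "2016-01-01")
      .groupBy($"txn_date")
      .agg(sum($"amount").as("total_amount"))
      .show()

    spark.stop()
  }
}
```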
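A short sketch of the kind of Spark job tuning referred to above, assuming a hypothetical large fact table joined to a small dimension table: broadcast the small side to avoid shuffling the fact table, and persist a result that several actions reuse.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast
import org.apache.spark.storage.StorageLevel

object SparkJobTuningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-job-tuning-sketch")
      .enableHiveSupport()
      .getOrCreate()

    val txns   = spark.table("sales.transactions")   // large fact table (illustrative)
    val stores = spark.table("sales.store_dim")      // small dimension table (illustrative)

    // Broadcast the small dimension so the join does not shuffle the large fact table.
    val enriched = txns.join(broadcast(stores), Seq("store_id"))

    // Persist a result that several downstream actions reuse,
    // so the join is computed only once.
    enriched.persist(StorageLevel.MEMORY_AND_DISK)
    enriched.count()
    enriched.groupBy("region").count().show()        // "region" is an illustrative dimension column

    enriched.unpersist()
    spark.stop()
  }
}
```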
Confidential
Core Java/Hadoop Developer
Technologies: Core Java, SQL Server 2008, Postilion Real-time Framework, Python, JUnit, XML, Perforce, MySQL, HDFS, MapReduce, Hive, Sqoop
Responsibilities:
- Involved in the analysis, design, and development of system components.
- Worked as a Core Java developer, using Eclipse and SSMS to migrate Java and SQL code, respectively.
- Used the JUnit framework and a test harness for unit testing of Java components.
- Developed components using the Postilion Real-time Framework.
- Ingested data received from various relational database providers into HDFS for analysis and other big data operations.
- Wrote MapReduce jobs to transform large sets of structured, semi-structured, and unstructured data.
- Imported and exported data between MySQL and HDFS using Sqoop.
- Designed Hive internal and external tables and loaded data to and from the external tables.
- Served as a code reviewer, checking code for design quality, vulnerabilities, and scalability.
- Worked on automation to reduce manual effort.
- Prepared initial analysis documents, functional specifications, and the work breakdown structure (WBS).
- Prepared a Requirements Traceability Matrix (RTM) mapping functions/code to test cases.
- Mentored team members through regular knowledge-transfer sessions and trained new team members on core Java.
Confidential
Core Java Developer
Technologies: Core Java, SQL Server 2008
Responsibilities:
- Involved in different phases of the project life cycle, from requirements gathering to testing.
- Developed data transformation and manipulation classes using core Java.
- Created JUnit test cases and prepared setup manuals and user guides.