
Sr. Hadoop Developer Resume

Denver, CO


  • 9+ years of experience in Big Data Hadoop and Java J2EE solutions, covering analysis, design, development, testing and deployment.
  • Expertise in the Big Data Hadoop stack using Cloudera CDH4, CDH5, EMR, Hortonworks HDP2.2, HDP2.3 and MapR data platforms.
  • Experience with Amazon cloud (AWS) components: EC2, EMR, S3, RDS, Redshift, Data Pipeline.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems applications architecture.
  • Superior background in object-oriented development including Perl, C++, Java, Scala and shell scripting.
  • Expertise with data architecture including data ingestion, data cleansing, data modeling, transformations, data mining and advanced data analytics.
  • Extensive experience in using ecosystem components MapReduce, Hive, HQL, Pig, Sqoop, Oozie, Flume, Mahout, R, Cassandra, DSE, HBase and Apache Spark.
  • In-depth understanding of MR and YARN.
  • Experience in writing Oozie workflows to run multiple jobs using Hive, Shell and Java Oozie actions.
  • Experience with Spark Core, Spark SQL, Spark Streaming, MLlib and GraphX.
  • Expertise in creating RDDs, Pair RDDs, Transformations and Actions in Spark.
  • Experience in creating DataFrames and DStreams for relational data processing in Spark SQL.
  • Experience on integrating Kafka and Spark for Real Time processing.
  • Worked on Kafka Mirror Maker setup, data replication across data centers.
  • Experience with Kafka Topics, Replication and Partitions creation.
  • Experience in installing, configuring and monitoring Hadoop clusters.
  • Responsible for developing Linux shell scripts for automating and testing jobs.
  • Vast experience in developing Java-based web, product and enterprise applications.
  • Experience as an ETL developer using Talend Studio for Big Data.
  • In-depth understanding of Talend components for HDFS, S3, Hive and Java.
  • Able to assess business rules, collaborate with stakeholders and perform design and reviews.
  • Expertise in developing and implementing functional and technical designs.
  • In-depth understanding of Hadoop infrastructure and design patterns.
  • Experience in both Waterfall and Agile development methodologies.
  • Extensive experience in reviewing and analyzing requirements documents, understanding business process flow diagrams and use cases.
  • Experience with Hadoop cluster installation, configuration, administration and cluster management with Ambari.
  • Experience in creation of reusable components using multiple technologies.
  • Good understanding of ORC, RC, Sequence File, Avro and Parquet file formats.
  • Experience with SQL and NoSQL, relational and graph database systems.
  • Expertise in Spring MVC Controllers development for data processing.
  • Experience with Core Java, Collections, Servlets, JSPs, IoC, DI, Portlets, Liferay, Hibernate, JDBC.
  • Experience in creating and tracking user stories in Rally and VersionOne for Agile methodology.
  • Experience in the Retail, Finance, CME and Telecom domains.
  • Excellent communication and presentation skills.


Hadoop Skills: MapReduce, HDFS, Hive, Pig, Sqoop, R, Cassandra, DSE, Spark, Kafka, HBase, Oozie.

Apache Spark: Spark Core, Spark SQL, Spark Streaming, MLlib, GraphX.

AWS Cloud: Amazon EMR, S3, Redshift, EC2, RDS PostgreSQL, Data Pipeline.

Hadoop Distributions: Hortonworks, Pivotal, EMR, MapR, Cloudera.

Languages: Java, J2EE, C, C++, Scala.

Frameworks: Spring, Hibernate, Struts, Versata.

ETL: Talend ETL, Talend Studio, Talend for Big Data.

Databases: Oracle, MySQL, Teradata, DB2, Hive, AWS RDS.

Tools: Rally, VersionOne, GitHub, Eclipse, Clear Case, Clear Quest, Jenkins.

Methodologies: Agile, Waterfall, UML, Design Patterns.

Platforms: Windows XP, 7, 8.1, Linux CentOS, RHEL.

Servers: WebSphere, Darwin, Helix, Tomcat.


Confidential, Denver, CO

Sr. Hadoop Developer


  • Responsible for migrating data from diversified data sources into Hadoop XNet platform.
  • Responsible for creating low level designs for code implementations.
  • Created Sqoop jobs for both historical and incremental data migration from legacy systems.
  • Developed re-usable SFTP utility for flat files migration into XNet platform.
  • Developed Kafka producer and consumer components for real time data processing.
  • Created Kafka Topics, Partitions with replication factors across data centers.
  • Implemented Kafka Mirror Maker for data replication across the clusters.
  • Integrated Kafka and Spark for real-time data processing.
  • Responsible for creating Event Handler classes for event processing.
  • Created Spark SQL jobs with various transformations and actions for data processing.
  • Created Spark jobs with RDDs, Pair RDDs, transformations and actions, and DataFrames for data transformations from relational stores.
  • Responsible for creating Hive tables using partitions, buckets, UDFs and HQL scripts in the landing layer for analytics.
  • Developed folder watcher utility for continuous data migration to HDFS.
  • Developed Oozie workflows for running sequential job flows in Production.
  • Created mapping documents from legacy systems to Hadoop.
  • Developed file parser utilities for parsing files as per the requirements.
  • Responsible for creating, tracking and delivering user stories as per sprint planning.

Environment: Hortonworks HDP-2.3 YARN cluster, HDFS, Hive, Spark SQL, Scala, Spark Streaming, HBase, Kafka, Sqoop, Oozie, Control-M, Cassandra, DSE, OrientDB.

Confidential, Minneapolis, MN

Sr. Hadoop Developer


  • Facilitated insightful daily analysis of both historical and incremental data sets totaling hundreds of terabytes.
  • Developed MapReduce programs to convert data from filesystems to canonical JSON files.
  • Created Hive external and internal tables to load data from Landing to Foundation.
  • Developed shell scripts to perform data loads in an automated way and perform analysis.
  • Created Hive queries that helped data analysis on customer purchase trends by comparing fresh data with EDW reference data and historical metrics.
  • Generated Scala and Java classes from the respective APIs so that they could be incorporated in the overall application.
  • Created partitioned and bucketed tables to organize data and achieve better performance.
  • Responsible for creating Talend workflows to migrate data from Amazon S3 to Hadoop platform.
  • Developed Talend ETL for data transformations, joins, filters, mapping and aggregations before storing data on Hadoop and exporting data to relational systems.
  • Created Talend workflows for data enrichment and data cleansing using multiple components.
  • Implemented custom Java requirements in Talend using the tJava component.
  • Responsible for data exports to MySQL and Redshift using Talend Studio.
  • Implemented RC, ORC, Sequence and Avro file formats in Hive to achieve better performance.
  • Implemented data compression techniques to save space for historical data tables.
  • Developed Sqoop historical jobs for data migration from Teradata and DB2 to Hadoop platform.
  • Designed and developed Oozie workflows for Hadoop jobs execution in sequence.
  • Implemented Shell, Sqoop, Hive and Java actions in Oozie for running multiple jobs in cluster.
  • Implemented Oozie forking mechanism for achieving parallelism in Hadoop.
  • Developed re-usable SFTP framework for data migration from external systems.
  • Created HBase tables for handling updates in Hadoop with data loads.
  • Participated in converting existing Hadoop jobs to Spark jobs using Spark Core and Spark SQL.
  • Responsible for landing and exporting multi-source data to HDFS using Talend.
  • Wrote Scala classes to interact with the database.

Environment: Hortonworks HDP-2.2 YARN cluster, Talend Studio, HDFS, MapReduce, Apache Hive, Apache Pig, Apache Spark, Sqoop, Oozie, EDW, ADW, Control-M, SFTP, HBase.


Hadoop Developer


  • Developed data migration workflows from Amazon S3 to HDFS using Talend Studio.
  • Developed data cleansing and transformation workflows using Talend.
  • Wrote Pig scripts for data processing and KPI implementations in Hadoop.
  • Implemented UDFs for KPIs and embedded them in Pig scripts.
  • Wrote Talend workflows for data exports into RDBMS systems and cloud databases.
  • Developed MapReduce programs for converting unstructured data into structured data.
  • Performed code reviews and migrations from lower to higher environments.
  • Developed Oozie jobs for executing sequences of job flows.

Environment: Cloudera CDH 5.5 Hadoop cluster, EC2, AWS S3, RDS, Talend Studio, HDFS, MapReduce, Apache Hive, Apache Pig, UDFs, Oozie, C#, Geo-click, QlikView.


Java Developer


  • Performed requirements analysis as per the client requirements.
  • Developed technical design documents as per functional design documents.
  • Coded Java programs for application pages and modules development.
  • Coded Java programs for creation of financial documents.
  • Developed reporting module for Advantage.
  • Responsible for defects tracking and defect fixing in application.
  • Integrated Adobe reader for PDF reporting module.
  • Developed validation framework for the Admin module, used across the entire application.
  • Designed and developed visualization module for application as per client needs.

Environment: Java, WebSphere, Versata, Clear Case, Clear Quest, HTML.


Java Developer


  • Performed requirements analysis as per the client requirements.
  • Developed code as per technical design documents.
  • Developed functionality for Video streaming using MP4IP tool.
  • Performed analysis on Darwin server and carried out its installation and configuration.
  • Developed code to embed the QuickTime streaming plugin for video streaming.

Environment: Java, FFMPEG, MP4IP, Darwin, Spring MVC, JSP, Tomcat, Clear Case, Clear Quest.


Associate Java Developer


  • Executed required R&D as per project requirements.
  • Performed requirement analysis as per client needs.
  • Developed code to embed QR codes in the application.
  • Developed Click to Call and Click to Connect functionality in the application.

Environment: Java, J2EE, Spring MVC, JSP, Hibernate, Tomcat, Clear Case, Clear Quest.
