Sr. Hadoop Developer Resume

Lincolnshire, IL


  • IT professional with 8+ years of referable experience in software development and requirement analysis in an Agile work environment, including 4+ years of Big Data ecosystem experience in ingestion, storage, querying, processing and analysis of Big Data.
  • Experience with Apache Hadoop components and related tools such as HDFS, MapReduce, Hive, HBase, Pig, Sqoop, NiFi, Oozie, Mahout, Python, Spark, Cassandra and MongoDB.
  • Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
  • Expertise in managing NoSQL databases on large Hadoop distributions such as Cloudera, Hortonworks HDP, MapR M Series, etc.
  • Experience in developing Hadoop integration for data ingestion, data mapping and data processing capabilities.
  • Hands-on experience in installing, configuring and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, Storm, Spark, Kafka and Flume.
  • Strong understanding of Data Modeling and experience with Data Cleansing, Data Profiling and Data analysis.
  • Experience in ETL (DataStage) analysis, design, development, testing and implementation of ETL processes, including performance tuning and query optimization of databases.
  • Experience in extracting source data from sequential files, XML files and Excel files, then transforming and loading it into the target data warehouse.
  • Strong experience with Java/J2EE technologies such as Core Java, JDBC, JSP, JSTL, HTML, JavaScript and JSON.
  • Good understanding of service-oriented architecture (SOA) and web service technologies like XML, XSD, WSDL, and SOAP.
  • Good knowledge of scalable, secure cloud architecture based on Amazon Web Services (leveraging AWS cloud services: EC2, CloudFormation, VPC, S3, etc.).
  • Good knowledge of Hadoop cluster architecture and cluster monitoring.
  • In-depth understanding of data structures and algorithms.
  • Experience in managing and troubleshooting Hadoop-related issues.
  • Experience in importing and exporting data using Sqoop from relational database systems to HDFS and vice versa.
  • Experience in managing Hadoop clusters using Cloudera Manager.
  • Expertise in setting up standards and processes for Hadoop-based application design and implementation.


Big Data Platform: Hortonworks (HDP 2.2)/AWS (S3, EMR, EC2)/Cloudera (CDH3)

OLAP Concepts: Data warehousing, Data Mining Concepts

Apache Hadoop: YARN 2.0, HDFS, HBase, Pig, Hive, Sqoop, Kafka, Zookeeper, Oozie

Real Time Data Streaming: Apex, Malhar, Spark (Scala)

Source Control: GitHub, VSS, TFS

Databases and NoSQL: MS SQL Server 2012, Oracle 11g (PL/SQL) and MySQL 5.6, MongoDB

Development Methodologies: Agile and Waterfall

Development Tools: Eclipse, Toad, Visual Studio

Programming Languages: Java, .NET

Scripting and Markup Languages: JavaScript, JSP, Python, XML, HTML and Bash


Confidential - Lincolnshire, IL

Sr. Hadoop Developer

Roles & Responsibilities:

  • Extensively handled importing of data from Relational DBMS into HDFS using Sqoop for analysis and data processing.
  • Created Hive tables on top of HDFS and developed Hive queries to analyze the data.
  • Optimized the data sets by using dynamic partitioning and bucketing in Hive.
  • Collected information from web servers and loaded it into HDFS using Flume.
  • Used Pig Latin to analyze datasets and perform transformations according to business requirements.
  • Stored the compressed data in a row-columnar binary file format for efficient processing and analysis.
  • Implemented custom Hive UDFs for comprehensive data analysis.
  • Loaded data from local file systems into the Hadoop Distributed File System (HDFS).
  • Developed Spark jobs using PySpark and used Spark SQL for faster processing of data.
  • Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Sqoop scripts, Pig scripts and Hive queries.
  • Exported data from HDFS into RDBMS using Sqoop for report generation and visualization.
  • Wrote shell scripts to automate tasks.
  • Continuously monitored the jobs.

Environment: Hadoop, HDFS, MySQL, Sqoop, Flume, Hive, Pig, Oozie, Spark, PySpark, Hue

Confidential - San Jose, CA

Sr. Hadoop Developer

Roles & Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Moved data from various file systems to HDFS using UNIX command-line utilities.
  • Imported and exported data between RDBMS and HDFS using Sqoop.
  • Created Hive tables, loaded them with data and wrote Hive queries.
  • Implemented Partitioning, Dynamic Partition, and Bucketing in Hive for efficient data access.
  • Queried both managed and external Hive tables using Impala.
  • Analyzed data using the Hadoop components Hive and Pig.
  • Implemented custom Hive UDFs.
  • Developed Pig scripts for data analysis and transformation.
  • Extensively used Spark SQL for fast data processing.
  • Loaded data from the local file system into HDFS.
  • Managed data coming from different sources.
  • Implemented Oozie workflows for Sqoop, Pig and Hive actions.
  • Exported the analyzed data to relational databases using Sqoop to generate reports for the business analysts and other teams.
  • Debugged the results to check for missing data in the output.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Performed technical problem assessment and resolution tasks.

Environment: Hadoop, HDFS, MySQL, Sqoop, Hive, HiveQL, Pig, Spark, Spark SQL, Oozie, Hue

Confidential - Columbus, OH

Hadoop Developer

Roles & Responsibilities:

  • Responsible for building a system that ingests terabytes of data per day onto Hadoop from a variety of data sources, providing high storage efficiency and an optimized layout for analytics.
  • Responsible for converting a wide online video and ad impression tracking system, the source of truth for billing, from a legacy stream-based architecture to a MapReduce architecture, reducing support effort.
  • Used Cloudera Crunch to develop data pipelines that ingest and process data from multiple data sources.
  • Used Sqoop to move data from relational databases to HDFS, and Flume to move data from web logs onto HDFS.
  • Used Pig to apply transformations, cleansing and deduplication to data from raw data sources.
  • Used MRUnit for unit testing.
  • Involved in managing and reviewing Hadoop log files.
  • Created ad hoc analytical job pipelines using Hive and Hadoop Streaming to compute various metrics and loaded them into HBase for downstream applications.

Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Zookeeper, Oozie, Python, Crunch, HBase, MRUnit


Java Developer

Roles & Responsibilities:

  • Involved in designing and implementing the User Interface for the General Information pages and Administrator functionality.
  • Designed front end using JSP and business logic in Servlets.
  • Used the Struts framework for the application, based on the MVC-II architecture, and implemented the Validator framework.
  • Mapped the servlets in the deployment descriptor (XML).
  • Used HTML, JSP, JSP Tag Libraries, and Struts Tiles to develop presentation tier.
  • Deployed application on JBoss Application Server and also configured database connection pooling.
  • Involved in writing JavaScript functions for front-end validations.
  • Developed stored procedures and Triggers for business rules.
  • Performed unit tests and integration tests of the application.
  • Used CVS as a documentation repository and version control tool.

Environment: Java, J2EE, JDBC, Servlets, JSP, Struts, HTML, CSS, JavaScript, UML, JBoss Application Server 4.2, MySQL


Java Developer

Roles & Responsibilities:

  • Developed the complete business tier with Session beans.
  • Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
  • Used Web services (SOAP) for transmission of large blocks of XML data over HTTP.
  • Used XSL/XSLT for transforming common XML format into internal XML format.
  • Used Apache Ant for the entire build process.
  • Implemented the database connectivity using JDBC with an Oracle 9i database as the backend.
  • Designed and developed the application based on the Struts framework using the MVC design pattern.
  • Used CVS for version control and JUnit for unit testing.
  • Deployed the application on JBoss Application Server.

Environment: EJB 2.0, Struts 1.1, JSP 2.0, Servlets, XML, XSLT, SOAP, JDBC, JavaScript, CVS, Log4J, JUnit, JBoss 2.4.4, Eclipse 2.1.3, Oracle 9i