Hadoop Developer Resume

New Jersey, NJ


  • Over 7 years of IT experience, including analysis, design, development and testing of complex applications.
  • Strong working experience with Big Data and the Hadoop ecosystem, including HDFS, Pig, Hive, HBase, YARN, Sqoop, Flume, Oozie, Hue, MapReduce and Spark.
  • Hands-on experience installing, supporting and managing Hadoop clusters using the Cloudera and Hortonworks distributions of Hadoop.
  • Extensive experience in analyzing data using HiveQL, Pig Latin and MapReduce programs in Java.
  • Implemented POCs on migrating to Spark Streaming to process live data.
  • Experienced in using Apache Spark, with code written in Scala, to implement advanced procedures such as text analytics and processing with its in-memory computing capabilities.
  • Hands-on experience with real-time data processing using the distributed technologies Storm and Kafka.
  • Used different Spark modules and APIs, including Spark Core, RDDs, DataFrames and Spark SQL.
  • Converted various Hive queries into the required Spark transformations and actions.
  • Experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
  • Analyzed Hadoop clusters and various big data analytics tools, including HBase and Sqoop.
  • Worked with data serialization and file formats for converting complex objects into byte sequences, including Avro, Parquet, JSON and CSV.
  • Good knowledge of Oracle 9i, 10g and 11g databases and strong skills in writing SQL queries and scripts.
  • Experience in implementing the Kerberos authentication protocol in Hadoop for data security.
  • Strong command over relational databases: MySQL, Oracle, SQL Server and MS Access.
  • Worked with cloud services such as Amazon Web Services (AWS) and was involved in ETL, data integration and migration.
  • Experience setting up clusters on Amazon EC2 and S3, including automating the setup and extension of clusters in the AWS cloud.


Big Data Technologies: HDFS, Hive, MapReduce, Pig, Sqoop, Oozie, Flume, Kafka, YARN and Spark

Scripting Languages: Shell, Python

Programming Languages: Java, Scala, Python, SQL, C

Hadoop Distributions: Cloudera (CDH4 and CDH5), Hortonworks

NoSQL databases: HBase, Cassandra

Frameworks: Spring, Hibernate

SCM Tools: SVN, GitHub

Web Services: SOAP, REST

Operating systems: UNIX, Linux, Mac OS and Windows

Web servers: WebLogic, WebSphere, Apache Tomcat

Databases: Oracle, SQL Server, MySQL


Confidential, New Jersey, NJ

Hadoop Developer

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Loaded data into Spark RDDs and performed in-memory computation for faster output response.
  • Developed Spark and Hive jobs to transform data.
  • Developed Spark scripts in Python, writing custom RDD transformations and performing actions on RDDs.
  • Worked with the Oozie workflow engine for job scheduling.
  • Imported and exported data into MapReduce and Hive using Sqoop.
  • Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of the data by date.
  • Developed a Kafka consumer component for real-time data processing in Java and Scala.
  • Used Impala to query Hive tables for faster query response times.
  • Imported real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports.
  • Created partitioned and bucketed Hive tables in Parquet and Avro file formats with Snappy compression, then loaded data into them.
  • Wrote Hive queries using Spark SQL, which integrates with the Spark environment.
  • Developed MapReduce programs to parse raw JSON data and store the refined data in tables.
  • Used Kafka to load data into HDFS and move data into HBase.
  • Captured web server logs into HDFS using Flume for analysis.
  • Worked on moving some of the data pipelines from the CDH cluster to run on AWS.
  • Involved in moving data from HDFS to AWS Simple Storage Service (S3) and worked extensively with S3 buckets in AWS.
  • Developed a Spark application to filter JSON source data in an AWS S3 location and store it in HDFS with partitions; used Spark to extract the schema of the JSON files.
  • Migrated the code base from the Cloudera platform to Amazon EMR and evaluated Amazon ecosystem components such as Redshift.

Environment: Linux, Hadoop 2, Python, Scala, CDH 5.12.1, SQL, Sqoop, HBase, Hive, Spark, Oozie, Cloudera Manager, Oracle, Windows, YARN, Spring, Sentry, AWS, S3.

Confidential, Richardson, TX

Hadoop Developer


  • Involved in the complete software development life cycle (SDLC) of the application.
  • Handled importing of data from various data sources and performed transformations using Hive and MapReduce.
  • Loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Generated Java APIs for retrieval and analysis on the NoSQL Cassandra database.
  • Helped with the sizing and performance tuning of the Cassandra cluster.
  • Developed Hive queries to process the data and generate the results in a tabular format.
  • Handled importing of data from multiple data sources using Sqoop, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Extracted data from CSV and JSON files and stored it in Avro and Parquet formats.
  • Implemented partitioning and bucketing in Hive and designed both managed and external tables.
  • Worked on a POC comparing the processing time of Impala and Apache Hive for batch applications, to decide which to implement in the project.
  • Loaded and transformed large sets of structured and semi-structured data using Hive.
  • Involved in importing real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports.
  • Created Kafka topics and partitions and wrote custom partitioner classes.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the data.
  • Monitored and controlled local file system disk space usage and cleaned up log files with automated scripts.
  • Involved in writing Oozie jobs for workflow automation.
  • Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.

Environment: Unix, Linux, Hortonworks 2.6.2, Scala, HDFS, MapReduce, Hive, Flume, Sqoop, Ganglia, Ambari, Oracle 11g, Ranger, Python, Apache Hadoop, Cassandra.


Hadoop Developer/Admin

  • Implemented authentication and authorization service using Kerberos authentication protocol.
  • Worked with different teams to install operating system and Hadoop updates, patches and Cloudera version upgrades as required.
  • Analyzed the Hadoop cluster and various big data analytics tools, including Pig, Sqoop, Hive, Spark and ZooKeeper.
  • Developed a data pipeline using Flume, Pig and Java MapReduce to ingest claim data into HDFS for analysis.
  • Analyzed log files for Hadoop and ecosystem services to find root causes.
  • Monitored and troubleshot issues with hosts in the cluster regarding memory, CPU, OS, storage and network.
  • Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
  • Scheduled jobs through Oozie.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without affecting running nodes or data.
  • Involved in Hadoop cluster administration, including adding and removing cluster nodes, capacity planning, performance tuning and monitoring.
  • Worked with big data developers, designers and scientists to troubleshoot MapReduce job failures and issues with Hive and Pig.
  • Involved in the setup and benchmarking of Hadoop/HBase clusters for internal use.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Exported the result set from Hive to MySQL using shell scripts.
  • Actively involved in code reviews and bug fixing to improve performance.
  • Successfully created and implemented complex code changes.

Environment: Hadoop, Cloudera 5.4, Java, HDFS, MapReduce, Pig, Hive, Impala, Sqoop, Flume, Kafka, Kerberos, Sentry, Oozie, HBase, SQL, Spring, Linux, Eclipse.


Java Developer

  • Involved in all phases of the SDLC, including requirements collection, design and analysis of customer specifications, and development and customization of the application.
  • Designed and implemented an MVC architecture using Spring MVC.
  • Developed administrative interfaces using JSP, JavaScript, Spring and Hibernate.
  • Used Eclipse as an IDE for developing the application.
  • Primarily focused on Spring components such as DispatcherServlet, controllers, model and view objects, and view resolvers.
  • Implemented multithreading in Java classes, with care to avoid deadlocks.
  • Implemented Java design patterns such as Singleton, Factory and Command.
  • Developed test cases and performed unit testing using the JUnit framework.
  • Used REST and SoapUI to test web services for server-side changes.
  • Designed and developed web services to serve various clients using SOAP and WSDL.
  • Responsible for developing configuration, mapping files and Java beans for the persistence layer.
  • Provided production support and resolved many production issues based on priority.
  • Developed the user interface screens for presentation using JSP, JSTL tags, HTML and CSS.
  • Created automated test cases for the web application using Selenium WebDriver.
  • Used JIRA as a defect tracking system for all projects, and GitHub as a code repository to manage project code.

Environment: Java 1.5, EJB 2.0, Spring, Struts, JSP, JSTL, Hibernate, Web Services (SOAP, WSDL), XML, WebLogic 10.3, Ant 1.6, JUnit, Oracle 11g.


Java Developer

  • Involved in analysis, design, development, integration and testing of application modules, following agile methodology.
  • Involved in developing UML diagrams such as use-case, class and activity diagrams.
  • Developed the presentation layer using JSP, Struts tag libraries, JSTL, HTML, JavaScript and CSS.
  • Designed and developed Java components using design patterns such as Singleton, Strategy and Decorator, and used J2EE patterns such as Facade and Service Locator.
  • Developed core Java programs for all business rules and workflows using the Spring Framework.
  • Used TOAD for data modeling in the Oracle 11g database, creating schemas and tables for applications.
  • Used AJAX to fetch data from the server asynchronously as JSON objects.
  • Involved in transforming XML data into Java objects using the JAXB binding tool.
  • Developed various Action classes and Form bean classes using the Struts framework.
  • Used the JDBC API to connect to the Oracle 11g database.
  • Developed test cases and test suites to test the application using JUnit.
  • Used the Eclipse 3.1 IDE to develop and debug the application.
  • Involved in the SDLC using methodologies such as Waterfall.
  • Deployed the application on Linux (Red Hat Enterprise Linux 5.0) and Solaris servers using WebLogic.

Environment: Java, J2EE, Servlets, Ajax, JSP, JDBC, JMS, JUnit, Oracle, Eclipse, ClearCase, XML, JavaScript, CSS, Spring, Log4j, Solaris Unix, WebLogic 11g, PL/SQL, Ant.


Java Developer

  • Performed requirements gathering and analysis by actively soliciting, analyzing and negotiating customer requirements, and prepared the requirements specification document for the application using Microsoft Word.
  • Developed use case diagrams, business flow diagrams and activity/state diagrams.
  • Developed the presentation layer using the JavaServer Faces (JSF) MVC framework.
  • Used JSP, HTML, CSS and jQuery as view components in MVC.
  • Developed custom controllers for handling requests using Spring MVC.
  • Used JDBC for database connectivity and to invoke stored procedures.
  • Deployed the applications on WebLogic Application Server.
  • Developed RESTful web services using JSON.
  • Created and managed microservices using Spring Boot that create, update, delete and get the data.
  • Used the Oracle database for table creation and wrote SQL queries using joins and stored procedures.
  • Developed JUnit test cases for unit testing the code.
  • Worked with configuration management groups to set up various deployment environments, including system integration testing and quality control testing.

Environment: Java/J2EE, SQL, Oracle, JSP 2.0, JSON, JavaScript, WebLogic 10.0, HTML, JDBC, Spring, Hibernate, XML, JMS, log4j, JUnit, Servlets, MVC, Eclipse.
