
Hadoop Developer Resume




  • Overall 8 years of professional experience building best-in-class technical solutions for numerous companies, including those in the Banking, Finance, and Retail sectors, using Hadoop Ecosystem tools and Java/J2EE technologies.
  • 3 years of end-to-end Big Data implementation experience, with strong experience on major Hadoop Ecosystem components such as MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume, Spark, and Storm.
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Experience in installing, configuring, supporting, and managing Hadoop clusters using Cloudera (CDH3, CDH4) distributions and on Amazon Web Services (AWS).
  • Good understanding of MPP databases such as Teradata, Greenplum, and Netezza.
  • In-depth understanding of Hadoop architecture and its components, including HDFS, MapReduce, and the Hadoop 2.x additions (HDFS Federation, High Availability, and YARN), along with a good understanding of workload management, schedulers, scalability, and distributed platform architectures.
  • Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining, machine learning and advanced data processing.
  • Experienced in performing analytics on structured data using Hive queries, operations, joins, query tuning, SerDes, and UDFs.
  • Strong experience with Hadoop distributions such as Cloudera, Hortonworks, and MapR.
  • Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as Cassandra and HBase.
  • Strong experience and knowledge of real time data analytics using Flume and Spark.
  • Experience working with Spark features such as RDD transformations, Spark MLlib, and Spark SQL.
  • Experienced in handling semi/unstructured data using complex map-reduce programs.
  • Wrote multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, CSV, and other compressed formats.
  • Extensive experience working with structured data using HiveQL, optimizing queries, and incorporating complex UDFs into business logic.
  • Experienced in migrating ETL transformations using Pig Latin Scripts, transformations, join operations.
  • Extensively developed simple-to-complex MapReduce streaming jobs implemented using Hive and Pig, and optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
  • Contributed to a POC implementing applications on the Spark framework, which provides high-level APIs in Scala, Java, and Python.
  • Experienced in working with streaming data using Flume sources, interceptors.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Worked with the BI tool Tableau and developed reports with it.
  • Extensive experience working with SOA-based architectures using REST web services (JAX-RS) and SOAP web services (JAX-WS).
  • Hands-on experience designing and coding web applications using Core Java and J2EE technologies such as Spring, Hibernate, JMS, and AngularJS.
  • Good experience designing and developing database objects such as tables, stored procedures, triggers, and cursors using PL/SQL.
  • Experience in developing solutions to analyze large data sets efficiently
  • Experience in working with web development technologies such as HTML, CSS, JavaScript and JQuery.
  • Extensive experience building and deploying multi-module projects using Ant, Maven, and CI servers such as Jenkins.
  • Extensive experience working with UNIX/Linux and writing shell scripts.
  • Hands-on experience with IDE tools such as Eclipse, MyEclipse, and NetBeans.
  • Good working experience on Installing and maintaining the Linux servers.
  • Good experience working with Agile, Scrum, and Waterfall methodologies.
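The MapReduce streaming jobs summarized above follow a standard mapper/shuffle/reducer pattern. Below is a minimal, stdlib-only Python sketch of that pattern, in the style of a Hadoop Streaming word count; the function names (`map_line`, `reduce_sorted`) are illustrative, not from any specific project.

```python
# Illustrative Hadoop Streaming-style mapper/reducer pair.
# In a real streaming job, the mapper and reducer would be separate
# scripts reading stdin; here they are plain functions for clarity.
from itertools import groupby

def map_line(line):
    """Mapper: emit one (word, 1) pair per word on the line."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_sorted(pairs):
    """Reducer: pairs must arrive sorted by key, as after the shuffle."""
    return {key: sum(count for _, count in group)
            for key, group in groupby(pairs, key=lambda kv: kv[0])}

if __name__ == "__main__":
    lines = ["big data big wins", "data pipelines"]
    # Simulate the shuffle: map everything, then sort by key before reducing.
    mapped = sorted(pair for line in lines for pair in map_line(line))
    print(reduce_sorted(mapped))
```

The sort between the map and reduce steps stands in for Hadoop's shuffle phase, which guarantees each reducer sees its keys grouped together.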


Big Data Ecosystem: Hadoop, MapReduce, YARN, Pig, Hive, HBase, Flume, Sqoop, Impala, Oozie, Zookeeper, Spark, Ambari, Mahout, MongoDB, Cassandra, Avro, Parquet, Snappy, Kafka.

Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5), Hortonworks, and MapR.

NoSQL Databases: Cassandra, MongoDB, HBase, DynamoDB.

RDBMS: Oracle 9i, MS SQL Server, MySQL, DB2.

Languages: Java, SQL, HTML, DHTML, JavaScript, JDBC, XML, and C/C++.

Java Technologies: JSP, Servlets, JavaBeans, JDBC, JNDI, EJB, Struts.

XML Technologies: XML, XSD, DTD, JAXP (SAX, DOM), JAXB.

Web Design Tools: HTML, DHTML, AJAX, JavaScript, jQuery, CSS, AngularJS, ExtJS, JSON.

DB Languages: MySQL, PL/SQL, PostgreSQL, Oracle.

Development / Build Tools: Eclipse, Ant, Maven, JUnit, Log4j, ETL.

Frameworks: Struts, Spring, Hibernate.

App/Web servers: WebSphere, WebLogic, JBoss, Tomcat.

Operating Systems: UNIX, Linux, macOS, and Windows variants.


Confidential - KS



  • Implemented Spark RDD transformations to map business analyses and applied actions on top of those transformations.
  • Handled importing of data from various sources, performed transformations using Pig, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Experienced in implementing different kinds of joins, such as map-side and reduce-side joins, to integrate data from different data sets.
  • Involved in loading data from the edge node to HDFS using shell scripting.
  • Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, Avro data files, and sequence files for log files.
  • Involved in loading data from the UNIX file system to HDFS.
  • Managed and scheduled jobs on a Cloudera Hadoop cluster using Oozie workflows and Java schedulers.
  • Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
  • Experience developing Pig Latin and HiveQL scripts for data analysis and ETL purposes, and extended the default functionality by writing User Defined Functions (UDFs) for data-specific processing.
  • Designed and developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs.
  • Used the Pig loader for loading tables from Hadoop to various clusters.
  • Used Impala to read, write, and query data in HDFS.
  • Experienced in migrating HiveQL to Impala to minimize query response time.
  • Responsible for performing extensive data validation using custom Pig UDFs.
  • Creating Hive tables, dynamic partitions, buckets for sampling, and working on them using Hive QL.
  • Extensively involved in writing complex MapReduce jobs in Java to process bulk data.
  • Experienced with different compression codecs such as LZO, Snappy, Bzip2, and Gzip to store data and optimize data transfer over the network, using the Avro, Parquet, and ORC file formats.
  • Implemented data ingestion systems by creating Kafka brokers, Java producers and consumers, and custom encoders.
  • Implemented separate custom Kafka partitioners with custom encoders across brokers depending on state.
  • Experience in managing nodes on Hadoop cluster and monitor Hadoop cluster job performance using Cloudera manager.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Developed predictive analytics using the Apache Spark Scala API.
  • Knowledge of Spark Core, Streaming, DataFrames, SQL, MLlib, and GraphX.
  • Used Spark stream processing to bring data in-memory and implemented RDD transformations and actions to process it in units.
  • Implemented caching for Spark transformations and actions so they could be used as reusable components.
  • Stored the data in tabular formats using Hive tables and Hive SerDe’s.
  • Experienced with optimizing techniques to get better performance from Hive Queries.
  • Created and worked Sqoop jobs with incremental load to populate Hive External tables.
  • Extracted files from Cassandra through Sqoop, placed them in HDFS, and processed them.
  • Extensive experience writing Pig scripts to transform raw data from several data sources into baseline data.
  • Used Maven to build and deploy the JARs for MapReduce, Pig, and Hive UDFs.
  • Developed optimal strategies for distributing web log data over the cluster, importing and exporting the stored web log data into HDFS and Hive using Sqoop.
  • Involved in End-to-End implementation of ETL logic.
  • Used Oozie to automate data loading into the Hadoop Distributed File System and the downstream processing.
  • Developed the Pig UDF’S to pre-process the data for analysis.
  • Set up Jenkins build jobs to provide continuous automated builds based on polling the Subversion source repository.
  • Involved in Agile methodologies, daily Scrum meetings, and sprint planning.
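The custom Kafka partitioners described above boil down to a deterministic key-to-partition mapping, so records with the same key always land on the same partition. The stdlib-only Python sketch below (CRC32 of the key, modulo partition count) only illustrates that idea; a real producer would instead plug a partitioner class into the Kafka client, and the function name here is made up.

```python
# Sketch of key-based partition selection, as a custom Kafka
# partitioner would do it (illustrative; not actual producer code).
from zlib import crc32

def choose_partition(key: str, num_partitions: int) -> int:
    """Route all records sharing a key to the same partition."""
    return crc32(key.encode("utf-8")) % num_partitions

if __name__ == "__main__":
    # Same key always maps to the same partition across calls.
    for key in ("account-1", "account-2", "account-1"):
        print(key, "->", choose_partition(key, 4))
```

Because the mapping depends only on the key and the partition count, per-key ordering is preserved within a partition, which is the usual reason for writing a custom partitioner.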

Environment: Hadoop, Scala, MapReduce, HDFS, Spark 1.4.1 (Spark Streaming, Spark MLlib, Spark GraphX, Spark SQL, Spark DataFrames), AWS, S3, Hive, IntelliJ IDEA, Cassandra, Maven, Hortonworks, Cloudera Manager, Pig, Linux, Python, PL/SQL.

Confidential - WI



  • Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Wrote MapReduce jobs in core Java to parse the web logs stored in HDFS.
  • Wrote JUnit test cases to test and debug MapReduce programs on the local machine.
  • Involved in loading data from UNIX file system to HDFS using Shell Scripting.
  • Imported and exported data between HDFS and an Oracle 10.2 database, and transferred data from other RDBMSs to HDFS, using Sqoop.
  • Worked on developing ETL processes to load data from multiple data sources to HDFS using Sqoop, perform structural modifications using Map-Reduce, analyze data using Hive and visualizing in dashboards.
  • Integrated Map Reduce with HBase to import bulk amount of data into HBase using Map Reduce Programs.
  • Experienced in converting ETL operations to Hadoop system using Pig Latin operations, transformations and functions.
  • Experienced in running Hadoop streaming jobs to process terabytes of formatted data using Python scripts.
  • Used Flume to collect, aggregate and store the web log data from different sources like web servers and pushed to HDFS.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Integrated Hadoop security with Active Directory by implementing Kerberos for authentication and Sentry for authorization.
  • Successfully loaded files to Hive and HDFS from HBase.
  • Implemented the workflows using Apache Oozie framework to automate tasks.
  • Performance tuning of Hive Queries.
  • Coordinated the team and resolved the team's issues both technically and functionally.
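The web-log parsing described above (MapReduce jobs over logs stored in HDFS) starts with a per-line parse step in the mapper. A minimal Python sketch of that step is shown below, assuming logs in the Common Log Format; the field names and the helper function are illustrative, not from the actual project.

```python
# Sketch of the per-line parse step a log-processing mapper performs
# (assumes Common Log Format; names here are illustrative).
import re

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return a dict of named fields, or None for malformed lines."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

if __name__ == "__main__":
    sample = ('127.0.0.1 - - [10/Oct/2015:13:55:36 -0700] '
              '"GET /index.html HTTP/1.1" 200 2326')
    print(parse_log_line(sample))
```

Returning None for malformed lines lets the mapper silently skip unparseable records, which is the usual defensive choice when processing raw server logs at scale.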

Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, CDH, Kafka, Spark, Storm, Apache Crunch, Python, Maven, Linux.

Confidential - St. Louis, MO



  • Designed and developed J2EE and Web applications to manage and deliver online enrollment for the participants.
  • Designed and developed the application using Core Java, J2EE, JMS, JavaScript, Struts, SOA, Hibernate, the Spring Batch framework, Spring AOP, CSS, and Axis web services.
  • Implemented GUI screens for viewing using Servlets, JSP, Tag Libraries, JSTL, JavaBeans, HTML, JavaScript and Struts framework using MVC design pattern.
  • Built, configured, and deployed web components on the WebLogic application server.
  • Extensively worked with JDBC programs using Oracle and MySQL databases and developed SQL and PL/SQL for Oracle to process the data.
  • Implemented a multi-threaded design to deliver good response times and avoid deadlocks and race conditions. Optimized the application for high availability and high performance using the load-balancing features of WebSphere.
  • Implemented custom JSP tags for displaying trader data.
  • Developed test cases using Junit for functionality and unit testing.
  • Wrote Java code for accessing trade data from Oracle and DB2 databases using JDBC API and SQL queries and accessed it from J2EE Web component.
  • Wrote Ant and shell scripts to automate some processes.
  • Developed Stateless session beans and Data Access Objects.
  • Defined transaction attributes for EJBs for deployment.
  • Developed the controller servlet that sends requests to the appropriate Action classes.
  • Developed Action Servlet for incoming client requests.
  • Used JavaScript and the Struts validation framework for front-end validations.
  • Used Struts Tags to tie the Struts view components to the rest of the framework.
  • Developed a function library using JavaScript.
  • Created the WSDL files for web services to publish the services to other applications.

Environment: Java, HTML, JavaScript, SQL Server, PL/SQL, JSP, Struts, Spring, Hibernate, Web Services, SOAP, SOA, Servlets, JSF, JDBC, JMS, JUnit, Oracle, Eclipse, SVN, XML, CSS, Log4j, Maven, Apache Tomcat.

Confidential, New York City, NY



  • Created design documents and reviewed with team in addition to assisting the business analyst / project manager in explanations to line of business.
  • Created Use case, Sequence diagrams, functional specifications and User Interface diagrams.
  • Responsible for understanding the scope of the project and requirement gathering.
  • Involved in analysis, design, construction and testing of the application
  • Developed the web tier using JSP to show account details and summary.
  • Assisted in the design and development of the Avon M-Commerce application from scratch using HTTP, XML, Java, Oracle objects, Toad, and Eclipse.
  • Used the Tomcat web server for development purposes.
  • Involved in creation of Test Cases for JUnit Testing.
  • Used Oracle as the database and Toad for query execution, and wrote SQL scripts and PL/SQL code for procedures and functions.
  • Developed user interfaces using JSP, HTML, XML and JavaScript.
  • Created Stored Procedures & Functions.
  • Actively involved in code review and bug fixing for improving the performance.

Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.




  • Created Use case, Sequence diagrams, functional specifications and User Interface diagrams using Star UML.
  • Involved in complete requirement analysis, design, coding and testing phases of the project.
  • Participated in JAD meetings to gather the requirements and understand the End Users System.
  • Developed the front-end user interface using J2EE, Servlets, JDBC, HTML, DHTML, CSS, XML, XSL, XSLT and JavaScript as per Use Case Specification.
  • Generated XML Schemas and used XML Beans to parse XML files.
  • Created Stored Procedures & Functions. Used JDBC to process database calls for DB2/AS400 and SQL Server databases.
  • Developed the code which will create XML files and Flat files with the data retrieved from Databases and XML files.
  • Responsible for Project Documentation, Status Reporting and Presentation.
  • Developed web application called iHUB (integration hub) to initiate all the interface processes using Struts Framework, JSP and HTML.
  • Developed EJBs (session beans and entity beans) on WebSphere Studio Application Developer.
  • Used different design patterns, such as MVC, EJB Session Facade, and controller servlets, while implementing the framework.
  • Built the front end using JSPs, jQuery, JavaScript, and HTML.
  • Built prototypes for internationalization.
  • Wrote stored procedures in DB2.
  • Designed and developed user interfaces using JSP, JavaScript, and HTML.
  • Built custom tags for JSPs and built the report module based on Crystal Reports.
  • Integrated data from multiple data sources and generated schema difference reports for the database using Toad.

Environment: Java 1.3, Servlets, JSPs, Java Mail API, JavaScript, HTML, J2EE, Struts, DB2, Spring Batch, XML processing, MySQL 2.1, Swing, Java Web Server 2.0, JBoss 2.0, RMI, Rational Rose, Red Hat Linux 7.1.
