Hadoop Developer Resume

Hazelwood, MO


  • 7 years of overall IT experience, including strong experience in the Big Data ecosystem and Java / J2EE related technologies.
  • Hands-on experience with the Hadoop ecosystem, including HDFS, Spark, Hive, Pig, Sqoop, Impala, Oozie, Flume, Kafka, HBase, ZooKeeper, and MapReduce
  • Extensive understanding of Hadoop architecture, workload management, schedulers, scalability, and components such as YARN and MapReduce
  • Hands-on experience with RDD architecture, implementing Spark operations on RDDs, and optimizing transformations in Spark
  • Exposure to Spark Streaming and Spark SQL in a production environment
  • Worked on building, configuring, monitoring, and supporting Cloudera Hadoop CDH5
  • Good knowledge of Hadoop cluster architecture and cluster monitoring
  • Experience in managing and reviewing Hadoop log files
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs
  • Extensive experience in data ingestion technologies, such as Sqoop, Flume, and Kafka
  • Expertise in Java, Scala and scripting languages like Python
  • Deep understanding and knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra
  • Involved in setting up standards and processes for Hadoop-based application design and implementation
  • Well versed in importing and exporting data between HDFS and relational database systems using Sqoop
  • Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns
  • Expert in managing Hadoop clusters using Cloudera Manager tool
  • Involved in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications
  • Experience in administration, installation, configuration, troubleshooting, security, backup, performance monitoring, and fine-tuning of Red Hat Linux
  • Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases; scripting to deploy monitors and checks and to automate critical system administration functions
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting
  • Experience in Java, JSP, Servlets, EJB, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, RMI, JavaScript, Ajax, jQuery, XML, and HTML
  • Ability to adapt to evolving technology, with a strong sense of responsibility and accomplishment
  • Conversant with Agile methodology standards and Test-Driven Development


Big Data Skillset - Frameworks & Environments: Cloudera CDH, Hortonworks HDP, Hadoop 1.0, Hadoop 2.0, HDFS, MapReduce, Pig, Hive, Impala, HBase, Data Lake, Cassandra, MongoDB, Sqoop, Oozie, ZooKeeper, Flume, Apache Spark, Storm, Kafka, YARN, Falcon, Avro

Amazon Web Services (AWS): Elastic MapReduce (EMR) cluster, EC2 instances, Amazon S3, Amazon Redshift, Amazon CloudFront

JAVA & J2EE Technologies: Core Java (Java 8 & JavaFX), Hibernate framework, Spring framework, JSP, Servlets, JavaBeans, JDBC, Java Sockets, JavaScript, jQuery, JSF, PrimeFaces, XML, EJB, HTML, XHTML, CSS, SOAP, XSLT, DHTML

Messaging Services: JMS, MQ Series, MDB

J2EE MVC Frameworks: Struts … Struts 2.1, Spring 3.2, MVC, Spring Web Flow, AJAX

IDE Tools: Eclipse, NetBeans, Spring Tool Suite, Hue (Cloudera specific)

Web services & Technologies: XML, HTML, XHTML, HTML5, AJAX, jQuery, CSS, JavaScript, AngularJS, VBScript, WSDL, SOAP, JDBC, ODBC

Architectures: REST, MVC

Databases & Application Servers: Oracle 8i, 9i, 10g & 11i, MySQL, DB2 8.x / 9.x, Microsoft SQL Server 2000, MS Access, Teradata, PostgreSQL, Cassandra, HBase, MongoDB

Other Tools: PuTTY, WinSCP, Data Lake, Talend, Tableau, GitHub, SVN, CVS


Confidential, Hazelwood, MO

Hadoop Developer


  • Experience with Hadoop ecosystem components such as HBase, Sqoop, ZooKeeper, Oozie, Hive, and Pig on the Cloudera Hadoop distribution.
  • Developed Pig and Hive UDFs in Java to extend Pig and Hive, and wrote Pig scripts for sorting, joining, filtering, and grouping data.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data from various sources
  • Developed Spark programs in Scala, created Spark SQL queries, and developed Oozie workflows for Spark jobs
  • Prepared Oozie workflows with Sqoop actions to migrate data from relational databases such as Oracle and Teradata to HDFS
  • Developed application-specific Spark programs for faster data processing than standard MapReduce programs.
  • Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL
  • Used Sqoop to load data into HBase and Hive
  • Wrote Hive queries to analyze the data and generate end reports for business users
  • Worked on scalable distributed computing systems, software architecture, data structures, and algorithms using Hadoop, Apache Spark, and Apache Storm, and ingested streaming data into Hadoop using Spark, the Storm framework, and Scala.
  • Good experience with NoSQL databases such as MongoDB.
  • Responsible for creating producer and consumer APIs using Kafka
  • Experienced in handling large datasets using Spark's in-memory capabilities, broadcast variables, efficient joins, transformations, and other features
  • Developed Spark code using Spark SQL and Spark Streaming for faster testing and processing of data.
  • Used Spark for interactive queries, processing of streaming data, and integration with a popular NoSQL database for huge volumes of data.
  • Developed a data pipeline using Kafka, HBase, Mesos, Spark, and Hive to ingest, transform, and analyze customer behavioral data

Environment: Hadoop, HDFS, CDH, Pig, Hive, Oozie, ZooKeeper, HBase, Spark, Storm, Spark SQL, NoSQL, Scala, Kafka, Mesos, MongoDB

Confidential, Atlanta, GA

Hadoop Developer


  • Involved in discussions with business users to gather requirements
  • Analyzed the requirements to develop the framework
  • Designed and developed architecture for data services ecosystem spanning Relational, NoSQL and Big Data technologies
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop / Big Data concepts
  • Developed Spark Streaming scripts in Java to load raw files and the corresponding processed metadata files into AWS S3 and an Elasticsearch cluster.
  • Developed Python scripts to get the most recent S3 keys from Elasticsearch
  • Developed Python scripts to fetch S3 files using the Boto3 module
  • Implemented PySpark logic to transform and process data in various formats such as XLSX, XLS, JSON, and TXT
  • Built scripts to load PySpark-processed files into the Redshift database, using a variety of PySpark logic
  • Developed scripts to monitor and capture the state of each file as it goes through processing
  • Designed and developed a real-time stream processing application using Spark, Kafka, Scala, and Hive to perform streaming ETL and apply machine learning
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, and used Oozie Operational Services for batch processing and dynamic workflow scheduling
  • Migrated existing applications and developed new applications using AWS cloud services
  • Worked with data investigation, discovery, and mapping tools to scan every data record from many sources
  • Implemented a shell script to automate the whole process
  • Extracted data from SQL Server to create automated visualization reports and dashboards in Tableau
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files

Environment: AWS S3, Java, Maven, Python, Spark, Kafka, Elasticsearch, Amazon Redshift Db, Shell script, PySpark, Pig, Hive, Oozie, JSON

Confidential, Mountain View, CA

Hadoop Developer


  • Developed simple to complex MapReduce jobs in Java for processing and validating data.
  • Developed a data pipeline using Sqoop, Spark, MapReduce, and Hive to ingest, transform, and analyze customer behavioral data.
  • Exported analyzed data to relational databases using Sqoop for visualization to generate reports for the BI team
  • Implemented Spark jobs using Python and Spark SQL for faster data processing, and algorithms for real-time analysis in Spark.
  • Used Spark for interactive queries, processing of streaming data, and integration with a popular NoSQL database for huge volumes of data.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra. Streamed data in real time using Spark with Kafka.
  • Developed Kafka producers and consumers in Java, integrated them with Apache Storm, and ingested data into HDFS and HBase by implementing rules in Storm
  • Built a prototype for real-time analysis using Spark Streaming and Kafka
  • Moved log files generated from various sources to HDFS through Flume for further processing
  • Created Hive tables and worked on them using HiveQL, and performed data analysis using Hive and Pig.
  • Developed Oozie workflows to manage and schedule jobs on the Hadoop cluster, triggering daily, weekly, and monthly batch cycles
  • Experience with job workflow scheduling and monitoring tools such as Oozie and ZooKeeper
  • Expertise in extending Hive and Pig core functionality by writing custom User Defined Functions (UDFs)
  • Used Impala to pull data from Hive tables
  • Worked on Apache Flume to collect and aggregate large amounts of log data and store it on HDFS for further analysis
  • Created and developed end-to-end data ingestion into Hadoop
  • Involved in the architecture and design of a distributed time-series database platform using NoSQL technologies such as Hadoop / HBase and ZooKeeper
  • Integrated NoSQL databases such as HBase with MapReduce to move bulk data into HBase
  • Efficiently put and fetched data to / from HBase by writing MapReduce jobs

Environment: Hadoop, Kafka, Spark, Sqoop, Spark SQL, Spark Streaming, Hive, Scala, Pig, NoSQL, Impala, Oozie, HBase, ZooKeeper


Java Developer


  • Identified system requirements and developed system specifications; responsible for high-level design and development of use cases
  • Involved in designing database connections using JDBC.
  • Organized and participated in meetings with clients and team members.
  • Developed the web-based Bristow application using J2EE (Spring MVC Framework), POJOs, JSP, JavaScript, HTML, jQuery, business classes, and queries to retrieve data from the backend.
  • Developed client-side validation techniques using jQuery.
  • Worked with Bootstrap to develop responsive web pages.
  • Implemented client-side and server-side data validations using JavaScript.
  • Responsible for customizing data model for new applications by using Hibernate ORM technology
  • Involved in the implementation of DAO and DTO using Spring with Hibernate ORM.
  • Implemented Hibernate for the ORM layer in transacting with the MySQL database.
  • Developed authentication and access control services for the application using Spring LDAP.
  • Experience in event-driven applications using AJAX, object-oriented JavaScript, JSON, and XML. Good knowledge of developing asynchronous applications using jQuery. Valuable experience with form validation using regular expressions and jQuery Lightbox.
  • Used MySQL for the EIS layer.
  • Involved in the design and development of the UI using HTML, JavaScript, and CSS.
  • Designed and developed various data gathering forms using HTML, CSS, JavaScript, JSP, and Servlets.
  • Developed user interface modules using JSP, Servlets, and an MVC framework.
  • Experience in implementing J2EE standards and MVC2 architecture using the Struts framework.
  • Developed J2EE components in the Eclipse IDE.
  • Used JDBC to invoke stored procedures and for database connectivity to SQL databases.
  • Deployed the applications on the Tomcat application server
  • Developed RESTful web services using JSON.
  • Created JavaBeans accessed from JSPs to transfer data across tiers.
  • Modified databases using SQL, PL/SQL, stored procedures, triggers, and views in Oracle 9i.

Environment: Java, JSP, Servlets, JDBC, Eclipse, Web services, Spring 3.0, Hibernate 3.0, MySQL, JSON, Struts, HTML, JavaScript, CSS.


Java Developer


  • Used Eclipse to write code for JSPs and Servlets.
  • Involved in designing the user interface using JSPs.
  • Developed the application using Core Java concepts.
  • Used JDBC to invoke stored procedures and for database connectivity to Oracle.
  • Used the Struts framework along with JSP and HTML5 to construct the dynamic web pages for the application.
  • Participated in feature team meetings and code review meetings.
  • Responsible for writing SQL queries using MySQL and Oracle 10g.
  • Developed various J2EE components such as Servlets, JSP, AJAX, SAX, and JMS.
  • Used the Spring MVC framework to enable interactions between the JSP / View layer and implemented various design patterns.
  • Utilized JSP, HTML5, CSS3, Bootstrap, and AngularJS for front-end development.
  • Used JPA and Hibernate annotations for defining object-relational metadata.
  • Implemented the business layer using Core Java and Spring Beans with dependency injection and Spring annotations.
  • Used a microservice architecture with interacting Spring Boot-based services, leveraging AWS to build, test, and deploy identity microservices

Environment: Java, J2EE, JSP, JPA, AJAX, SAX, JMS, HTML5, CSS3, Bootstrap, AngularJS, JavaScript, Hibernate, Spring MVC, Eclipse, Oracle, SQL, MySQL, Spring Beans