
Hadoop Developer Resume

Latham, NY


  • Over 7 years of extensive IT experience in all phases of the software development life cycle (SDLC), including 4 years of hands-on experience with Hadoop, HDFS, the MapReduce framework, and Hadoop ecosystem tools such as Hive, Hue, Pig, Sqoop, HBase, Zookeeper, Oozie, Kafka, and Apache Spark.
  • Excellent programming skills at a higher level of abstraction using Scala and Spark.
  • Thorough knowledge of data extraction, transformation, and loading in Hive, Pig, and HBase.
  • Managed and scheduled batch jobs on a Hadoop cluster using Oozie.
  • Experienced in using Sqoop to import data into HDFS from RDBMS and vice versa.
  • Good knowledge of single-node and multi-node cluster configurations.
  • Created Hive tables to store data into HDFS and processed data using HiveQL.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on experience installing, configuring, and using Hadoop components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, and Flume.
  • Extensive experience writing MapReduce programs and Hive and Pig scripts, and working with HDFS.
  • Good understanding and knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra.
  • Worked on importing and exporting data into HDFS and Hive using Sqoop.
  • Wrote Hive queries for data analysis to meet the requirements.
  • Strong working knowledge and ability to debug complex problems.
  • Basic knowledge of Linux and Unix; well versed in core Java.
  • Worked with Apache Spark, a fast, general-purpose engine for large-scale data processing, integrated with the functional programming language Scala.
  • Extended Hive and Pig core functionality with custom user-defined functions (UDFs), user-defined table-generating functions (UDTFs), and user-defined aggregate functions (UDAFs).
  • Familiar with requirement extraction, analysis, and design document preparation.
  • Excellent oral and written communication skills.
  • Collaborated well across technology groups.
  • Major strengths include the ability to learn new technologies quickly and adapt to new environments.
  • Good experience working in agile development environments, including Scrum methodology.


Programming Languages: SQL, Java, J2EE, Scala and Unix shell scripting

Big Data Ecosystem: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Impala, Hue, Sqoop, Kafka, Oozie, Flume, Zookeeper, Spark, Cloudera and Hortonworks.

Databases & NoSQL: Oracle, Teradata, MySQL, SQL Server, DB2; familiar with HBase (NoSQL).

Scripting & Query Languages: UNIX shell scripting, SQL and PL/SQL.

Hadoop Paradigms: MapReduce, YARN, In-memory computing, High Availability, Real-time Streaming.

Other Tools: Eclipse, IntelliJ, SVN, GitHub, Jira, BitBucket.

Methodology: Agile, Waterfall


Confidential, Latham, NY

Hadoop Developer


  • Worked with MapReduce programs on Apache Hadoop to process big data.
  • Used Pig, Hive, Sqoop, HBase, Flume, and Impala.
  • Imported and exported data between HDFS and relational database systems using Sqoop.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Analyzed data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Familiar with the Java virtual machine (JVM) and multi-threaded processing.
  • Worked with HBase as a NoSQL database.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and Zookeeper.
  • Experience in designing, developing, and implementing connectivity products that allow efficient exchange of data between the core database engine and the Hadoop ecosystem.

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, MySQL, Spark, Cassandra, Netezza, Sqoop, Oozie, VersionOne, Shell.

Confidential, Hartford, CT

Hadoop Developer


  • Coordinated with the BA team for finalization of requirements.
  • Responsible for generating actionable insights from complex data to drive real business results for various application teams.
  • Provided solutions by writing Pig Latin scripts to process HDFS data.
  • Effectively used Sqoop to migrate data from RDBMS to HDFS.
  • Effectively used SerDes to load data into Hive tables.
  • Developed Spark code using Scala and Spark SQL for faster processing and testing.
  • Retrieved data from databases such as MySQL and DB2 into HDFS using Sqoop and ingested it into HBase.
  • Worked with Apache Spark, a fast, general-purpose engine for large-scale data processing, integrated with the functional programming language Scala.
  • Collected, aggregated, and moved data from various sources using Apache Flume and Kafka.
  • Developed Hive Queries to analyze the data in HDFS to identify issues and behavioral patterns.
  • Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Monitored Hadoop scripts which take the input from HDFS and load the data into Hive.
  • Effectively used Oozie to develop automated workflows of Sqoop, MapReduce, and Hive jobs.
  • Provided post-implementation support and resolved various production issues and discrepancies.
  • Experienced in handling administration activities using Cloudera Manager.
  • Conducted code walkthroughs and reviews for the modules developed.
  • Developed Shell scripts to automate routine DBA tasks.
  • Extensively involved in installing and configuring the Cloudera distribution of Hadoop, including the NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
  • Successfully delivered various projects for this application within the stipulated timelines.

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, MySQL, Spark, Cassandra, Netezza, Sqoop, Oozie, VersionOne, Shell, SVN.

Confidential, Beaverton, OR

Hadoop Developer


  • Built a Hadoop cluster ensuring high availability for the NameNode, mixed-workload management, performance optimization, health monitoring, and backup and recovery across one or more nodes.
  • Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, HBase, and Sqoop.
  • Experienced in collecting, aggregating, and moving large amounts of streaming data into HDFS using Flume.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Interacted with different system groups for analysis of systems.
  • Created tables and views in Teradata according to the requirements.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers/sensors.
  • Performed Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
  • Managed and reviewed Hadoop log files.
  • Hands-on experience with AWS services such as Redshift clusters and Route 53 domain configuration.
  • Responsible for smooth, error-free solutions and integration with Hadoop.
  • Designed a data warehouse using Hive.
  • Used the Control-M and Oozie scheduling tools to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
  • Participated in Scrum calls, grooming sessions, and demo meetings; very good experience with agile methodology.

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, MySQL, Control-M, Ubuntu, Oracle, Spark, Java, Cassandra, Netezza, Sqoop, Oozie, AWS, VersionOne, Shell, SVN.

Confidential, Charlotte, NC

Hadoop Developer


  • Primary responsibilities included building scalable distributed data solutions using the Hadoop ecosystem.
  • Used Sqoop to transfer data between databases (Oracle & Teradata) and HDFS and used Flume to stream the log data from servers.
  • Developed MapReduce programs for pre-processing and cleansing the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Experienced in managing and reviewing Hadoop log files.
  • Involved in loading data from UNIX file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Worked extensively on combiners, partitioning, and the distributed cache to improve the performance of MapReduce jobs.
  • Implemented various analytical algorithms as MapReduce programs applied to HDFS data.
  • Used Pig to perform data transformations, event joins, filters, and some pre-aggregations before storing the data in HDFS.
  • Implemented partitions and buckets in Hive for optimization.
  • Created Hive tables, loaded structured data, and wrote Hive queries, which run internally as MapReduce jobs.

Environment: Apache Hadoop, Cloudera, Hive, Pig, Sqoop, Zookeeper, HBase, Java, Oozie, Oracle, Teradata, and UNIX Shell Scripting.


Hadoop Developer


  • Worked on Sqoop jobs to import data from Oracle into HDFS.
  • Performance tuning of Spark and Sqoop jobs.
  • Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
  • Hands-on design and development of an application using Hive UDFs.
  • Responsible for writing Hive queries to analyze data in the Hive warehouse.
  • Supported data analysts in running Pig and Hive queries.
  • Created partitioned tables in Hive.
  • Worked on data modeling for dimension and fact tables in the Hive warehouse.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.

Environment: Hadoop 2.7, Spark 1.4.1, Scala 2.10, SBT 0.13, Sqoop 1.4.6, MapReduce, HDFS, Pig, Hive 0.13, Java, Oracle, Windows.


Java/ J2EE Developer


  • Developed Java code to meet specifications and designs, using best practices.
  • Developed low-level, component-based design documentation (UML).
  • Developed the DAO layer for the application using Spring Hibernate Template support.
  • Implemented transactions using the Spring Framework.
  • Used Spring MVC and Web Flow to bind web parameters to business logic.
  • Used the Ant and Maven build tools to build JAR and WAR files and deployed WAR files to target servers.
  • Maintained relationships between objects using Spring IoC.
  • Wrote extensive core Java and multithreading code in the application.
  • Implemented a JMS topic to receive input in the form of XML and parsed the messages using a common XSD.
  • Wrote JDBC statements, prepared statements, and callable statements in Java, JSPs, and Servlets.
  • Followed Scrum approach for the development process.
  • Modified and added database functions, procedures and triggers pertaining to business logic of the application.
  • Used TOAD to check and verify all the database turnaround times and also tested the connections for response times and query round trip behavior.
  • Used Ant to build the code for production.
  • Used the Eclipse IDE for all coding in Java, Servlets, and JSPs.
  • Used the JSP Standard Tag Library (JSTL) to implement logic inside the JSPs.
  • Extensively used plain JavaScript and the jQuery library for client-side validation.
  • Used AJAX to fetch data from the server asynchronously as JSON objects.
  • Used JIRA as a bug-tracking tool for updating bug reports.
  • Focused on converting existing application features for globalization, that is, internationalization of the web presentation.
  • Worked with an Oracle 10g database for storing and retrieving application data.
  • Involved in configuring JMS in application developer.
  • Developed MQ JMS queues for asynchronous messaging and web services using SOAP/WSDL.
  • Involved in WebLogic administration, such as setting up data sources and deploying applications.
  • Configured and deployed the Web Application Archive (WAR) on WebLogic Application Server.

Environment: Core Java, J2EE, Servlets, JavaScript, XML, Tomcat, WebLogic, SOAP, Eclipse, AJAX, SVN, JDBC, Web Services, XSLT, CSS, DOM, HTML, Maven, JSP, Ant, DB2, JUnit, Oracle.
