
Hadoop Developer Resume



  • Around 5 years of professional IT experience, with hands-on experience in the development of Big Data technologies and data analytics.
  • Experienced as a Hadoop Developer with good knowledge of MapReduce, YARN, HBase, Cassandra, Pig, Hive, and Sqoop.
  • Extensive work experience in Object Oriented Analysis and Design and Java/J2EE technologies, including HTML5, XHTML, DHTML, JavaScript, JSTL, CSS, AJAX, and Oracle, for developing server-side applications and user interfaces.
  • Experience with distributed systems, large-scale non-relational data stores, NoSQL map-reduce systems, data modeling, database performance tuning, and multi-terabyte data warehouses.
  • Excellent understanding and knowledge of NoSQL databases like HBase and Cassandra.
  • Excellent understanding of the Hadoop architecture, the Hadoop Distributed File System, and its APIs.
  • Good exposure to the Apache Hadoop MapReduce programming architecture and APIs.
  • Experienced in running MapReduce and Spark jobs over YARN.
  • Experienced in writing custom MapReduce I/O formats and key-value formats.
  • Hands-on experience in installing, configuring, and maintaining Hadoop clusters.
  • Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
  • Familiar with writing MapReduce jobs for processing data over a Cassandra cluster.
  • Experienced in writing MapReduce jobs over HBase, custom Filters, and Coprocessors.
  • Hands-on experience in import/export of data using the Hadoop data management tool Sqoop.
  • Used Hive and Pig for performing data analysis.
  • Familiar with MongoDB concepts and its architecture.
  • Experienced with moving data from Teradata to HDFS using Teradata connectors.
  • Good experience in all the phases of the Software Development Life Cycle (requirements analysis, design, development, verification and validation, and deployment).
  • Hands-on experience with productionizing Hadoop applications (e.g., administration, configuration management, monitoring, debugging, and performance tuning).
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience working with Java/J2EE, JDBC, ODBC, JSP, Java Beans, and Servlets.
  • Experience with AJAX, REST, and JSON.
  • Experience using IDEs like Eclipse, and experience with DBMSs like Oracle and MySQL.
  • Evaluated and proposed new tools and technologies to meet the needs of the organization.
  • Good knowledge of Unified Modeling Language (UML), Object Oriented Analysis and Design, and Agile methodologies.
  • An excellent team player and self-starter with effective communication skills.
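
The Hive partitioning and bucketing approach noted above can be illustrated with a minimal HiveQL sketch, run from the Hive CLI. This is a hedged example, not an actual table from these projects: the table name, columns, and bucket count are all hypothetical.

```shell
# Minimal sketch, assuming the Hive CLI is available on the cluster; all names
# are hypothetical. Partitioning by date lets Hive prune whole directories at
# query time; bucketing by user_id enables sampling and bucketed map joins.
hive -e "
CREATE TABLE clickstream (
  user_id BIGINT,
  url     STRING,
  ts      STRING
)
PARTITIONED BY (dt STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
"
```

When populating such a table with INSERT statements, older Hive versions also need `SET hive.enforce.bucketing = true;` so that writes honor the declared bucket count.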


Hadoop/Big Data/NoSQL Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, YARN, ZooKeeper, HBase

Programming Languages: Java, Python, C, SQL, PL/SQL, Shell Script

IDE Tools: Eclipse, Rational Team Concert, NetBeans

Frameworks: Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit, JAXB

Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, REST Web Services

Application Servers: JBoss, Tomcat, WebLogic, WebSphere

Databases: Oracle 11g/10g/9i, MySQL, DB2, Derby, MS SQL Server

Operating Systems: Linux, UNIX, Windows

Build Tools: Jenkins, Maven, Ant

Reporting/BI Tools: Jasper Reports, iReport, Tableau, QlikView


Confidential, Columbus, IN


  • Implemented the Hadoop framework to capture user navigation across the application, in order to validate the user interface and provide analytic feedback/results to the UI team.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Performed analysis on the unused user navigation data by loading it into HDFS and writing MapReduce jobs. The analysis provided inputs to the new APM front-end developers and the Lucent team.
  • Wrote Spark programs in Scala and ran Spark jobs on YARN.
  • Worked with Cassandra for non-relational data storage and retrieval on enterprise use cases.
  • Wrote MapReduce jobs using the Java API and Pig Latin.
  • Loaded data from Teradata to HDFS using Teradata Hadoop connectors.
  • Used Flume to collect, aggregate, and store the web log data on HDFS.
  • Wrote Pig scripts to run ETL jobs on the data in HDFS.
  • Used Hive to analyze the data and identify different correlations.
  • Wrote ad hoc HiveQL queries to process data and generate reports.
  • Involved in HDFS maintenance and administered it through the Hadoop Java API.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
  • Worked on HBase. Configured a MySQL database to store the Hive metadata.
  • Used Sqoop to load data from MySQL to HDFS on a regular basis.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Automated all the jobs for pulling data from the FTP server and loading it into Hive tables, using Oozie workflows.
  • Involved in creating Hive tables and working on them using HiveQL.
  • Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
  • Maintained and monitored the clusters.
  • Utilized the Agile Scrum methodology to help manage and organize a team of 4 developers, with regular code review sessions.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
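
The recurring MySQL-to-HDFS loads described above might look like the following hedged Sqoop sketch. The connection string, credentials, table, and column names are placeholders rather than actual production values.

```shell
# Hypothetical incremental Sqoop import; --incremental append re-imports only
# rows whose check column exceeds the recorded --last-value, which suits
# scheduled "regular basis" loads.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/appdb \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --incremental append \
  --check-column order_id \
  --last-value 0 \
  --num-mappers 4
```

In practice a command like this is wrapped in an Oozie coordinator (or a cron job) so the scheduler tracks the last imported value between runs.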

Environment: Hadoop, MapReduce, HDFS, Flume, Pig, Hive, Spark, Scala, YARN, HBase, Sqoop, ZooKeeper, Cloudera, Oozie, Cassandra, NoSQL, ETL, MySQL, Agile, Windows, UNIX Shell Scripting, Teradata.

Confidential, Princeton, NJ


  • Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
  • Developed MapReduce programs that filter out bad and unnecessary claim records and identify unique records based on account type.
  • Processed semi-structured and unstructured data using MapReduce programs.
  • Implemented daily cron jobs that automate the parallel tasks of loading the data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs.
  • Implemented custom DataTypes, InputFormat, RecordReader, OutputFormat, and RecordWriter for MapReduce computations.
  • Worked on a CDH4 cluster on CentOS.
  • Successfully migrated a legacy application to a Big Data application using Hive/Pig/HBase at the production level.
  • Transformed date-related data into an application-compatible format by developing Apache Pig UDFs.
  • Developed a MapReduce pipeline for feature extraction and tested the modules using MRUnit.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Responsible for performing extensive data validation using Hive.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive for efficient data access.
  • Worked on different types of tables, such as external tables and managed tables.
  • Used the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Involved in installing and configuring Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
  • Involved in designing and developing non-trivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie.
  • Used DML statements to perform different operations on Hive tables.
  • Developed Hive queries for creating foundation tables from stage data.
  • Used Pig as an ETL tool for transformations, event joins, filters, and some pre-aggregations.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Worked with Sqoop to export analyzed data from the HDFS environment into an RDBMS for report generation and visualization purposes.
  • Queried and analyzed data from DataStax Cassandra for quick searching, sorting, and grouping.
  • Developed a mapping document for reporting tools.
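
The Pig-based ETL described above (load, filter bad claim records, group, pre-aggregate) can be sketched in a few lines of Pig Latin. The paths, field names, and filter conditions below are hypothetical illustrations, not the actual job.

```shell
# Hypothetical Pig Latin ETL run via the Pig CLI; all names are placeholders.
pig -e "
claims  = LOAD '/data/raw/claims' USING PigStorage(',')
          AS (claim_id:chararray, acct_type:chararray, amount:double);
-- filter out bad and unnecessary records
good    = FILTER claims BY acct_type IS NOT NULL AND amount > 0.0;
-- pre-aggregate per account type
by_type = GROUP good BY acct_type;
totals  = FOREACH by_type GENERATE group AS acct_type, SUM(good.amount) AS total;
STORE totals INTO '/data/out/claim_totals' USING PigStorage(',');
"
```

A custom filter or date-normalization step would typically be a Java UDF registered with `REGISTER my-udfs.jar;` and invoked inside the FOREACH.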

Environment: Apache Hadoop, HDFS, MapReduce, Java (JDK 1.6), MySQL, DbVisualizer, Linux, Sqoop, Apache Hive, Apache Pig

Confidential, Boston, MA


  • Installed and configured Hadoop clusters for the Dev, QA, and production environments.
  • Installed and configured the Hadoop NameNode HA service using ZooKeeper.
  • Installed and configured Hadoop security and access controls using Kerberos and Active Directory.
  • Imported data from a SQL Server database to HDFS using Sqoop.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used Eclipse to develop J2EE components. Components involved a lightweight JSP front end; CSS and scripts were also part of front-end development.
  • Designed the J2EE project with the Front Controller pattern.
  • Designed CSS and tag libraries for the front end.
  • Made extensive use of JavaScript and AJAX to control all user functions.
  • Developed front-end screens using jQuery, JavaScript, Java, and CSS.
  • Attended business and requirements meetings.
  • Used Ant to create build scripts for deployment and to run the JUnit test cases.
  • Used VSS extensively to check code in and out, version it, and maintain production, test, and development views appropriately.
  • Understood the sources of data and organized them in a structured table setup.
  • Delivered daily reports and data sheets to clients for their business meetings.
  • Performed code reviews, unit testing, and local integration testing.
  • Integrated application modules and components and deployed them on the target platform.
  • Involved in the requirements study and preparation of the detailed software requirement specification.
  • Involved in low-level and high-level design and prepared HLD and LLD documents in Visio.
  • Provided testing support during integration and production.
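
The SQL Server-to-HDFS import mentioned above might resemble the following hedged Sqoop sketch. The server, database, credentials, and table names are placeholders; the only assumption is that Microsoft's SQL Server JDBC driver is on the Sqoop classpath.

```shell
# Hypothetical Sqoop import from SQL Server straight into a Hive staging table.
sqoop import \
  --connect 'jdbc:sqlserver://dbhost:1433;databaseName=claims' \
  --username loader -P \
  --table policies \
  --hive-import \
  --hive-table staging_policies \
  --num-mappers 1
```

`--hive-import` creates the Hive table and generates the matching DDL automatically, which pairs naturally with the Hive table creation and querying work listed above.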

Environment: Hadoop, Hive, Sqoop, ZooKeeper, MapReduce, WebSphere, DB2, IBM RAD, JDK 1.6, JSP, J2EE, HTML, JavaScript, CSS, Servlets, Struts, JDBC, Oracle, SQL, Log4j, JUnit, VSS, Ant, Shell script, Visio.
