
Senior Hadoop Developer Resume

New York, NY


  • Over 7.5 years of professional experience in the IT industry, including more than 3.5 years of hands-on expertise in Big Data processing using Hadoop, Hadoop Ecosystem implementation and maintenance, ETL and Big Data analysis operations.
  • Excellent understanding of Hadoop architecture and underlying frameworks including distributed storage (HDFS), resource management (YARN) and MapReduce programming paradigm.
  • Highly experienced in using various Hadoop Ecosystem projects such as MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, Flume and Oozie for ingestion, storage, querying, processing and analysis of Big Data.
  • Hands-on expertise in writing MapReduce jobs using Java.
  • Excellent understanding of Hadoop architecture and underlying framework including storage management.
  • Knowledge of the architecture and functionality of NoSQL databases such as HBase, Cassandra and MongoDB.
  • Experience in managing Hadoop clusters and services using Cloudera Manager.
  • Collected log data from various sources and integrated it into HDFS using Flume.
  • Knowledge of project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Experience in analyzing data using Pig and Hive and in developing custom User Defined Functions that bring Java methods and functionality into Pig Latin and HiveQL.
  • Experience in using ZooKeeper distributed coordination service for High-Availability.
  • Good knowledge of using Oozie Workflow Scheduler system for MapReduce jobs.
  • Good understanding of the architecture and functionality of NoSQL database (HBase).
  • Experience in importing and exporting data between HDFS and RDBMS using Sqoop.
  • Proficient in collecting log data from various sources and integrating it into HDFS using Flume.
  • Extensive experience in Hadoop Administration activities such as installation, configuration, maintenance and deployment of Big Data components and the underlying infrastructure of Hadoop cluster.
  • Experience in Hadoop cluster planning, monitoring and troubleshooting.
  • Hands-on expertise in multi-node cluster setup using Apache Hadoop and Cloudera Hadoop distributions.
  • Experience in managing and monitoring of Hadoop cluster and services using Cloudera Manager.
  • Experience in optimizing the configurations of MapReduce, Pig and Hive jobs for better performance.
  • Red Hat Certified System Administrator on Red Hat Enterprise Linux 7.
  • Good working knowledge in Puppet for configuration management.
  • Experience in working with Java/J2EE environment as a developer.
  • Well versed in Object Oriented Programming and Software Development Life Cycle.
  • In-depth understanding of Data Structures and Algorithms and Optimization.
  • Extensive experience with Databases such as Oracle 10g/11g, Microsoft SQL Server, MySQL.
  • Experience in writing complex SQL queries, PL/SQL Procedures, Functions and Triggers.


Hadoop and Big Data Technologies: Apache Hadoop, Cloudera CDH4/CDH5, HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, HBase, Cloudera Manager, Hue

Databases: Oracle 10g/11g, Microsoft SQL Server, MySQL, PostgreSQL, MS Access

Programming Languages: C, C++, Java

Scripting & Query Languages: Bash Shell Scripting, SQL, PL/SQL

Java/Web Technologies: J2EE, Spring, Hibernate, JSP, Servlets, JMS, HTML, CSS, XML, JavaScript, AJAX

Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows

Linux Servers: DNS, DHCP, FTP, NFS, NIS, LDAP, Samba, Squid, NTP, Apache Webserver

Storage Management: Disk Partitioning, Swap Partitioning, iSCSI Storage, LVM, Disk Quota, RAID

Security: firewalld, iptables, Kerberos, SSL/TLS, SELinux, SSH, TCP Wrappers

Virtualization and Cloud: VirtualBox, VMware Workstation, Amazon EC2

Networking: TCP/IP, Ethernet Networking, IP Addressing, IP Subnetting

Monitoring/Troubleshooting: Nagios Core, Ganglia, WhatsUp Gold, top, netstat, IPTraf, tcpdump, nmap, ps

Configuration Management: Puppet

Development Tools: Eclipse, NetBeans, Oracle SQL Developer, SQL*Plus Instant Client

Other Tools: Microsoft Visio, Crystal Reports, PuTTY


Confidential, New York, NY

Senior Hadoop Developer


  • Worked in a team that built big data analytic solutions using Cloudera Hadoop Distribution.
  • Migrated RDBMS (Oracle, MySQL) data into HDFS and Hive using Sqoop.
  • Extensively used Pig scripts for data cleansing and optimization.
  • Worked on creating and optimizing Hive scripts based on business requirements.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Involved in loading data from UNIX file system to HDFS.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Used JUnit for unit testing of MapReduce jobs.
  • Involved in managing and reviewing Hadoop log files.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Exported result sets from Hive to MySQL using shell scripts.
  • Used ZooKeeper for centralized configuration management.
  • Involved in maintaining various Unix Shell scripts.
  • Implemented partitioning, dynamic partitions and buckets in Hive for efficient data access.
  • Created and managed Hive external tables in Hadoop for Data Warehousing.
  • Developed customized UDFs in Java for extending Pig and Hive functionality.
  • Extracted the needed data from the server into HDFS and Bulk-Loaded the cleaned data into HBase.
  • Integrated the Hive Warehouse with HBase.
  • Configured Oozie to run multiple Hive and Pig jobs independently, triggered by time and data availability.
  • Worked on optimization and performance tuning of MapReduce, Hive & Pig Jobs.

ENVIRONMENT: CDH5, HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Hue, Java, Linux

Confidential, New York, NY 

Hadoop Developer


  • Worked on implementation and maintenance of Cloudera Hadoop Cluster.
  • Monitored workload, job performance and node health using Cloudera Manager.
  • Involved in setting up High-availability Hadoop cluster using Quorum Journal Manager (QJM) and ZooKeeper.
  • Imported data into, and exported data from, HDFS and Hive using Sqoop.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Gained good experience with NoSQL databases.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Proactively monitored systems, services and Hadoop log files; handled architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Assisted in configuration and maintenance of Hadoop ecosystem components such as Pig, Hive and HBase.
  • Imported data from RDBMS (MySQL, Teradata) to HDFS and vice versa using Sqoop (Big Data ETL tool) for Business Intelligence, visualization and report generation.
  • Used Flume to collect and aggregate web log data from different sources and pushed to HDFS.
  • Developed and executed custom MapReduce programs, PigLatin scripts and HiveQL queries.
  • Developed and optimized Pig and Hive UDFs to implement the methods and functionality of Java as required.
  • Analyzed user request patterns and implemented various performance optimization measures including but not limited to implementing partitions and buckets in HiveQL.

ENVIRONMENT: CDH4, HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Flume, ZooKeeper, Nagios, Java, Linux, Shell Scripting

Confidential, New York, NY 

Big Data Consultant / Hadoop Consultant


  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Collaborated with the systems engineering team to plan and deploy hardware and software environments required for new Hadoop environments and to expand existing Hadoop cluster.
  • Worked on the logical implementation of, and interaction with, HBase.
  • Used Fast script, an alternative to Sqoop, to automate the transfer of data from Oracle to HBase.
  • Performed data analysis using Hive and Pig.
  • Loaded log data into HDFS using Flume.
  • Worked with application teams to install Hadoop updates, patches and version upgrades as required.
  • Installed and configured various components of Hadoop Ecosystem and maintained their integrity.
  • Responsible for cluster maintenance, commissioning and decommissioning of DataNodes, cluster monitoring, troubleshooting, managing data backups and disaster recovery systems, and analyzing Hadoop log files.
  • Configured Hive with a shared metastore in MySQL and used Sqoop to migrate data into external Hive tables from different RDBMS sources (Oracle, Teradata and DB2) for data warehousing.
  • Developed Hive queries and custom UDFs to bring data into a structured format.
  • Installed and implemented monitoring tools like Ganglia and Nagios to monitor Hadoop cluster environment.

ENVIRONMENT: Hadoop 1.x, HDFS, MapReduce, Pig, Hive, Sqoop, Nagios, Ganglia, MySQL, Linux, NFS, Shell scripting

Confidential, Whitehouse Station, NJ

Java/J2EE Developer - Consultant


  • The entire application was developed in J2EE using Spring MVC based architecture.
  • Involved in design and interacted with business intelligence team during requirement analysis.
  • Involved in developing Data models and Database schemas.
  • Used Hibernate ORM framework for data persistence layer.
  • Involved in creating the Hibernate POJO Objects and mapped using Hibernate Annotations.
  • Developed user interface using JSP, HTML, XML and CSS to simplify the complexities of the application.
  • Used JMS for asynchronous messaging between different modules.
  • Used WebSphere Application Server to develop and deploy the application.
  • Actively participated in client meetings and gathered input for additional functionality.
  • Actively involved in code review and bug fixing for improving the performance.
  • Involved in coding for JUnit Test cases.

ENVIRONMENT: Java/J2EE, Spring, Hibernate, JSP, Servlets, JMS, HTML, XML, CSS, SQL, PL/SQL, Oracle 11g, WebSphere

Confidential, New York, NY

Java/J2EE Developer - Consultant


  • Involved in various phases of the project's development life cycle and participated in daily Scrum meetings.
  • Participated in JAD meetings to gather the requirements and understand the End-user Systems.
  • Created detailed design documents using UML (Use case and Sequence diagrams).
  • Implemented MVC design pattern using Spring Framework.
  • Implemented the functionality of mapping entities to the database using Hibernate.
  • Used Spring framework to define beans for Entity and corresponding dependent services.
  • Used HTML, CSS, XML, JavaScript and JSP for interactive cross browser functionality and complex user interface.
  • Wrote queries, stored procedures and functions using SQL and PL/SQL in Oracle.
  • Fixed bugs raised by the testing teams in various modules of the application.
  • Re-designed core framework and database access by replacing JDBC and using Spring and Hibernate.
  • Worked on performance tuning at different levels, starting from the database level and moving up to the application level.

ENVIRONMENT: Java/J2EE, Spring, Hibernate, JSP, Servlets, HTML, CSS, JavaScript, XML, PL/SQL, Oracle 10g, WebLogic

Confidential, Wayne, NJ 

Oracle Developer - Consultant


  • Designed, developed and maintained Oracle database schemas, tables, standard views, materialized views, synonyms, indexes, constraints, sequences, cursors and other database objects.
  • Wrote Stored Procedures, Functions and Triggers using PL/SQL to implement business rules and processes.
  • Created and modified existing functions and procedures based on business requirements.
  • Involved in developing ER Diagrams, Physical and Logical Data Models using Microsoft Visio.
  • Wrote complex SQL queries using joins, sub-queries and correlated sub-queries.
  • Used Analytic Functions to simplify and handle complex query problems in SQL.
  • Populated tables from text files using SQL*Loader.
  • Managed privileges on tables and other objects for outside schema users.
  • Implemented Table Partitioning and Sub-Partitioning to improve performance and data management.
  • Worked with business users to gather requirements for developing new reports.
  • Studied future requirements and adapted the database model concepts accordingly.
  • Tuned and optimized complex SQL queries.

ENVIRONMENT: Oracle 10g, Windows, SQL, PL/SQL, SQL Developer, SQL*Loader, Microsoft Visio, Crystal Reports
