
Senior Hadoop Developer Resume

New York, NY


  • 7.5+ years of professional experience in the IT industry, with 3.5+ years of hands-on expertise in Big Data processing using Hadoop, including Hadoop Ecosystem implementation, maintenance, ETL and Big Data analysis operations.
  • Excellent understanding of Hadoop architecture and underlying frameworks including distributed storage (HDFS), resource management (YARN) and MapReduce programming paradigm.
  • Highly experienced in using various Hadoop Ecosystem projects such as MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, Flume and Oozie for ingestion, storage, querying, processing and analysis of Big Data.
  • Hands-on expertise in writing MapReduce jobs using Java.
  • Knowledge of the architecture and functionality of NoSQL databases such as HBase, Cassandra and MongoDB.
  • Experience in managing and monitoring Hadoop clusters and services using Cloudera Manager.
  • Collected log data from various sources and integrated it into HDFS using Flume.
  • Knowledge of project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Experience in analyzing data using Pig and Hive, including developing custom User Defined Functions (UDFs) to incorporate Java methods and functionality into Pig Latin and HiveQL.
  • Experience in using ZooKeeper distributed coordination service for High-Availability.
  • Good knowledge of using Oozie Workflow Scheduler system for MapReduce jobs.
  • Experience in importing and exporting data between HDFS and RDBMS using Sqoop.
  • Extensive experience in Hadoop Administration activities such as installation, configuration, maintenance and deployment of Big Data components and the underlying infrastructure of Hadoop cluster.
  • Experience in Hadoop cluster planning, monitoring and troubleshooting.
  • Hands-on expertise on multi node cluster setup using Apache Hadoop and Cloudera Hadoop distributions.
  • Experience in optimizing MapReduce, Pig and Hive job configurations for better performance.
  • Red Hat certified System Administrator on Red Hat Enterprise Linux 7.
  • Good working knowledge of Puppet for configuration management.
  • Experience in working with Java/J2EE environments as a developer.
  • Well versed in Object Oriented Programming and Software Development Life Cycle.
  • In-depth understanding of Data Structures, Algorithms and Optimization.
  • Extensive experience with Databases such as Oracle 10g/11g, Microsoft SQL Server, MySQL.
  • Experience in writing complex SQL queries, PL/SQL Procedures, Functions and Triggers.


Hadoop and Big Data Technologies: Apache Hadoop, Cloudera CDH4/CDH5, HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, HBase, Cloudera Manager, Hue

Databases: Oracle 10g/11g, Microsoft SQL Server, MySQL, PostgreSQL, MS Access

Programming Languages: C, C++, Java

Scripting & Query Languages: Bash Shell Scripting, SQL, PL/SQL

Java/Web Technologies: J2EE, Spring, Hibernate, JSP, Servlets, JMS, HTML, CSS, XML, JavaScript, AJAX

Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows

Linux Servers: DNS, DHCP, FTP, NFS, NIS, LDAP, SAMBA, SQUID, NTP, Apache Webserver

Storage Management: Disk Partitioning, SWAP Partitioning, iSCSI Storage, LVM, Disk Quota, RAID

Security: Firewalld, IPTables, Kerberos, SSL/TLS, SELinux, SSH, TCP Wrappers

Virtualizations and Cloud: VirtualBox, VMware Workstation, Amazon EC2

Networking: TCP/IP, Ethernet Networking, IP Addressing, IP Subnetting

Monitoring/Troubleshooting: Nagios Core, Ganglia, WhatsUp Gold, top, netstat, IPTraf, tcpdump, nmap, ps

Configuration Management: Puppet

Development Tools: Eclipse, NetBeans, Oracle SQL Developer, SQL*Plus Instant Client

Other Tools: Microsoft Visio, Crystal Reports, PuTTY


Confidential, New York, NY

Senior Hadoop Developer


  • Worked in a team that built big data analytic solutions using Cloudera Hadoop Distribution.
  • Migrated RDBMS (Oracle, MySQL) data into HDFS and Hive using Sqoop.
  • Extensively used Pig scripts for data cleansing and optimization.
  • Worked on creating and optimizing Hive scripts based on business requirements.
  • Installed and configured Hadoop MapReduce, HDFS, developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Involved in loading data from UNIX file system to HDFS.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Used JUnit for unit testing of MapReduce jobs.
  • Involved in managing and reviewing Hadoop log files.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Exported result sets from Hive to MySQL using shell scripts.
  • Used ZooKeeper for various types of centralized configuration.
  • Involved in maintaining various Unix Shell scripts.
  • Implemented partitioning, dynamic partitions and buckets in Hive for efficient data access.
  • Created and managed Hive external tables in Hadoop for Data Warehousing.
  • Developed customized UDFs in Java for extending Pig and Hive functionality.
  • Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase.
  • Integrated the Hive Warehouse with HBase.
  • Configured Oozie to run multiple Hive and Pig jobs independently, triggered by time and data availability.
  • Worked on optimization and performance tuning of MapReduce, Hive & Pig Jobs.
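The Hive partitioning and bucketing work described above can be illustrated with a short sketch; the table, column and path names below are hypothetical, not taken from the project, and a live Hive installation is assumed:

```sql
-- External table partitioned by date and bucketed by user_id (illustrative)
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  user_id STRING,
  url     STRING,
  bytes   BIGINT
)
PARTITIONED BY (log_date STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS
STORED AS ORC
LOCATION '/data/warehouse/web_logs';

-- Dynamic partition insert: the partition column comes last in the SELECT
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE web_logs PARTITION (log_date)
SELECT user_id, url, bytes, log_date FROM staging_logs;
```

Partition pruning on log_date and bucket-wise joins on user_id are the usual payoff of a layout like this.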

ENVIRONMENT: CDH5, HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Hue, Java, Linux

Confidential - New York, NY

Hadoop Developer


  • Worked on implementation and maintenance of Cloudera Hadoop Cluster.
  • Monitored workload, job performance and node health using Cloudera Manager.
  • Involved in setting up High-availability Hadoop cluster using Quorum Journal Manager (QJM) and ZooKeeper.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Gained hands-on experience with NoSQL databases.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Proactively monitored systems and services; handled architecture design and implementation of Hadoop deployments, configuration management, backups, disaster recovery systems and procedures, and review of Hadoop log files.
  • Assisted in configuration and maintenance of Hadoop ecosystem components such as Pig, Hive and HBase.
  • Imported data from RDBMS (MySQL, Teradata) to HDFS and vice versa using Sqoop (Big Data ETL tool) for Business Intelligence, visualization and report generation.
  • Used Flume to collect and aggregate web log data from different sources and pushed to HDFS.
  • Developed and executed custom MapReduce programs, PigLatin scripts and HiveQL queries.
  • Developed and optimized Pig and Hive UDFs to implement the methods and functionality of Java as required.
  • Analyzed user request patterns and implemented various performance optimization measures including but not limited to implementing partitions and buckets in HiveQL.
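An RDBMS-to-Hive import of the kind described above might look roughly like this; the connection string, credentials, table and database names are placeholders, and the command requires a configured Sqoop/Hadoop cluster to run:

```
# Illustrative Sqoop import from MySQL into a Hive table (hypothetical names)
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table orders \
  --split-by order_id \
  --num-mappers 4 \
  --hive-import \
  --hive-table analytics.orders
```

--split-by picks the column used to divide the table among the parallel mappers; the matching export direction uses sqoop export with --export-dir.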

ENVIRONMENT: CDH4, HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Flume, ZooKeeper, Nagios, Java, Linux, Shell Scripting

Confidential - New York, NY

Big Data Consultant / Hadoop Consultant


  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Collaborated with the systems engineering team to plan and deploy hardware and software environments required for new Hadoop environments and to expand existing Hadoop cluster.
  • Worked on the logical implementation of, and application interaction with, HBase.
  • Used a fast scripted alternative to Sqoop to automate the transfer of data from Oracle to HBase.
  • Performed data analysis using Hive and Pig.
  • Loaded log data into HDFS using Flume.
  • Worked with application teams to install Hadoop updates, patches and version upgrades as required.
  • Installed and configured various components of Hadoop Ecosystem and maintained their integrity.
  • Responsible for cluster maintenance, commissioning and decommissioning of DataNodes, cluster monitoring, troubleshooting, managing of data backups and disaster recovery systems, analyzing Hadoop log files.
  • Configured Hive with a shared metastore in MySQL and used Sqoop to migrate data into external Hive tables from different RDBMS sources (Oracle, Teradata and DB2) for data warehousing.
  • Developed Hive queries and custom UDFs to bring data into a structured format.
  • Installed and implemented monitoring tools like Ganglia and Nagios to monitor Hadoop cluster environment.
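A shared MySQL-backed Hive metastore of the kind mentioned above is typically wired up in hive-site.xml roughly as follows; the host, database name and username here are placeholders:

```xml
<!-- Illustrative hive-site.xml fragment: shared Hive metastore in MySQL -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-host:3306/hive_metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
```

Pointing every Hive client (and HiveServer) at the same ConnectionURL is what lets multiple users share one set of table definitions.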

ENVIRONMENT: Hadoop 1.x, HDFS, MapReduce, Pig, Hive, Sqoop, Nagios, Ganglia, MySQL, Linux, NFS, Shell scripting

Confidential - Whitehouse Station, NJ

Java/J2EE Developer - Consultant


  • Developed the entire application in J2EE using a Spring MVC-based architecture.
  • Involved in design and interacted with business intelligence team during requirement analysis.
  • Involved in developing Data models and Database schemas.
  • Used Hibernate ORM framework for data persistence layer.
  • Involved in creating the Hibernate POJO Objects and mapped using Hibernate Annotations.
  • Developed user interface using JSP, HTML, XML and CSS to simplify the complexities of the application.
  • Used JMS for asynchronous messaging between different modules.
  • Used WebSphere Application Server to develop and deploy the application.
  • Actively participated in client meetings and took inputs for additional functionality.
  • Actively involved in code reviews and bug fixing to improve performance.
  • Wrote JUnit test cases.

ENVIRONMENT: Java/J2EE, Spring, Hibernate, JSP, Servlets, JMS, HTML, XML, CSS, SQL, PL/SQL, Oracle 11g, WebSphere

Confidential - New York, NY

Java/J2EE Developer - Consultant


  • Involved in various phases of development life cycle of the project and participated in daily SCRUM meetings.
  • Participated in JAD meetings to gather the requirements and understand the End-user Systems.
  • Created detailed design documents using UML (Use case and Sequence diagrams).
  • Implemented MVC design pattern using Spring Framework.
  • Implemented the functionality of mapping entities to the database using Hibernate.
  • Used Spring framework to define beans for Entity and corresponding dependent services.
  • Used HTML, CSS, XML, JavaScript and JSP for interactive cross browser functionality and complex user interface.
  • Wrote queries, stored procedures and functions using SQL and PL/SQL in Oracle.
  • Involved in Bug fixing of various modules that were raised by the testing teams in the application.
  • Re-designed core framework and database access by replacing JDBC and using Spring and Hibernate.
  • Worked on performance tuning at different levels, from the database level up to the application level.

ENVIRONMENT: Java/J2EE, Spring, Hibernate, JSP, Servlets, HTML, CSS, JavaScript, XML, PL/SQL, Oracle 10g, WebLogic

Confidential, Wayne, NJ

Oracle Developer - Consultant


  • Designed, developed and maintained Oracle database schemas, tables, standard views, materialized views, synonyms, indexes, constraints, sequences, cursors and other database objects.
  • Wrote Stored Procedures, Functions and Triggers using PL/SQL to implement business rules and processes.
  • Created and modified existing functions and procedures based on business requirements.
  • Involved in developing ER Diagrams, Physical and Logical Data Models using Microsoft Visio.
  • Wrote complex SQL using joins, sub-queries and correlated sub-queries.
  • Used Analytic Functions to simplify and handle complex query problems in SQL.
  • Populated tables from text files using SQL*Loader.
  • Managed privileges on tables and other objects for outside schema users.
  • Implemented Table Partitioning and Sub-Partitioning to improve performance and data management.
  • Worked with business users to gather requirements for developing new reports.
  • Anticipated future requirements and adapted the database model accordingly.
  • Tuned and optimized complex SQL queries.

ENVIRONMENT: Oracle 10g, Windows, SQL, PL/SQL, SQL Developer, SQL*Loader, Microsoft Visio, Crystal Reports
