Sr. Hadoop Developer Resume

Edison, NJ


  • IT experience with 2+ years in all phases of Hadoop and HDFS development, along with 4+ years in analysis, design, development, testing and deployment of various software applications, with an emphasis on object-oriented programming.
  • Excellent understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, and Flume.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
  • Knowledge of NoSQL databases such as Cassandra and HBase.
  • Involved in start to end process of Hadoop cluster installation, configuration and monitoring.
  • Responsible for building scalable distributed data solutions using Hadoop
  • Installed and configured Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Analyzed the features and setup of the Cassandra NoSQL database with various clients by creating and implementing prototypes.
  • Experience in designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
  • Used Cassandra to store the analyzed and processed data for scalability.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Strong experience as Applications Developer in 3-tier architecture, client/server technologies using VB, C#, Java Apps and SQL.
  • Experience in database design, modeling, maintenance and administration using Oracle 10g, PostgreSQL, MySQL, DB2, HSQLDB, Derby and MS SQL Server 2005.
  • Proficiency in programming with Ruby and JavaScript, and in database products such as PostgreSQL and Cassandra.
  • Experienced in Database development, ETL and Reporting tools using SQL Server DTS, SQL SSIS, SSRS, Crystal XI & SAP BO.
  • Used Spring Data for the NoSQL stores HBase and Cassandra, and Hibernate/JPA for MySQL and Derby. Designed and developed the DB schema.
  • Spun up and configured NoSQL database services.
  • Experience in Agile engineering practices.
  • Knowledge of reading data from and writing data to Cassandra.
  • Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
  • Excellent interpersonal and communication skills.


Sr. Hadoop Developer

Confidential - Edison, NJ


  • Responsible for Installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Implemented basic CRUD functionality within the Rails MVC architecture using the Rails ActiveRecord pattern and NoSQL databases such as Redis and MongoDB.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Designed, modeled and deployed a MySQL database schema and a MongoDB database. Implemented caching and single sign-on. All implementation was on AWS cloud services.
  • Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Developed Spark scripts by using Scala Shell commands as per the requirement.
  • Developed and implemented core API services using Scala and Spark.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Worked with RESTful JSON APIs for mobile clients and with database products such as PostgreSQL and Cassandra.
  • Worked on data serialization formats for converting complex objects into sequences of bits using Avro, Parquet, JSON, and CSV.
  • Developed various PL/SQL scripts that, for example, gathered futures and options data traffic statistics or listed the number of orders placed by each firm.
  • Installed, upgraded and managed Hadoop clusters.
  • Administered, installed, upgraded and managed distributions of Hadoop, Hive, and HBase.
  • Advanced knowledge in performance troubleshooting and tuning Hadoop clusters.
  • Created Hive tables, loaded data and wrote Hive queries that run internally as MapReduce jobs.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Used Oozie operational services for batch processing and scheduling workflows dynamically.
  • Worked extensively on creating end-to-end data pipeline orchestration using Oozie.
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
  • Processed the source data into structured data and stored it in the NoSQL database Cassandra.
  • Worked with NoSQL databases (Redis, MongoDB) and Amazon Web Services; automated testing experience using tools such as Selenium.
  • Designed HBase table schemas to load large sets of structured, semi-structured and unstructured data coming from SAP, AS/400, UNIX, NoSQL and a variety of portfolios.
  • Identified a bottleneck in the database architecture and recommended a SQL federated architecture to handle a large volume of transactions and process payments much more quickly without failure.
  • Evaluated the suitability of Hadoop and its ecosystem for the project, implementing and validating various proof of concept (POC) applications with a view to adopting them as part of the Big Data Hadoop initiative.

Environment: MapReduce, HDFS, Hive, Pig, HBase, SQL, Sqoop, Flume, Oozie, Apache Kafka, ZooKeeper, J2EE, Eclipse, Cassandra.

Sr. Hadoop Developer



  • Extracted files from MySQL, Oracle, and Teradata through Sqoop and placed them in HDFS for processing.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Wrote map-reduce views to extract data from NoSQL database CouchDB.
  • Worked on connecting to a 5-node Cassandra cluster from Java using the DataStax Java Driver and developed a web application used for searching.
  • Assisted in exporting analyzed data to relational databases using Sqoop
  • Configured the Hadoop cluster in Local (Standalone), Pseudo-Distributed, and Fully-Distributed modes using Apache and Cloudera distributions and Cloudera Manager.
  • Integrated proprietary software with social media, Google Maps APIs and NoSQL
  • Worked with the Data Science team to gather requirements for various data mining projects
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries.
  • Migrated ETL processes from Oracle and MSQL to Hive to test ease of data manipulation.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Conducted a POC for Hadoop and Cassandra as part of the Nextgen platform implementation. This included connecting to the Hadoop cluster and Cassandra ring and executing sample programs on servers.
  • Involved in running Hadoop jobs for processing millions of records of text data
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
  • Configured ZooKeeper, Cassandra and Flume on the existing Hadoop cluster.
  • Redeployed the Cassandra database to cut costs and upgraded it from 1.25 to 2.0. The Cassandra NoSQL database currently runs on a ring of 24 nodes in 2 different datacenters.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
  • Worked on evaluation of MongoDB and other NoSQL databases for text-mining from user-forums on website created for cancer-patients
  • Performance tuning and stress-testing of NoSQL database environments in order to ensure acceptable database performance in production mode.
  • Testing, evaluation and troubleshooting of different NoSQL database systems and cluster configurations to ensure high-availability in various crash scenarios.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Developed shell scripts to automate routine tasks.
  • Used Oozie and ZooKeeper operational services for coordinating the cluster and scheduling workflows.

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Flume, ETL tools, Linux, and Big Data

Java Developer

Confidential - Dallas, TX


  • Involved in developing code for obtaining bean references in the Spring Framework using Dependency Injection (DI), also known as Inversion of Control (IoC).
  • Wrote a sample Hive-based application that loads data from Cassandra into Hive with Snappy compression.
  • Used EasyMock objects extensively in JUnit and provided test coverage for the classes and methods.
  • Worked with Java, J2EE, SQL, Hibernate, JavaScript, web servers and Spring.
  • Wrote a module for ingest and backup of raw aerial imagery in a storage cluster with NoSQL metadata storage.
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
  • Processed the source data into structured data and stored it in the NoSQL database Cassandra.
  • Created alter, insert and delete queries involving lists, sets and maps in DataStax Cassandra.
  • Designed and developed a Java API (Commerce API) that provides functionality to connect to Cassandra through Java services.
  • Wrote functional test cases for the user stories.
  • Configured and installed Apache Hadoop, Hive and HBase.
  • Used the agile Scrum methodology extensively during development of the project and followed the software development in sprints by attending daily stand-up meetings and giving status updates.
  • Involved in building the user screens for the application using Groovy Server Pages (GSPs).
  • Responsible for creating and coding Hadoop map and reduce jobs for processing and filtering data.
  • Developed User Interface using JavaScript and HTML
  • Developed different Groovy pages as a communication channel between the client and service layers.
  • Used ZooKeeper for providing services that enable synchronization across a cluster for big data
  • Implemented persistence layer using Hibernate and writing SQL queries based on Hibernate criteria API.
  • Have strong programming skills in Core Java and Multi-Threaded applications
  • Participated in code Submissions, Code Reviews, updating design documents and troubleshooting.
  • Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
  • Good understanding and related experience with Hadoop stack - internals, Hive, Pig and MapReduce.
  • Configured ZooKeeper, Cassandra and Flume on the existing Hadoop cluster.
  • Hands-on experience with Hadoop applications (such as administration, configuration management, monitoring, debugging, and performance tuning)
  • Created and implemented Struts.xml file for various configurations.
  • Loaded unstructured data into the Hadoop Distributed File System (HDFS) using objects such as Configuration, FileSystem, and FSDataOutputStream.
  • Used Log4J for logging using different logger levels like info, debug and error.
  • Used Spring IoC for creating the beans to be injected at run time.
  • Used iText for reading, formatting and creating PDF documents.
  • Hands-on experience with NoSQL databases such as HBase and Cassandra for a proof of concept (POC) storing URLs, images, and product and supplement information in real time.
  • Involved in all the phases of SDLC including requirements & analysis, design, release and maintenance of customer specifications, development and customization of the application and applied software engineering principles.
  • Developed User Interface using JavaScript, JSP and HTML.
  • Experience in web scripting technologies such as JavaScript and HTML.
  • Developed a portlet-style user experience using Ajax, jQuery, Grails and Groovy.
  • Involved in running Hadoop jobs for processing millions of records of text data
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
  • Worked with Java, J2EE, SQL, Hibernate, XML, JavaScript, and web servers.
  • Used Spring Tool Suite (STS) extensively as the IDE for development.
  • Used jQuery extensively for client-side JavaScript methods.
  • Fixed bugs in the development, QA and production phases.

Environment: Unix, JDK 1.7, Spring 2.5, JUnit 4.5, SVN, Hibernate 3.3, ZooKeeper, Cloudera Manager Standard 4.1.2, Hadoop, MapReduce, HBase, HDFS, HTML, JSP, CSS, Grails 2.2.3, Groovy 2.1.6, Sqoop, Oracle 11g, AJAX, Apache Tomcat Server 7, iText, Spring Tool Suite (STS) 4.3.1, JavaScript, Jira, FishEye, Log4J
