
Hadoop Consultant Resume

Richardson, TX


  • Over 7 years of professional IT experience, including 3+ years with Hadoop and related Big Data ecosystem technologies.
  • Excellent understanding / knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, NameNode, Data Node, Resource Manager, Node Manager and MapReduce programming paradigm.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Zookeeper, Flume and Kafka.
  • Experience on Apache Hadoop technologies Hadoop distributed file system (HDFS), MapReduce framework, YARN, Pig, Hive, HCatalog, Sqoop, Flume and Kafka.
  • Strong command of Hive and Pig core functionality, extended by writing custom UDFs.
  • Led many Data Analysis & Integration efforts involving HADOOP along with ETL.
  • Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS. Good knowledge of Hadoop cluster administration, including monitoring and managing Hadoop clusters using Cloudera Manager.
  • In-depth understanding of Data Structures and Algorithms.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in NoSQL database HBase.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom Map Reduce programs in Java.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML methodology; good knowledge of J2EE design patterns and Core Java design patterns.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie and Zookeeper.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Experience in Java, JSP, Servlets, WebLogic, WebSphere, JDBC, XML, and HTML.
  • Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.


Programming/Scripting Languages: Java, C, J2EE, Unix Shell / Python Scripts

Web/XML Technologies: HTML, CSS, JavaScript, AJAX, Servlets, JSP, XML, XSLT, JAXB 2.0

Hadoop-Big Data: Apache Hadoop, Map Reduce, PIG, HDFS, Hive, Sqoop, Oozie, Zookeeper, Flume, Kafka

NoSQL Database: HBase, DynamoDB

RDBMS: Oracle 9i, MS SQL Server

Development / Build Tools: Eclipse, Ant, Maven

Operating Systems: Windows, Linux, Unix


Hadoop Consultant

Confidential, Richardson, TX


  • Designed and developed Hadoop system to analyze the SIEM (Security Information and Event Management) data using MapReduce, HBase, Hive, Sqoop and Flume.
  • Migrated data from SQL Server to HBase using Sqoop.
  • Developed custom Writable MapReduce Java programs to load web server logs into HBase using Flume.
  • Processed and analyzed the log data stored in HBase, then imported it into the Hive warehouse so that business analysts could write HQL queries against it.
  • Built reusable Hive UDF libraries that business analysts could use in their Hive queries.
  • Developed various workflows using custom MapReduce, Pig, Hive and scheduled them using Oozie.
  • Generated reports with Pentaho for consumption by business analysts.
  • Extensive knowledge in troubleshooting code related issues.
  • Configured various big data workflows to run on top of Hadoop; these workflows comprise heterogeneous jobs such as Pig, Hive, Sqoop and MapReduce.
  • Developed a suite of unit test cases for Mapper, Reducer and Driver classes using an MR testing library.
  • Integrated Kafka with Flume in a sandbox environment using the Kafka source and Kafka sink.
  • Configured a Flume agent with the Flume syslog source to receive data from syslog servers.
  • Auto-populated HBase tables with data coming from the Kafka sink.
  • Designed and coded application components in an agile environment utilizing test driven development approach.

Environment: MapReduce, YARN 2.0, HBase, Hive, Java, SQL, Pig, Sqoop, Oozie, Flume, Pentaho.

Hadoop Admin/ Developer

Confidential, Englewood, CO


  • Implemented a 100-node CDH4 Hadoop cluster on Red Hat Linux using Cloudera Manager.
  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Developed Simple to complex Map/Reduce Jobs using Hive and Pig.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and migrated data from MySQL to HDFS using Sqoop.
  • Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
  • Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Set up Amazon Web Services (AWS) to evaluate whether Hadoop was a feasible solution.
  • Set up a Hadoop cluster on Amazon EC2 using Elastic MapReduce (EMR), a managed Hadoop framework.
  • Used Maven extensively to build MapReduce JAR files and deployed them to Amazon Web Services (AWS) on EC2 virtual servers in the cloud.
  • Used S3 buckets to store the JARs and input datasets, and used DynamoDB to store the processed output from the input datasets.

Environment: CDH4, Cloudera Manager, MapReduce, HDFS, Hive, Pig, HBase, Flume, MySQL, Sqoop, Oozie, AWS.

Hadoop Administrator

Confidential, San Jose, CA


  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Installed and configured Hadoop, MapReduce, HDFS (Hadoop Distributed File System), developed multiple MapReduce jobs for data cleaning.
  • Implemented NameNode backup using NFS to support high availability.
  • Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
  • Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Used Sqoop to import and export data between HDFS and RDBMS.
  • Created Hive tables, loaded data into them, and wrote Hive UDFs.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
  • Involved in migrating ETL processes from Oracle to Hive to evaluate the ease of data manipulation.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Wrote shell scripts to automate routine day-to-day processes.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux

Java Developer

Confidential, Buffalo, NY


  • Designed and developed various modules of the application with J2EE design architecture and frameworks like Spring MVC architecture and Spring Bean Factory using IOC, AOP concept.
  • Followed agile software development with Scrum methodology.
  • Wrote the application front end with HTML, JSP, JSF, Ajax/jQuery, Spring Web Flow and XHTML.
  • Used jQuery for UI-centric Ajax behavior.
  • Implemented JAVA/J2EE design patterns such as Factory, DAO, Session Façade and Singleton.
  • Used Hibernate in the persistence layer and developed POJOs and Data Access Objects (DAOs) to handle all database operations.
  • Implemented features like logging, user session validation using Spring-AOP module.
  • Developed server-side services using Java, Spring, and Web Services (SOAP, WSDL, JAXB, JAX-RPC).
  • Worked on Oracle as the backend database.
  • Used JMS for messaging.
  • Used Log4j to assign, track, report and audit the issues in the application.
  • Developed and executed unit test plans using JUnit, ensuring that results were documented and reviewed with the Quality Assurance teams responsible for integrated testing.
  • Worked in a deadline-driven environment with immediate feature release cycles.

Environment: Java, J2EE, JSP, Servlets, Hibernate, Spring, Web Services, SOAP, WSDL, UML, HTML, XHTML, DHTML, JavaScript, jQuery, CSS, XML, JBoss, Log4j, Oracle, JUnit, Eclipse.

Java Developer



  • Worked in a deadline-driven environment with immediate feature release cycles.
  • Involved in Analysis, Design, Coding and Development of custom Interfaces.
  • Involved in the feasibility study of the project.
  • Gathered requirements from the client for designing the Web Pages.
  • Participated in designing the user interface for the application using HTML, DHTML, and Java Server Pages (JSP).
  • Involved in writing client-side scripts using JavaScript and server-side scripts using Java Beans, and used Servlets for handling the business logic.
  • Developed the Form Beans and Data Access Layer classes.
  • XML was used to transfer the data between different layers.
  • Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
  • Worked on database interaction layer for insertions, updating and retrieval operations on data.
  • Deployed EJB Components on WebLogic.
  • Involved in deploying the application in test environment using Tomcat.
  • Identified System Requirements and Developed System Specifications, responsible for high-level design and development of use cases.
  • Involved in designing Database Connections using JDBC.
  • Involved in design and Development of UI using HTML, JavaScript and CSS.
  • Developed, coded, tested, debugged and deployed JSPs and Servlets for the input and output forms on the web browsers.
  • Created Java Beans accessed from JSPs to transfer data across tiers.
  • Modified the database using SQL, PL/SQL, stored procedures, triggers and views in Oracle 9i.
  • Experience in going through bug queue, analyzing and fixing bugs, escalation of bugs.
  • Involved in Significant customer interaction resulting in stronger Customer Relationships.
  • Responsible for working with other developers across the globe on implementation of common solutions.
  • Involved in Unit Testing.

Environment: Java, JSP, Servlets, EJB, Java Beans, JavaScript, JDBC, WebLogic Server, Oracle, HTML, DHTML, XML, CSS, Eclipse, JDK 1.6, CVS, Tomcat Web Server, Windows.
