
Java/Hadoop Lead (Consultant) Resume

Basking Ridge, NJ


  • Hadoop/Java Developer with 8+ years of overall IT experience in a variety of industries, including hands-on experience in Big Data technologies.
  • 4 years of comprehensive experience in Big Data processing and administration using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, Flume and HBase), Spark and Zipkin.
  • Provided key oversight throughout the development lifecycle, with involvement in requirements gathering, analysis and application review.
  • Experience in installing, configuring and maintaining Hadoop clusters.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Knowledge of administrative tasks such as installing Hadoop (on Ubuntu) and its ecosystem components such as Hive, Pig and Sqoop.
  • Good knowledge of YARN configuration.
  • Passionate about working in Big Data and Analytics environments.
  • Involved in HDFS maintenance and administration through the Hadoop Java API.
  • Worked with the application team via Scrum to provide operational support and install Hadoop updates, patches and version upgrades as required.
  • Hands-on experience with Spark-Scala programming.
  • Expertise in writing Hadoop Jobs for analyzing data using Hive QL (Queries), Pig Latin (Data flow language), and custom MapReduce programs in Java.
  • Wrote Hive queries for data analysis to meet the requirements
  • Created Hive tables to store data into HDFS and processed data using Hive QL
  • Expert in working with the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
  • Good knowledge of creating custom SerDes in Hive.
  • Developed Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT, UNION, SPLIT to extract data from data files to load into HDFS.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Experience developing MapReduce programs on Apache Hadoop to process Big Data.
  • Responsible for continuous monitoring and management of Elastic MapReduce clusters through the AWS console.
  • Experience with ETL (Extract, Transform and Load) tools: Talend Open Studio, Informatica.
  • Evaluated ETL (Talend) and OLAP tools and recommended the most suitable solutions based on business needs.
  • Experience in automating Hadoop installation, configuration and cluster maintenance using tools like Chef.
  • Good knowledge of Linux shell scripting and shell commands, Python and WLST.
  • Expertise in designing Python scripts to interact with middleware/back-end services.
  • Experience in using IDEs like Eclipse, IntelliJ.
  • Working knowledge on Scala.
  • Hands on experience in dealing with Compression Codecs like Snappy, Gzip.
  • Good understanding of Data Mining and Machine Learning techniques
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa
  • Hands on experience in configuring and working with Flume to load the data from multiple sources directly into HDFS
  • In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce concepts
  • Hands-on experience in provisioning and managing Hadoop clusters on public cloud environments and the OpenStack cloud platform.
  • Extensive experience with SQL, PL/SQL and database concepts
  • Knowledge of NoSQL databases such as HBase and Cassandra.
  • Used HBase in conjunction with Pig/Hive as and when required for real-time, low-latency queries.
  • Knowledge of job workflow scheduling and monitoring tools like Oozie (Hive, Pig) and ZooKeeper (HBase).
  • Experience with cloud technologies (VMware, AWS, Google Cloud).
  • Experience in Ruby on Rails and object-oriented programming.
  • Very good experience in exporting and extracting analyzed data and generating various visualizations using the Business Intelligence tool Tableau for better analysis of data.
  • Experience in developing solutions to analyze large data sets efficiently
  • Good understanding of XML methodologies (XML, XSL, XSD) including Web Services like REST, SOAP.
  • Used Apache Camel to integrate different applications. Developed Camel orchestration layers to integrate different components of the application.
  • Strong experience as a senior Java Developer in Web/intranet and Client/Server technologies using Java, J2EE, Servlets, JSP, EJB and JDBC.
  • Strong analytical and problem-solving skills.
  • Good interpersonal skills and ability to work as part of a team. Exceptional ability to learn and master new technologies and to deliver outputs on short deadlines.


Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, MongoDB, Cassandra, Power Pivot, Puppet, Oozie, ZooKeeper, Kafka, Spark

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, Java Beans, Apache Camel

IDEs: Eclipse, IntelliJ, JBoss Developer Studio

Big Data Analytics: Datameer 2.0.5

Frameworks: MVC, Struts, Hibernate, Spring

Programming languages: C, C++, Java, Python, Ant scripts, Linux shell scripts, Scala 2.0

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, REST, SOAP, WSDL

ETL Tools: Informatica, Talend


Confidential, Basking Ridge, NJ

Java/Hadoop Lead (Consultant)


  • Designed and developed web applications for claims processing using Apache Camel, Spring and RESTful web services.
  • Developed and maintained applications (web and desktop), coordinating requirements gathering and analysis, application review, data reporting, business logic, and technical specifications.
  • Helped mentor a total of 8 Java/Hadoop developers (3 junior and 5 senior).
  • Completed code reviews, reviewed demos of end-to-end development solutions, and provided expertise that helped implement a new technology stack.
  • Worked on deployment, installation, configuration and troubleshooting of application servers like JBoss and Apache Tomcat 7.
  • Worked in Agile development (Rally), maintained good communication within the team, and applied a good understanding of Service-Oriented Architecture (SOA).
  • Developed projects and products across the SDLC (software development life cycle): initiation, planning, design, development, execution and implementation.
  • Built responsive UI screens using HTML, CSS, JavaScript, JSP and the Bootstrap framework.
  • Experience in developing Applications with Web Services, WSDL, SOAP.
  • Worked on building web services using Spring and CXF, offering both REST and SOAP interfaces.
  • Implemented Apache Camel routing to integrate with different systems and provide end-to-end communications between the web services and other enterprise services.
  • Performed unit testing using JUnit.
  • Imported and exported data to and from HDFS using Sqoop.
  • Created Pig scripts to perform Extract, Transform and Load (ETL) operations.
  • Involved in creating Hive tables, loading them with data and writing Hive queries.
  • Loaded data into Elasticsearch from the data lake and optimized the full search function using Elasticsearch.
  • Implemented distributed tracing by integrating Zipkin.

Environment: Java/J2EE, JDK 1.7, HTML, CSS, Servlets, JSP, XML, XSLT, WSDL, SOAP, REST, JUnit, SoapUI, JNDI, Apache CXF web services, JBoss, Apache Camel, Zipkin, Hive, Sqoop, Pig, Git.

Confidential, Chicago, IL

Hadoop developer/Lead (Consultant)


  • Used the Sqoop tool to extract data from a relational database into Hadoop.
  • Used the Hue interface for querying data.
  • Analyzed web log data using HiveQL.
  • Installed and configured Hive and wrote Hive UDFs.
  • Used HiveQL to write efficient joins on Hive tables without degrading performance.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Worked on a Java program for connectivity to HBase.
  • Used Oozie workflow to automate all jobs for pulling the data from HDFS and loading data into Hive tables.
  • Involved in improving processing speed using Apache Spark.
  • Developed Spark code using Scala and Spark SQL for batch processing of data.
  • Involved in parsing JSON data into a structured format and loading it into HDFS/Hive using Spark Streaming.
  • Involved in running Hadoop streaming jobs to generate terabytes of XML-format data.
  • Used ZooKeeper to coordinate cluster services.
  • Involved in building applications using Maven and integrating with CI servers like Jenkins to run build jobs.
  • Used Git to check in and check out code changes.
  • Maintained, audited and built new clusters for testing purposes using Cloudera Manager.

Environment: Hadoop, HDFS, Spark, Hive, HBase, Sqoop, Cloudera CDH5, Oozie, IntelliJ, ZooKeeper, Jenkins, Chef, Git, Talend.

Confidential, Framingham, MA

Hadoop developer/Lead (Consultant)


  • Installed and configured a fully distributed Hadoop cluster.
  • Installed Hadoop ecosystem components (Pig, Hive and HBase).
  • Involved in Hadoop cluster environment administration, including cluster capacity planning, performance tuning, cluster monitoring and troubleshooting.
  • Created and configured a Hadoop cluster in Cloudera.
  • Installed Hadoop, MapReduce, HDFS, AWS and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Consulted on the Hadoop ecosystem: Hadoop, administration, MapReduce, HBase, Sqoop, Amazon Elastic MapReduce (EMR).
  • Coordinated and managed relations with vendors, IT developers and end users.
  • Managed work streams and processes and coordinated team members and their activities to ensure that technology solutions were in line with the overall vision and goals.
  • Analyzed web log data using HiveQL.
  • Developed custom aggregate functions using Spark SQL and performed interactive querying.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, the HBase NoSQL database and Sqoop.
  • Integrated Cassandra Query Language (CQL) for Apache Cassandra.
  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Developed workflows using custom MapReduce, Pig, Hive, Sqoop.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Wrote Apache Pig scripts to process HDFS data and send it to HBase.
  • Used Kafka to load data into HDFS and move data into NoSQL databases.
  • Configured connectivity to various databases (Oracle 11g, SQL Server 2005).
  • Ran Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Troubleshot build issues during the Jenkins build process.
  • Implemented Docker to create containers for Tomcat servers and Jenkins.
  • Worked on and maintained a Ruby on Rails application on the Linux platform with MySQL as the database.
  • Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud. Exported and imported data to and from S3.
  • Created MapReduce jobs using Hive/Pig Queries.
  • Supported MapReduce programs running on the cluster.
  • Generated various marketing reports using Tableau with Hadoop as the data source.
  • Involved in troubleshooting and performance tuning of reports and resolving issues within Tableau Server and reports.
  • Experience in providing security for the Hadoop cluster with Kerberos.
  • Provided cluster coordination services through ZooKeeper.
  • Installed and configured Hive and wrote Hive UDFs.
  • Automated all jobs for pulling data from the FTP server and loading it into Hive tables using Oozie workflows.

Environment: Cassandra, MapReduce, HDFS, Hive, Flume, Cloudera Manager, Sqoop, MySQL, UNIX Shell Scripting, ZooKeeper, Tableau, Git, Spark, Kafka, Elasticsearch, Docker, Ruby on Rails.

Confidential, Chicago, IL

Hadoop Developer/Admin (Consultant)


  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Developed Pig Latin scripts in areas where extensive coding needed to be reduced to analyze large data sets.
  • Installed and configured Hive and implemented various business requirements by writing Hive UDFs.
  • Used the Sqoop tool to extract data from a relational database into Hadoop.
  • Experience in pulling data from the Amazon S3 cloud to HDFS.
  • Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
  • Responsible for performing peer code reviews, troubleshooting issues and maintaining status reports.
  • Involved in creating Hive Tables, loading with data and writing Hive queries, which will invoke and run MapReduce jobs in the backend.
  • Installed and configured Hadoop cluster in DEV, QA and Production environments.
  • Strongly recommended bringing in Elasticsearch and was responsible for installing, configuring and administering it.
  • Performed upgrades to existing Hadoop clusters.
  • Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
  • Implemented commissioning and decommissioning of nodes in the existing cluster.
  • Worked with the systems engineering team to plan new Hadoop environment deployments and the expansion of existing Hadoop clusters.
  • Responsible for data ingestion using Talend.
  • Worked on integration of Big Data and cloud platforms using Talend.
  • Designed and presented a plan for a POC on Impala.
  • Migrated HiveQL queries to Impala to minimize query response time.
  • Monitored workload and job performance and performed capacity planning using Cloudera Manager.
  • Worked with application teams to install OS-level updates, patches and version upgrades required for Hadoop cluster environments.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.

Environment: Hadoop, HDFS, MapReduce, Hive, Flume, Sqoop, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2 cloud, Eclipse, Talend.

Confidential, Buffalo, NY

Hadoop/Java Developer/Admin (Consultant)


  • Worked on analyzing and writing Hadoop MapReduce jobs using the Java API, Pig and Hive.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode high availability, capacity planning, and slots configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Formulated ETL processes (using Talend) to extract data.
  • Developed ETL mappings and workflows using Talend for source systems' data extraction and data transformations.
  • Experience in managing and reviewing Hadoop log files.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Worked on Python scripts to analyze the data.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.

Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Python, Ubuntu, Linux Red Hat, Talend.


Java Developer


  • Involved in Design, Development and Support phases of Software Development Life Cycle (SDLC)
  • Reviewed the functional, design, source code and test specifications
  • Developed the complete front end using JavaScript and CSS
  • Authored functional, design and test specifications
  • Implemented Backend, Configuration DAO, XML generation modules of DIS
  • Analyzed, designed and developed the component
  • Used JDBC for database access
  • Used Spring Framework for developing the application and used JDBC to map to Oracle database.
  • Used the Data Transfer Object (DTO) design pattern
  • Performed unit testing and rigorous integration testing of the whole application
  • Wrote and executed test scripts using JUnit
  • Actively involved in system testing
  • Developed an XML parsing tool for regression testing
  • Prepared the installation guide, customer guide and configuration document, which were delivered to the customer along with the product

Environment: Java, JavaScript, HTML, CSS, JDK 1.5.1, JDBC, Oracle 10g, XML, XSL, Solaris and UML


Java Developer


  • Involved in analysis, design and coding in a J2EE environment.
  • Implemented MVC architecture using Struts, JSP, and EJBs.
  • Worked on Hibernate object/relational mapping according to the database schema.
  • Designed and programmed the presentation layer using HTML, XML, XSL, JSP, JSTL and Ajax.
  • Designed, developed and implemented the business logic required for the Security presentation controller.
  • Used JSP and Servlet coding in a J2EE environment.
  • Designed XML files to implement most of the wiring needed for Hibernate annotations and Struts configurations.
  • Responsible for developing the forms containing employee details, and for generating the reports and bills.
  • Involved in designing class and dataflow diagrams using UML in Rational Rose.
  • Used CVS for maintaining the source code. Designed, developed and deployed on Apache Tomcat Server.
  • Created and modified Stored Procedures, Functions, Triggers and Complex SQL Commands using PL/SQL.
  • Involved in the Design of ERD (Entity Relationship Diagrams) for Relational database.
  • Developed Shell scripts in UNIX and procedures using SQL and PL/SQL to process the data from the input file and load into the database.
  • Used Core Java concepts in the application, such as multithreaded programming and thread synchronization using the wait, notify and join methods.
  • Created cross-browser compatible and standards-compliant CSS-based page layouts.
  • Involved in maintaining database records of patient visits along with the prescriptions they were issued.
  • Performed Unit Testing on the applications that are developed.

Environment: Unix (Shell Scripts), Eclipse, Java (JDK 1.6), J2EE, JSP 1.0, Servlets, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose 2000, WebLogic Server, Apache Ivy, JUnit, SQL, PL/SQL, CSS, HTML, XML
