
Big Data/Hadoop Developer Resume

Arlington, VA


  • Over 8 years of experience as a Sr. Big Data/Hadoop Developer, designing and developing applications with Big Data, Hadoop and Java/J2EE open-source technologies.
  • Hands-on experience with advanced Big Data technologies such as the Spark ecosystem (Spark SQL, MLlib, SparkR and Spark Streaming), Kafka and predictive analytics.
  • Knowledge of the Software Development Life Cycle (SDLC) and of Agile and Waterfall methodologies.
  • Experience building applications using Java, Python and UNIX shell scripting.
  • Experience in consuming web services with Apache Axis using JAX-RS (REST) APIs.
  • Experienced with the build tools Maven and Ant and the logging tool Log4j.
  • Experience in programming and developing Java modules for an existing Java-based web portal using technologies like JSP, Servlets, JavaScript, HTML and SOA with MVC architecture.
  • Expertise in ingesting real-time and near-real-time data using Flume, Kafka and Storm.
  • Good knowledge of NoSQL databases like MongoDB, Cassandra and HBase.
  • Strong development skills in Hadoop, HDFS, MapReduce, Hive, Sqoop and HBase, with a solid understanding of Hadoop internals.
  • Excellent knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, MRv1 and MRv2 (YARN).
  • Hands-on experience in installing, configuring and using Apache Hadoop ecosystem components like Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, Apache Crunch, Zookeeper, Sqoop, Hue, Scala and Avro.
  • Strong programming skills in designing and implementing multi-tier applications using Java, J2EE, JDBC, JSP, JSTL, HTML, CSS, JSF, Struts, JavaScript and JAXB.
  • Extensive experience in SOA-based solutions: Web Services, Web API, WCF and SOAP, including RESTful API services.
  • Good knowledge of Amazon Web Services (AWS) offerings like EMR and EC2, which provide fast and efficient processing for Teradata Big Data analytics.
  • Experience collecting log data and JSON data into HDFS using Flume and processing the data using Hive/Pig.
  • Experience in working with Eclipse IDE, NetBeans and Rational Application Developer.
  • Experience in using PL/SQL to write stored procedures, functions and triggers.
  • Expertise in developing simple web-based applications using J2EE technologies like JSP, Servlets and JDBC.
  • Experience working on EC2 (Elastic Compute Cloud) cluster instances, setting up data buckets on S3 (Simple Storage Service) and setting up EMR (Elastic MapReduce).
  • Worked extensively with Core Java, Struts 2, JSF 2.2, Spring, Hibernate, Servlets and JSP, with hands-on experience in PL/SQL, XML and SOAP.
  • Well versed in working with relational database management systems such as Oracle, MS SQL Server and MySQL.
  • Hands-on experience with Hadoop/Big Data technologies for storage, querying, processing and analysis of data.
  • Hands-on experience with the XML suite of technologies: XML, XSL, XSLT, DTD, XML Schema, SAX, DOM and JAXB.
  • Experience in working with web servers like Apache Tomcat and application servers like IBM WebSphere and JBoss.


Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig 0.17, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper

Cloud Platform: Amazon AWS, EC2, S3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory

Hadoop Distributions: Cloudera, Hortonworks, MapR

Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets

Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS

Web Technologies: HTML, CSS, JavaScript, JQuery 3.3, Bootstrap 4.1, XML, JSON, AJAX

Databases: Oracle 12c/11g, SQL

Operating Systems: Linux, Unix, Windows 10/8/7

IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven

NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB

Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere

SDLC Methodologies: Agile, Waterfall

Version Control: Git, SVN, CVS


Confidential - Arlington, VA

Big Data/Hadoop Developer


  • As a Sr. Big Data/Hadoop Developer, worked on Hadoop ecosystems including Hive, MongoDB, Zookeeper and Spark Streaming with the MapR distribution.
  • Developed Big Data solutions focused on pattern matching and predictive modeling.
  • Involved in Agile methodologies, daily scrum meetings and sprint planning.
  • Primarily involved in the data migration process using Azure, integrating with a GitHub repository and Jenkins.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Built code for real-time data ingestion using Java, MapR Streams (Kafka) and Storm.
  • Involved in various phases of development; analyzed and developed the system following the Agile Scrum methodology.
  • Worked on Apache Solr, which is used as the indexing and search engine.
  • Involved in development of the Hadoop system and in improving multi-node Hadoop cluster performance.
  • Worked on analyzing the Hadoop stack and different Big Data tools including Pig, Hive, HBase and Sqoop.
  • Developed a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked with different data sources like Avro data files, XML files, JSON files, SQL server and Oracle to load data into Hive tables.
  • Used J2EE design patterns like Factory pattern & Singleton Pattern.
  • Used Spark to create structured data from large amounts of unstructured data from various sources.
  • Used Amazon EMR to process Big Data across a Hadoop cluster of virtual servers on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3).
  • Performed transformations, cleaning and filtering on imported data using Hive, MapReduce and Impala and loaded the final data into HDFS.
  • Developed Python scripts to find vulnerabilities in SQL queries through SQL injection testing.
  • Experienced in designing and developing POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
  • Extracted real-time feeds using Spark Streaming, converted them to RDDs, processed the data as DataFrames and loaded it into Cassandra.
  • Involved in data acquisition, data pre-processing and data exploration for a telecommunication project in Scala.
  • Implemented a distributed messaging queue to integrate with Cassandra using Apache Kafka and Zookeeper.
  • Specified the cluster size, resource pool allocation and Hadoop distribution by writing the specifications in JSON format.
  • Imported weblogs and unstructured data using Apache Flume and stored the data in a Flume channel.
  • Exported event weblogs to HDFS by creating an HDFS sink that deposits the weblogs directly in HDFS.
  • Used RESTful web services with MVC for parsing and processing XML data.
  • Utilized XML and XSL Transformation for dynamic web-content and database connectivity.
  • Involved in loading data from the UNIX file system to HDFS, and in designing schemas, writing CQL queries and loading data using Cassandra.
  • Built the automated build and deployment framework using Jenkins, Maven, etc.

Environment: Hadoop 3.0, Hive 2.3, MongoDB, Spark, Agile, GitHub, Jenkins, MapReduce, Java, Kafka 2.0, Solr, Pig 0.17, Sqoop 1.4, HBase, XML, Avro, JSON, Impala, SQL, Oracle 12c, Cassandra 3.11, Scala, Zookeeper, UNIX, Maven.

Confidential - Lowell, AR

Hadoop Developer


  • Involved in all phases of Software Development Life Cycle (SDLC) using Agile.
  • Responsible for building and configuring distributed data solution using MapR distribution of Hadoop.
  • Involved in complete Big Data flow of the application data ingestion from upstream to HDFS, processing the data in HDFS and analyzing the data.
  • Involved in Agile methodologies, daily Scrum meetings.
  • Developed Hive queries on external tables to perform various analyses.
  • Used Hue for running Hive queries. Created partitions according to the data using Hive to improve performance.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Responsible for loading data from UNIX file systems to HDFS. Installed and configured Hive and wrote Hive UDFs.
  • Developed Spark applications using Scala and implemented an Apache Spark data processing project to handle data from various RDBMS and streaming sources.
  • Used Spark SQL on DataFrames to access Hive tables in Spark for faster data processing.
  • Designed and developed jobs to validate the data post migration, such as reporting fields from source and destination systems, using Spark SQL, RDDs and DataFrames/Datasets.
  • Used the Spark DataFrame API to process structured and semi-structured files and load them into an S3 bucket.
  • Used Spark DataFrame operations to perform required validations on the data and to perform analytics on the Hive data.
  • Used different Spark modules like Spark Core, Spark SQL, Spark Streaming, Datasets and DataFrames.
  • Created end-to-end Spark-Solr applications using Scala to perform data cleansing, validation and transformation according to the requirements.
  • Worked on Apache Solr for indexing and load-balanced querying to search for specific data in larger datasets.
  • Worked on Spark Streaming and Spark Structured Streaming using Apache Kafka for real-time data processing.
  • Responsible for developing multiple Kafka Producers and Consumers from scratch as per the software requirement specifications.
  • Designed column families in Cassandra, ingested data from RDBMS, performed data transformations and then exported the transformed data to Cassandra as per the business requirements.
  • Experience in creating Impala views on Hive tables for fast access to data.
  • Experienced in running queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
  • Involved in reading data formats like Gzip, Avro and Parquet and compressing the data according to the business logic by writing generic code.
  • Extracted real-time feeds using Kafka and Spark Streaming, converted them to RDDs, processed the data as DataFrames and saved it in Parquet format in HDFS.
  • Developed workflows using Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Used the JSON and XML SerDes for serialization and deserialization to load JSON and XML data into Hive tables.
  • Working experience on Cloudera Hadoop distribution version CDH5 for executing the respective scripts.
  • Worked on multiple clusters in managing the Data in HDFS for Data Analytics.
  • Gathered business requirements and designed and developed the data ingestion layer and presentation layer.
  • Highly motivated and versatile team player with the ability to work independently & adapt quickly to the environment.
  • Performed ad-hoc queries on structured data using HiveQL and used partitioning, bucketing and joins in Hive for faster data access.

Environment: Hadoop 3.0, Agile, HDFS, Apache Hive 2.3, Sqoop 1.4, UNIX, Spark, Scala, Apache Solr, Kafka, Cassandra 3.11, Impala, Avro, Parquet, JSON, XML, CDH5, Cloudera.

Confidential - Bellevue, WA

Sr. Spark Developer


  • Created Spark applications, making extensive use of Spark DataFrames and the Spark SQL API.
  • Developed Java modules implementing business rules and workflows using the Spring MVC web framework.
  • Performed performance testing, analysis and tuning of J2EE applications.
  • Developed the Product Builder UI screens using AngularJS, Node.js, HTML5, CSS and JavaScript.
  • Created Sqoop scripts to import user profile and sale order purchase information from Teradata to S3 data store.
  • Developed various Spark applications using Scala to perform various enrichments, aggregations and other business metrics processing click stream data along with user profile data.
  • Worked on fine-tuning Spark applications/jobs to improve the efficiency and overall processing time of the pipelines.
  • Developed the Hibernate layer, including mapping files, the configuration file and classes to interact with the database.
  • Installed and configured Hadoop, MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Worked extensively on the Spring Framework, implementing Spring MVC, Spring Security, IoC (dependency injection) and Spring AOP.
  • Used AngularJS, Bootstrap and AJAX to get data from the server asynchronously using JSON objects.
  • Used Oracle as the database and Toad for query execution; involved in writing SQL scripts and PL/SQL code for procedures and functions.
  • Developed RESTful web services using Java, Spring Boot and NoSQL databases like MongoDB.
  • Developed classes using Core Java (multithreading, concurrency, collections, memory management) and some Spring IoC.
  • Designed and developed the UI application using React JS, REST, Spring MVC, Spring Data JPA, Spring Batch, Spring Integration and Hibernate.
  • Utilized the Spark Scala API to implement batch processing of jobs.
  • Used broadcast variables, efficient joins, transformations and other Spark capabilities for data processing.
  • Used Spark-SQL to perform event enrichment and to prepare various levels of user behavioral summaries.
  • Designed dynamic and browser compatible pages using HTML5, DHTML, CSS3 and JavaScript.
  • Used AngularJS and Bootstrap to consume the service and populate the page with the returned products and pricing.
  • Used RESTful services in conjunction with Ajax calls using JAX-RS and Jersey.
  • Designed and developed the application using the Spring and Hibernate frameworks.
  • Developed an intranet web application using J2EE architecture, using JSP to design the user interfaces and Hibernate for database connectivity.
  • Worked extensively with Eclipse as the IDE to develop, test and deploy the complete application.

Environment: Java, J2EE, JavaScript, HTML5, Hibernate 4.2, Hadoop 3.0, MapReduce, HDFS, Ajax, Oracle 11g, PL/SQL, MongoDB, SQL, Scala, Spark, Kafka, Storm, Zookeeper, Eclipse

Confidential - Dallas, TX

Java/J2EE Developer


  • Worked as a Java/J2EE Developer on middleware architecture using Java technologies like J2EE and Servlets and application servers like WebSphere and WebLogic.
  • Implemented MVC architecture by separating the business logic from the presentation layer using Spring.
  • Involved in documentation and use case design using UML modeling, including development of class diagrams, sequence diagrams and use case diagrams.
  • Worked extensively on an n-tier architecture system with application development using Java, JDBC, Servlets, JSP, Web Services, WSDL, SOAP, Spring, Hibernate, XML, SAX and DOM.
  • Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
  • Developed UI using HTML, CSS, Bootstrap, JQuery, and JSP for interactive cross browser functionality and complex user interface.
  • Developed Service layer interfaces by applying business rules to interact with DAO layer for transactions.
  • Used the Spring MVC framework for writing controllers, validations and views.
  • Provided utility classes for the application using Core Java and extensively used Collection package.
  • Used Core Spring for Dependency Injection of various component layers.
  • Used SOA REST (JAX-RS) web services to provide/consume the Web services from/to down-stream systems.
  • Developed web-based reporting for a credit monitoring system with HTML, CSS, XHTML, JSTL and custom tags using Spring.
  • Developed user interface using JSP, JSP Tag libraries and Struts Tag Libraries to simplify the complexities of the application.
  • Implemented Business Logic using POJO's and used WebSphere to deploy the applications.
  • Used the build tool Maven to build JAR and WAR files and Ant to bundle all source files and web content into WAR files.
  • Worked on various SOAP and RESTful services used in various internal applications.
  • Developed JSP and Java classes for various transactional/ non-transactional reports of the system using extensive SQL queries.
  • Worked on analyzing Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
  • Implemented Storm topologies to pre-process data before moving it into HDFS.
  • Implemented POC to migrate MapReduce programs into Spark transformations using Spark and Scala.
  • Involved in configuring builds using Jenkins with Git and used Jenkins to deploy the applications onto the Dev and QA environments.
  • Involved in unit testing, system integration testing and enterprise user testing using JUnit.
  • Used Maven to build, run and create Aerial-related JARs and WAR files among other uses.
  • Used JUnit for unit testing of the system and Log4J for logging.
  • Worked with production support team in debugging and fixing various production issues.

Environment: Java, Spring 3.0, XML, Jenkins, Hibernate 3.0, JUnit, HTML 4.0.1, CSS, AngularJS, Bootstrap, WebSphere, Maven 3.0, Eclipse, Spark, JQuery.


Java Developer


  • Involved in the complete Software Development Life Cycle (SDLC) including Requirement Analysis, Design, Implementation, Testing and Maintenance.
  • Used Core Java to design application modules, base classes and utility classes.
  • Used Dependency Injection (DI) / Inversion of Control (IoC) with annotations to obtain bean references in the Spring Framework.
  • Involved in implementation of the application by following Java best practices and patterns.
  • Used both Java objects and the Hibernate framework to develop business components that map the Java classes to the database.
  • Used the Spring Framework for dependency injection and transaction management. Used Spring MVC controllers for the controller part of the MVC architecture.
  • Implemented Business Logic using POJO's and used WebSphere to deploy the applications.
  • Used the Spring MVC framework for writing controllers, validations and views.
  • Used Eclipse as IDE for development of the application.
  • Built data-driven web applications with server-side Java technologies like Servlets/JSP and generated dynamic web pages with JavaServer Pages (JSP).
  • Involved in mapping the data representation from the MVC model to the Oracle relational data model with a SQL-based schema using Hibernate, an object/relational mapping (ORM) solution.
  • Used the Spring IoC framework to integrate with Hibernate.
  • Implemented a Maven script to create the JAR and dependency JARs and deploy the entire project onto the WebLogic application server.
  • Coded JavaBeans and implemented Model View Controller (MVC) Architecture.
  • Developed Client applications to consume the Web services based on both SOAP and REST protocol.
  • Used Log4j for logging and for debugging the application.
  • Created and implemented Oracle Queries, functions using SQL and PL/SQL.
  • Involved in bug fixing during the System testing, Joint System testing and User acceptance testing.
  • Worked on various SOAP and RESTful services used in various internal applications.
  • Consumed REST-based microservices through RESTful APIs using RestTemplate.
  • Developed the front-end web application using AngularJS along with cutting-edge HTML and CSS.
  • Developed a processing component to retrieve customer information from a MySQL database and developed the DAO layer using Hibernate.
  • Used Maven for developing build scripts and deploying the application onto WebLogic.

Environment: Java, Spring, Hibernate, MVC, POJO, WebSphere, Eclipse, Maven, JavaBeans, SOAP, Log4j, SQL, PL/SQL, CSS, MySQL
