Hadoop & Spark Developer Resume

Malvern, PA


  • 3 years of professional experience, including analysis, design, development, integration, deployment, and maintenance of quality software applications using Java/J2EE and Hadoop technologies.
  • Hands-on experience with various Hadoop distributions (Apache, Hortonworks, Cloudera, MapR).
  • Developed Spark scripts using Scala shell commands as required.
  • Experience migrating data with Sqoop from HDFS to relational database systems and vice versa, according to client requirements.
  • Good working experience with Hadoop architecture, HDFS, MapReduce, and other components of the Cloudera Hadoop ecosystem.
  • Streamed data in real time using Spark with Kafka.
  • Good knowledge of building Apache Spark applications using Scala.
  • Experience developing data pipelines that use Kafka to store data in HDFS.
  • Involved in moving all log files generated from various sources to HDFS and Spark for further processing.
  • Used Cassandra CQL with Java APIs to retrieve data from Cassandra tables.
  • Experience working with the Spring and Hibernate frameworks in Java.
  • Experience developing web page interfaces using HTML, JSP, and Java Swing.
  • Good Java development skills using J2EE, J2SE, Servlets, JSP, EJB, and JDBC.
  • Wrote web services using SOAP for sending data to and receiving data from the external interface.
  • Built REST API endpoints for various concepts.
  • Created Components using JAVA, Spring and JNDI.
  • Prepared Spring deployment descriptors using XML.
  • Good understanding of and working experience with cloud-based architectures.
  • Working knowledge of other programming languages such as C and C++, and markup languages such as XML and HTML5.
  • Good understanding of and experience with software development methodologies such as Agile and Waterfall.


Programming Languages: C, C++, Java, JavaScript, Python, Scala

Operating Systems: Windows, Linux, UNIX

Database: MySQL, Oracle

No-SQL Database: HBase, Cassandra and MongoDB

Big Data Technologies: HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Kafka, Flume, HBase, Cassandra, MongoDB, Spark, Solr, Impala, Oozie and Zookeeper

Hadoop Distributions: Cloudera, Hortonworks, MapR and Apache

Cloud Platforms: AWS Cloud, Google Cloud

Application Servers: WebLogic, WebSphere

ETL Tools: Talend, Informatica

Build Tools: Maven, Jenkins

Frameworks: Struts, Spring, Hibernate, Spring Boot, Microservices

Development Methodologies: Waterfall, Agile/Scrum


Confidential - Malvern, PA

Hadoop & Spark Developer


  • Loaded data from different sources (DB2, Oracle, and flat files) into HDFS using Sqoop, then loaded it into partitioned Hive tables.
  • Created Pig scripts and wrapped them as shell commands to provide aliases for common operations in the project's business flow.
  • Developed data processing applications using Scala and Python, and implemented Apache Spark Streaming applications that consume from streaming sources such as Kafka and Flume.
  • Replaced existing MapReduce programs with Spark applications written in Scala.
  • Created HBase tables and column families to store the user event data.
  • Created HBase tables to store data arriving in varying formats from different applications.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Transported data from HBase to Hive using MapReduce and Hive-HBase storage handlers. Wrote automated HBase test cases for data quality checks using HBase command-line tools.
  • Implemented the receiver-based approach in Spark Streaming, working with the StreamingContext through the Java API and handling proper waiting for and closing of stages.
  • Worked on Sqoop extensively to ingest data from various source systems into HDFS.
  • Analyzed substantial data sets using Hive queries and Pig scripts.
  • Solved the small-files problem using SequenceFile processing in MapReduce.
  • Integrated multiple data sources (SQL Server, DB2, MySQL) into the Hadoop cluster and analyzed the data through Hive-HBase integration.
  • Extended Hive and Pig core functionality with custom user-defined functions (UDFs).
  • Developed Hive queries to analyze data in HDFS and identify issues and behavioral patterns.
  • Used Oozie to automate the flow of jobs and their coordination in the cluster.
  • Kerberos security was implemented to safeguard the cluster.
  • Used Rally for task/bug tracking.
  • Used Git for version control.

Environment: MapR, Hadoop, HBase, HDFS, Pig, Hive, Drill, Spark SQL, MapReduce, Spark Streaming, Kafka, Flume, Sqoop, Oozie, Spark, Scala, Talend, Shell Scripting, Java, Python, Rally, Git.


Java Developer


  • Involved in analysis, design, and coding in a Java/JSP front-end environment.
  • Responsible, as feature owner, for developing use cases and class and sequence diagrams for the modules using UML and Rational Rose Enterprise Edition.
  • Designed the dynamic stress reporting in C++.
  • Developed the application using Spring, Servlets, JSP, and EJB.
  • Implemented MVC (Model View Controller) architecture.
  • Designed the Application flow using Rational Rose.
  • Extensively involved in release/deployment related critical activities.
  • Tested the entire application using JUnit and JWebUnit.
  • Used web servers like Apache Tomcat.
  • Implemented Application prototype using HTML, CSS and JavaScript.
  • Developed the user interfaces with the Spring tag libraries.
  • Developed build and deployment scripts using Apache Ant to customize WAR, EAR, and EJB JAR files.
  • Prepared field validation and scenario-based test cases using JUnit, and tested the module in three phases: unit testing, system testing, and regression testing.
  • Coded and unit tested according to client standards.
  • Used Oracle Database for data storage and coded stored procedures, functions, and triggers.
  • Wrote SQL queries to interact with the database.
  • Designed and developed XML processing components for dynamic menus in the application.
  • Created Components using JAVA, Spring and JNDI.
  • Prepared Spring deployment descriptors using XML.
  • Good knowledge of C++.
  • Handled problem management during QA, implementation, and post-production support.

Environment: Java, HTML, Spring, JSP, Servlets, C++, BMS, Web Services, JNDI, JDBC, Eclipse, WebSphere, XML/XSL, Apache Tomcat, TOAD, Oracle, MySQL, JUnit, SQL, PL/SQL, CSS.
