Spark Developer Resume

Warren, NJ


  • Over 6 years of IT industry experience as a Hadoop and Java Developer across all phases of the SDLC, including Requirement Gathering, Analysis, Design, Development, Construction, Testing, UAT and Maintenance, with timely delivery against aggressive deadlines.
  • Over 2.5 years of experience with Hadoop (HDFS, MapReduce), Spark, Scala, and Hadoop Ecosystem Components (Pig, Hive, Sqoop & Zookeeper).
  • Hands-on experience in creating MapReduce jobs and manipulating data and tasks in Hadoop HDFS using Java.
  • Hands-on experience in writing Pig Latin scripts and using Pig commands to analyze data sets.
  • Hands-on experience in installing, configuring and maintaining ecosystem components including Hadoop, Sqoop, Pig, Hive & Spark.
  • Experience in database development using SQL and PL/SQL and experience working on databases like Oracle 10g/11g/12c and SQL Server. Effectively made use of Table Functions, Indexes, Table Partitioning, Collections, Analytical functions and performance tuning.
  • Experience working on NoSQL databases including HBase and Cassandra.
  • Experience using Sqoop to import data into HDFS from RDBMS (Oracle, MySQL, SQL Server).
  • Extensive experience in Java and J2EE technologies like Servlets, JSP, JDBC, JavaScript/CSS, Hibernate, and JUnit testing.
  • Wrote Bash shell scripts on UNIX to run Sqoop and Hive jobs through the resource manager scheduler; scheduled jobs and updated the scripts as requirements changed.
  • Expertise in using J2EE application servers such as JBoss and web servers like Apache Tomcat.
  • Experienced in Java GUI/IDE tools such as Eclipse and NetBeans.
  • Experienced in database GUI/IDE tools such as TOAD, SQL Developer and ERwin.
  • Involved in Data Extraction, Transformation and Loading (ETL) from source to target using Informatica PowerCenter.
  • Worked with Git for local and centralized version control, recording changes to sets of files for future recall.
  • Effective team player with excellent communication skills and the insight to determine priorities, schedule work and meet critical deadlines.


Big Data Ecosystem: Hadoop 2.x (MapReduce, HDFS), HBase 1.2.x, Spark 1.6.1/2.1.1, Hive 1.2.1/2.1.0, Pig 0.13.0/0.14.0, Sqoop 1.4.6, Kafka, Flume 1.5.2

Database: Oracle 10g/11g, MySQL 5.5, NoSQL (HBase 1.1.2, Cassandra 2.1.0)

Java Technologies: Java 1.7/1.8, AJAX, Spring, Hibernate

Methodologies: Agile, UML, Waterfall, Design Patterns

Programming Languages: Scala 2.1.1, Java 1.7/1.8, Oracle PL/SQL, Python 2.7+/3.5+, Bash Shell Scripting

Operating Systems: Windows XP/7/8/10, Linux CentOS, Linux Ubuntu, UNIX

Web Tools: HTML5/CSS3/JavaScript, XML, JQuery

Testing API: JUnit, Cucumber 1.11.0, SOAP, JSON


Confidential, Warren, NJ

Spark Developer


  • Extracted data from Kafka, converted it into DStreams in Spark Streaming, and performed transformations to meet different feature requirements.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on data received from Kafka in near real time, building the common learner data model and persisting it into Hive.
  • Tuned the performance of Spark applications by setting the right batch interval, the correct level of parallelism, and memory settings.
  • Extracted historical data from offline sources to enrich the view information of real time streaming data.
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Created and updated Hive tables for offline data storage.
  • Loaded offline data from the Hive database to join with transformed RDDs in order to generate the required dataset.
  • Replaced and complemented existing Spark batch jobs with Spark Streaming jobs to enable near real time data analysis.
  • Optimized existing algorithms using SparkContext, Spark SQL, DataFrames and pair RDDs.
  • Developed UDFs in Java as needed for use in Hive queries.
  • Supported and monitored MapReduce programs running on the cluster.
  • Monitored logs and responded accordingly to any warning or failure conditions.

Environment: Apache Hadoop 2.6.0, HDFS, Hive 1.2.1, Java 1.8, Spark 1.6.1, Kafka, Sqoop 1.4.6, Linux Ubuntu/CentOS

Confidential, New York, NY

Big Data / Hadoop Developer


  • Participated in setting up the 50-node cluster and configured the entire Hadoop platform.
  • Migrated the needed data from Oracle into HDFS using Sqoop and imported various formats of flat files into HDFS.
  • Mainly worked on Hive queries to categorize data of different claims. Integrated the Hive warehouse with HBase.
  • Wrote customized Hive UDFs in Java.
  • Designed and created Hive external tables with partitioning, dynamic partitioning and buckets.
  • Created HiveQL scripts to create, load, and query tables in Hive.
  • Generated final reporting data using Tableau for testing by connecting to the corresponding Hive tables through the Hive ODBC connector.
  • Supported MapReduce programs running on the cluster.
  • Used Pig to process raw metadata from S3 before storing data into the final Hive table.
  • Maintained System integrity of all sub-components (primarily HDFS, MR, HBase, and Hive).
  • Monitored system health and logs and responded accordingly to any warning or failure conditions.
  • Presented data and dataflow using Tableau for reusability.

Environment: Hadoop 2.6.0, HDFS, Hive 1.2.1, MapReduce, Java 1.8, Pig, Sqoop, MySQL 5.5, Tableau

Confidential, Kenilworth, NJ

Hadoop Developer


  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Responsible for loading data files from various external sources such as Oracle and MySQL into the staging area in Hive.
  • Created and maintained the Hive warehouse for Hive analysis.
  • Developed Java MapReduce jobs for aggregation and interest matrix calculation for users.
  • Involved in loading data from the Linux file system to HDFS. Used crontab to schedule periodic jobs on the resource manager.
  • Experienced in managing and reviewing Hadoop log files and in running Hadoop streaming jobs to process terabytes of XML data.
  • Ran various Hive queries on the data dumps and generated aggregated datasets for downstream systems for further analysis.
  • Used Apache Sqoop to dump user data into HDFS on a weekly basis.
  • Generated test cases for the new MR jobs.
  • Prepared the data for consumption by formatting it for upload to the UDB system.

Environment: Hadoop 2.6.0, HDFS, Hive 1.2.1, MapReduce, Java 1.8, Sqoop 1.4.6, Oracle 11g, MySQL 5.5


Java Developer


  • Responsible for requirement gathering and analysis through interaction with end users.
  • Developed a Java/Oracle database web application to process orders for quote services for customers. Wrote web services code to enable the Java application to execute transactions on other web applications.
  • Coded JSP pages to process quote orders and to generate reports on web pages used by the sales team customers.
  • Involved in creating database SQL and PL/SQL queries and stored procedures. Developed JavaBeans, JavaServer Pages (JSP), and PL/SQL procedures and functions to perform business transactions.
  • Developed a web service to communicate with the database using SOAP. Developed DAOs (data access objects) using Spring Framework 3.
  • Wrote Windows batch scripts and shell scripts to automate the FTP process used to deploy the web application.
  • Debugged production issues, working with the support team, Quality Assurance team and other developer teams.
  • Actively involved in backend tuning of SQL queries and DB scripts.
  • Handled user-reported issues on Production and Quality/Test application servers, debugging issues with developers.
  • Wrote commands and scripts using UNIX shell scripting.
  • Worked daily with the team lead to develop the application and fix problems, reporting to the manager weekly.

Environment: Spring 3.2, JSP 2.0, JQuery 1.7, Servlet 3.0, JDBC, Oracle 11g/SQL, JUnit 3.8, CVS 1.2, Eclipse 4.2, DHTML


Java Developer


  • Actively involved in the Requirement gathering for the enhancements to the existing project.
  • Involved in developing design document and impact assessment documents.
  • Involved in designing some of the processes in the application that are developed by other developers.
  • Involved in coordinating with testing teams to resolve defects and provide 24/7 support for UAT.
  • Improved existing procedures and implemented new stored procedures using PL/SQL.
  • Developed business objects, request handlers and JSPs for the Wireless Manager site using Java (Servlets and Beans) and XML in JDeveloper.
  • Implemented Hibernate transaction management for transactions.
  • Developed request handlers, beans, JSPs and data objects in Java.
  • Performed tuning and index creation for improved performance.
  • Created test plans using Quality Center (TestDirector).

Environment: Java 1.7, Servlets 3.0, JSP, XML, AJAX, Hibernate, Oracle 10g