
Hadoop/Spark Developer Resume

Charlotte, NC


  • IT professional with 8+ years of extensive experience in all phases of the SDLC, including 3+ years of strong experience working on the Apache Hadoop ecosystem and Apache Spark.
  • 4 years of experience in design, development, maintenance and support of Big Data analytics using Hadoop ecosystem tools like HDFS, Hive, Sqoop and Pig.
  • Experienced in processing Big data on the Apache Hadoop framework using MapReduce programs.
  • Excellent understanding and knowledge of NoSQL databases like HBase and MongoDB.
  • Good knowledge of Hadoop ecosystem, HDFS, Big Data, RDBMS.
  • Experienced in installing, configuring, supporting and monitoring Hadoop clusters using Apache and Cloudera distributions and AWS.
  • Good knowledge of Hadoop, HBase, Hive, Pig Latin scripts, MapReduce, Sqoop, Flume and HiveQL.
  • Experience in analyzing data using Pig Latin, HiveQL and HBase.
  • Experience in capturing data from existing databases that provide SQL interfaces using Sqoop.
  • Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
  • Implemented proofs of concept on the Hadoop stack and different big data analytics tools, including migration from different databases (e.g., Teradata, Oracle, MySQL) to Hadoop.
  • Worked on NoSQL databases including HBase, Cassandra and MongoDB.
  • Successfully loaded files into Hive and HDFS from MongoDB and HBase.
  • Experience in configuring Hadoop Clusters and HDFS.
  • Worked extensively in Java, J2EE, XML, XSL, EJB, JSP, JSF, JDBC, MVC, Jakarta Struts, JSTL, Spring 2.0, Design Patterns and UML.
  • Extensive experience in Object Oriented Programming, using Java & J2EE (Servlets, JSP, Java Beans, EJB, JDBC, RMI, XML, JMS, Web Services, AJAX).
  • Excellent analytical and problem-solving skills and the ability to quickly learn new technologies.
  • Good communication and interpersonal skills. A very good team player with the ability to work independently.


Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Apache Ignite, GitHub, Agile, Flume, Oozie, ZooKeeper, Kafka, Kerberos

NoSQL Databases: HBase, Cassandra, MongoDB, MarkLogic

Languages: C, Python, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts, R Programming, Storm Linux

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata

Tools and IDEs: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DbVisualizer

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP


Confidential, Charlotte, NC

Hadoop/Spark Developer


  • Involved in complete Big Data flow of the application starting from data ingestion from upstream to HDFS, processing and analyzing the data in HDFS.
  • Developed Spark API to import data into HDFS from Teradata and created Hive tables.
  • Developed Sqoop jobs to import data in Avro file format from an Oracle database and created Hive tables on top of it.
  • Created partitioned and bucketed Hive tables in Parquet file format with Snappy compression, then loaded data into the Parquet Hive tables from the Avro Hive tables.
  • Ran the Hive scripts through Hive, Impala and Hive on Spark, and some through Spark SQL.
  • Involved in performance tuning of Hive from design, storage and query perspectives.
  • Developed a Flume ETL job to handle data from an HTTP source with HDFS as the sink.
  • Collected JSON data from the HTTP source and developed Spark APIs that perform inserts and updates in Hive tables.
  • Developed Spark scripts to import large files from Amazon S3 buckets.
  • Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
  • Provided support to data analysts in running Pig and Hive queries.
  • Developed Spark core and Spark SQL scripts using Scala for faster data processing.
  • Worked with HiveQL and Pig Latin.
  • Imported and exported data between MySQL/Oracle and Hive using Sqoop.
  • Configured an HA cluster for both manual and automatic failover.
  • Designed and built many applications to deal with vast amounts of data flowing through multiple Hadoop clusters, using Pig Latin and Java-based MapReduce.
  • Specified the cluster size, resource pool allocation and Hadoop distribution by writing the specifications in JSON file format.
  • Wrote Solr queries for various search documents.
  • Defined the data flow within the Hadoop ecosystem and directed the team in implementing it.

Environment: HDFS, YARN, MapReduce, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark SQL, Spark Streaming, Eclipse, Oracle, Teradata, PL/SQL, UNIX Shell Scripting, Cloudera.

Confidential, Reston, VA

Hadoop Developer


  • Processed Big Data using a Hadoop cluster consisting of 40 nodes.
  • Designed and configured Flume servers to collect data from the network proxy servers and store to HDFS.
  • Loaded customer profile data, customer spending data and credit data from legacy warehouses onto HDFS using Sqoop.
  • Built a data pipeline using Pig and Java MapReduce to store data onto HDFS.
  • Applied transformations and filtered traffic data using Pig.
  • Used pattern-matching algorithms to recognize the same customer across different sources, built risk profiles for each customer using Hive and stored the results in HBase.
  • Performed unit testing using MRUnit.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Set up and benchmarked Hadoop/HBase clusters for internal use.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Analyzed the data by performing Hive queries and running Pig scripts to study employee behavior.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.

Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HDFS, Flume, Oozie, DB2, HBase, Mahout

Confidential, Hartford, CT

Sr Java Developer


  • Defined business rules according to client specifications and converted them into a high-level technical design.
  • Designed the entire system according to OOP and UML principles using Rational tools.
  • Elaborated use cases, interface definition specifications in collaboration with Business.
  • Used Oracle as the backend database and JDBC for integration.
  • Extensively used TOAD for all DB related activities & integration testing.
  • Used build and deploy scripts in ANT and UNIX shell scripting.
  • Developed User interface screens using Servlets, JSP, JavaScript, CSS, AJAX, HTML.
  • Performed unit testing of developed business units using JUnit.
  • Worked along with the Development team & QA team to resolve the issues in SIT/UAT/Production environments.
  • Coordinated closely with the architect, business analysts and business team on requirement analysis, development and implementation.
  • Used the Spring Framework caching mechanism to preload some of the master information.
  • Implemented the project with scalable code using Java, JDBC and JMS with Spring.
  • Developed Controller Classes, Command Objects, Action Classes, Form Beans, Transfer Objects and Singletons on the server side for handling requests and responses from the presentation layer.

Environment: Core Java, J2EE 1.5/1.6, Struts, Ajax, Rational Rose, Rational RequisitePro, Hibernate 3.0, CVS, RAD 7.0 IDE, Oracle 10g, JDBC, log4j, WebSphere 6.0, Servlets, JSP, JUnit.

Confidential, Columbus, OH

J2EE Developer


  • Implemented the business logic layer.
  • Implemented the business logic layer for MongoDB services.
  • Developed advanced client-side web applications, applying JavaScript and HTTP knowledge.
  • Implemented the presentation layer using CSS, Tiles 3 and jQuery.
  • Responsible for coding DAO classes using the Spring model.
  • Implemented business logic for the service layer.
  • Responsible for internationalization of the application.
  • Responsible for coding POJO classes and Hibernate reverse engineering.
  • Developed Hibernate configuration files for MySQL 5.1, Oracle 10g and 11g, and MongoDB.
  • Designed the front end using JavaScript, jQuery plugins and Ajax.

Environment: J2EE 7, Spring MVC 4.0.3, JSP 2.0, Spring Security 3.1, Ajax, jQuery, Hibernate 4.2.1, JBoss AS 7, Maven, Tiles, Facebook API, MongoDB, Facebook App Developer, Dojo Toolkit, JavaScript, XML
