Big Data Developer Resume


  • Over 7 years of development experience using Hadoop, Java and Oracle, including design, development and administration within the Big Data ecosystem.
  • Over 3 years of extensive experience in Big Data, with an excellent understanding of Hadoop architecture and components such as Spark SQL, HDFS, Pig, Hive, HBase, Sqoop, Flume, YARN, ZooKeeper, Kafka and Cassandra.
  • Experience in loading structured, semi-structured and unstructured data into Hadoop from sources such as CSV and XML files, Teradata, MS SQL Server and Oracle.
  • Experience in importing and exporting data in different formats between HDFS/HBase and various relational databases.
  • Experience in writing Scala programs.
  • Exposure to the Python programming language.
  • Expertise in working with distributed and global project teams.
  • Experience in using various Hadoop distributions such as Cloudera and Hortonworks.
  • Good exposure to the YARN environment with Spark and Kafka, and to file formats such as Avro, JSON, XML and sequence files.
  • Experience writing custom UDFs in Pig and Hive based on user requirements.
  • Experience in storing and processing unstructured data using NoSQL databases such as HBase, Cassandra and MongoDB.
  • Experience in writing workflows and scheduling jobs using Oozie.
  • Involved in project planning and in setting up standards for the implementation and design of Hadoop-based applications.
  • Experience working independently and end to end on projects.
  • Proficiency in creating business and technical project documentation.
  • Ability to lead a team and develop a project from scratch.


Hadoop/Big Data: Apache Spark, HDFS, MapReduce, Hive, Pig, Oozie, Flume, ZooKeeper, Sqoop, HBase, Cassandra, Spark Streaming, Kerberos, Zeppelin

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts, Perl, Python, Scala, R

ETL: IBM WebSphere/Oracle

Operating Systems: Sun Solaris, UNIX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle, SQL Server, MySQL, Netezza

Tools and IDE: Eclipse, NetBeans, IntelliJ, Maven, SBT, JDeveloper, DbVisualizer, Toad, SQL Developer

Version control: SVN, Git, Bitbucket



Big Data Developer

Environment: Spark, Spark SQL, Hadoop-HDFS, Pig, Sqoop, Hive, Oozie, MySQL, Scala, Talend, Autosys


  • Involved in complete project life cycle starting from design discussion to production deployment.
  • Performed transformations, cleaning and filtering on imported data using Hive and MapReduce.
  • Developed Spark Core and Spark SQL scripts using Scala for faster data processing.
  • Extensively used the big data analytical and processing tools Hive, Spark Core and Spark SQL for batch processing of large data sets on a Hadoop cluster.
  • Used Spark to improve the performance of and optimize existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, RDDs and YARN.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Imported and exported data between environments such as MySQL and HDFS, and deployed to production.
  • Used Pig as an ETL tool to do transformations, event joins and some pre-aggregations before storing the data onto HDFS.
  • Worked on partitioning and bucketing in Hive tables and set tuning parameters to improve performance.
  • Involved in developing Impala scripts for ad hoc queries.
  • Used Oozie workflow scheduler templates to manage various jobs such as Sqoop, MapReduce, Pig, Hive and shell scripts.
  • Extensively used SVN as a code repository for managing the day-to-day agile project development process and for keeping track of issues and blockers.
  • Developed Oozie workflows with Sqoop actions to migrate data from relational databases such as Oracle, Netezza and Teradata to HDFS.
  • Used Hadoop FS actions to move data from upstream locations to local data lake locations.
  • Created a common data lake for the migrated data to be used by other members of the team.
  • Wrote extensive Hive queries to transform the data for use by downstream models.
  • Developed MapReduce programs as part of predictive analytical model development.
  • Developed Hive queries to analyze the data and generate the end reports to be used by business users.

Confidential, Atlanta, GA

Senior Hadoop/Spark Developer

Environment: Spark, Spark SQL, Hadoop-HDFS, Pig, Sqoop, Hive, Flume, Oozie, MySQL, Scala


  • Migrated jobs from Sqoop and Pig to Spark SQL for faster processing.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
  • Developed Spark code and Spark SQL for faster testing and processing of data.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Analyzed large data sets by running Hive queries and Pig scripts
  • Experienced in collecting, aggregating, and moving large amounts of streaming data into HDFS using Flume.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Scheduled and managed cron jobs and wrote shell scripts to generate alerts.
  • Prepared design documents and functional documents.
  • Added extra nodes to the cluster as requirements dictated, to make it scalable.
  • Involved in running Hadoop jobs for processing billions of records of text data.
  • Involved in loading data from the local file system (Linux) to HDFS.
  • Experienced in running Hadoop Streaming jobs to process terabytes of XML-format data.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing

Confidential, San Francisco, CA

Hadoop Developer

Environment: Hadoop, Hive, Impala, Oracle, Spark, Scala, Pig, Netezza, Sqoop, Oozie, VersionOne, Shell


  • Handled importing of data from various data sources, performed transformations using Hive, Spark and loaded data into HDFS.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Involved in the analysis, design and testing phases, and responsible for documenting technical specifications.
  • Experience with the Hadoop Distributed File System (HDFS) on Cloudera.
  • Developed and executed shell scripts to automate jobs.
  • Wrote complex Hive queries and UDFs.
  • Worked on reading multiple data formats on HDFS using Spark.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Developed multiple POCs using Spark, deployed them on the YARN cluster and compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed the SQL scripts and designed the solution to be implemented using Spark.
  • Used the Autosys automation tool to schedule Oozie jobs based on calendar and file watcher jobs.
  • Facilitated the daily scrum meetings, sprint planning, sprint review and sprint retrospective.
  • Worked on the core and Spark SQL modules of Spark extensively.


Software Engineer

Environment: Core Java, JSP, Servlet, JSTL 1.0, CVS, JavaScript, Oracle9i, SQL, PL/SQL, Triggers, Stored Procedures, WebLogic, Eclipse


  • Responsible for understanding the scope of the project and requirement gathering.
  • Developed the web tier using JSP, Struts MVC to show account details and summary.
  • Created and maintained the configuration of the Spring Application Framework.
  • Implemented various design patterns - Singleton, Business Delegate, Value Object and Spring DAO.
  • Used Spring JDBC to write some DAO classes which interact with the database to access account information.
  • Mapped business objects to database using Hibernate.
  • Involved in writing Spring configuration XML files that contain declarations of objects and their dependencies.
  • Used the Tomcat web server for development purposes.
  • Involved in creation of Test Cases for Unit Testing.
  • Used Oracle as the database and Toad for query execution; also wrote SQL scripts and PL/SQL code for procedures and functions.
  • Used CVS and Perforce as configuration management tools for code versioning and release.
  • Developed the application in Eclipse and used Maven as the build and deployment tool.
  • Used Log4j to print debug, warning and info log messages to the server console.


Software Developer

Environment: Core Java, JSP, Servlet, JSTL 1.0, CVS, JavaScript, Oracle9i, SQL, PL/SQL, Triggers, Stored Procedures, WebLogic, Eclipse


  • Involved in the analysis, design, implementation, and testing of the project
  • Designed the functional specifications and architecture of the web-based module using Java Technologies.
  • Created design specifications using UML class, sequence and activity diagrams.
  • Developed the web application using MVC architecture, Java, JSP, Servlets and Oracle Database.
  • Developed various Java classes, SQL queries and procedures to retrieve and manipulate data from the backend Oracle database using JDBC.
  • Worked extensively with JavaScript for front-end validations.
  • Analyzed business requirements and developed the system architecture document for the enhancement project.
  • Provided impact analysis and test cases.
  • Involved in writing the JDBC connectivity code to interact with the back-end database.
  • Implemented the presentation layer with HTML, XHTML and JavaScript.
  • Designed tables and indexes.
  • Wrote complex SQL queries and stored procedures.
  • Involved in writing the DAO interaction layer as per the requirements.
  • Designed, implemented, tested and deployed the enterprise application using WebLogic as the application server.
  • Involved in fixing bugs and unit testing with test cases using JUnit.
  • Actively involved in the system testing.
  • Involved in implementing the service layer using the Spring IoC module.
