
Hadoop/Spark Developer Resume



  • IBM-certified Hadoop Level 2 and Spark Level 1 developer with 4 years of experience in information technology.
  • Experienced with Spark, Hadoop, and big data ecosystem components such as Spark Streaming, Spark SQL, HDFS, MapReduce, Hive, Pig, Sqoop, and Kafka for high-performance computing.
  • In-depth understanding of Hadoop architecture.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
  • Experience writing complex SQL queries involving inner and outer joins across multiple tables.
  • Experience in performing ETL operations using Spark/Scala.
  • Comfortable in Unix/Linux and Windows environments.
  • Experience in Spark: writing Spark Streaming jobs and creating Datasets and DataFrames from existing datasets to perform actions on different types of data.
  • Good understanding of NoSQL databases such as HBase and MongoDB.
  • Hands-on experience with Scala, Python, SQL, and PL/SQL.
  • Capable of processing large structured, semi-structured, and unstructured data sets.
  • Skilled in migrating data from different databases to Hadoop HDFS and Hive using Sqoop.
  • Strong programming experience in creating packages, procedures, functions, and triggers using SQL and PL/SQL.
  • Extensive technical experience in Oracle Applications R12/11i.
  • Familiarity with Agile methodology.
  • Strong SQL, ETL and data analysis skills.
  • Excellent team player with a sound problem-solving approach, strong communication and leadership skills, and the ability to work in time-constrained environments.


Hadoop/Big Data Technologies: Hadoop, HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Spark, Kafka, Oozie

Shell Scripting/Programming Languages: SQL, Pig Latin, HiveQL, Python, Scala, Java

Web Technologies: HTML, XML, JSON

Databases/NoSQL Databases: Oracle 9i/10g, PostgreSQL, MongoDB

Database Tools: TOAD, SQL Developer

IDE Tools: IntelliJ, Jupyter Notebook

Operating Systems: Unix/Linux, Windows

Code Repositories: SVN, Git

Tools: Maven, Gradle, Ant

ERP Skills: Oracle Applications R12/11i, Accounts Receivable (AR), Accounts Payable (AP)


Confidential, OH

Hadoop/Spark Developer


  • Ingested data from various data sources into Hadoop HDFS/Hive tables and managed data pipelines that provide DaaS (Data as a Service) to business users and data scientists for analytics.
  • Ingested data into HDFS using Sqoop from various RDBMS sources and CSV files.
  • Performed data cleansing and transformation tasks using Spark with Scala.
  • Implemented data consolidation using Spark and Hive to generate data in the required formats.
  • Performed ETL tasks for data repair, data massaging to identify sources for audit purposes, and data filtering, storing the results back to HDFS.
  • Worked on real-time data processing using Spark Streaming and Kafka with Scala.
  • Used Spark RDDs in Scala to transform and filter log lines containing “ERROR”, “FAILURE”, or “WARNING”, then stored the results in HDFS.
  • Worked with different data formats such as Parquet, Avro, SequenceFile, and MapFile.
  • Uploaded data to Hadoop Hive and combined new tables with existing databases.
  • Wrote Scala programs using Spark on YARN for analyzing data.
  • Worked on a use case involving MongoDB (NoSQL) for faster data ingestion, update, and retrieval.
  • Wrote SQL queries to retrieve data from MongoDB.

Environment: Spark, Kafka, Hadoop, HDFS, YARN, Hive, Impala, Pig, Flume, Oozie, MongoDB, Sqoop, Scala, Linux, PySpark.

Confidential, CA

Big Data Engineer


  • Designed and developed a big data analytics platform for processing data using Hadoop, Hive, and Pig.
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured data.
  • Exported data to RDBMS servers using Sqoop and processed that data for ETL operations.
  • Worked hands-on with the ETL process and developed Hive scripts for extraction, transformation, and loading of data into other data warehouses.
  • Loaded the aggregate data into a relational database for reporting, dashboarding, and ad hoc analyses.
  • Applied partitioning and bucketing techniques in Hive to improve performance.
  • Designed the data model on Hive and optimized Hive queries.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.

Environment: Hadoop, Hive, Pig, Sqoop, ZooKeeper, HDFS

Confidential, CA

Oracle Application Technical Developer


  • Worked as a developer on the Custodial project.
  • The Custodial project is designed to perform disbursement processing and reconciliation activities for clients.
  • Developed the Technical Design (MD070) documents for all enhancements I was involved in, based on the Functional Specifications (MD050).
  • Created an inbound interface to load check request details into Oracle AP.
  • Created a trigger that fires when a payment batch is confirmed.
  • Created an outbound interface to send payment batch details.
  • Created an XML Publisher report to send outstanding check details.
  • Worked as a developer on the R12 upgrade project (Refunds).

Environment: Oracle Applications R12, Oracle Payables, PL/SQL, XML Publisher, TOAD.


Software Developer


  • Designed and developed quick solutions using design patterns.
  • Analyzed and developed solutions for legacy projects by debugging issues.
  • Supported the production team by analyzing and providing requested information using PL/SQL.
  • Gathered requirements by interacting with various team members to integrate services.
  • Created parsers for quick analysis of data and data extraction (XML & JSON).

Environment: Java 1.6, IntelliJ, Oracle 11i, SQL, PL/SQL


Software Developer Intern


  • Worked as a software developer on the Cloud Connector Framework project.
  • Developed a low-cost Zigbee network that monitors temperature.
  • Developed a Zigbee-based home automation system connected through a gateway.
  • Displayed all monitored parameters at a common point.

Environment: XCTU Tool, MATLAB
