Sr. Hadoop / Spark Developer Resume

SUMMARY

  • 7 years of extensive experience as an IT professional in both technical and project management roles.
  • 3+ years of experience as a developer on the Hadoop/Spark platform, developing code and modules to address customer needs using Hive, Sqoop, Oozie, and other Hadoop components.
  • Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
  • Excellent knowledge of Hadoop components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the MapReduce programming paradigm.
  • Experience in designing and developing POCs in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Experience in migrating data with Sqoop from HDFS to relational database systems and vice versa, according to client requirements.
  • Experience in data analysis using HiveQL, Pig Latin, and HBase.
  • Experience using the Oozie workflow scheduler to manage Hadoop jobs as a Directed Acyclic Graph (DAG) of actions with control flows.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra, and MongoDB.
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
  • Prepared standard coding guidelines and analysis and testing documentation.

TECHNICAL SKILLS

Big Data/Hadoop Technologies: HDFS, YARN, MapReduce, Hive, Pig, Impala, Sqoop, Flume, Spark, Kafka, Zookeeper, and Oozie.

NoSQL Databases: HBase

Languages: Java, Scala

Cluster Management: Ambari

3rd Party Tools: SQL Developer, Toad, SourceTree, Git, Altova XMLSpy, WinSCP, PuTTY, Hue

Databases: DB2, Oracle, MySQL

Operating Systems: UNIX, Windows, Linux

PROFESSIONAL EXPERIENCE

Sr. Hadoop / Spark Developer

Confidential

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Managed jobs using the Fair Scheduler and developed job processing scripts using Oozie workflows.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build the common learner data model, which receives data from Kafka in near real time.
  • Developed Spark scripts using Scala shell commands as required.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts using both DataFrames/SQL and RDD/MapReduce in Spark 1.6 for data aggregation, queries, and writing data back into the OLTP system through Sqoop.
  • Tuned the performance of Spark applications by setting the right batch interval, the correct level of parallelism, and memory settings.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.

Hadoop Developer

Confidential

Responsibilities:

  • Analyzed the Hadoop cluster using various big data analytics tools, including Pig, Hive, and MapReduce.
  • Consumed data from Kafka queues using Spark.
  • Configured different topologies for the Spark cluster and deployed them on a regular basis.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Loaded data from the Linux file system into HDFS.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Used reporting tools such as Tableau, connected to Hive through the ODBC connector, to generate daily data reports.
  • Ran Hadoop streaming jobs to process terabytes of XML-format data.
  • Scheduled Oozie workflows to run multiple Hive and Pig jobs.
  • Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
  • Loaded data files from external sources such as Oracle and MySQL into a staging area in MySQL databases.
  • Actively involved in code reviews and bug fixing to improve performance.
  • Involved in development, building, testing, and deployment to the Hadoop cluster in distributed mode.
  • Processed raw data using Hive jobs and scheduled them in Oozie.
  • Worked with Apache Storm on a Hortonworks cluster.
  • Created HBase tables to store various data formats of incoming data from different portfolios.
  • Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data.
  • Provided daily production support to monitor and troubleshoot Hadoop/Hive jobs.

Environment: Hadoop, HDFS, Pig, Hive, Sqoop, Kafka, Apache Spark, Shell Scripting, HBase, Kerberos, ZooKeeper, Ambari, Hortonworks, MySQL.

Mainframe Developer

Confidential

Responsibilities:

  • Improved user satisfaction and adoption rates by designing, coding, debugging, documenting, maintaining and modifying a number of apps and programs for online banking. Participated in Hadoop training and development as part of a cross-training program.
  • Worked with mainframe technologies such as COBOL, JCL, DB2, and CICS.
  • Led work on the new banking application used to create a new set of contracts in the banking system.
  • Prepared use cases, designed and developed object models and class diagrams.