
Hadoop and Spark Developer Resume


Atlanta, GA

PROFESSIONAL SUMMARY:

  • 8+ years of IT experience, including 6 years in Big Data Analytics across the Telecommunications, Insurance, and Retail domains and 2+ years in Data Warehousing.
  • Hands-on experience with Hadoop technologies such as HDFS, Hive, Pig, Spark, Sqoop, Impala, Flume, Pentaho Kettle, and Kafka.
  • Hands-on experience with cloud technologies such as Deep.io, RabbitMQ, NiFi (transformations), and AWS S3 buckets.
  • Good experience importing and exporting data between different systems and the Hadoop file system using Sqoop.
  • Experience using Hadoop ecosystem components to store and process data, and exporting data to Tableau over a live connection.
  • Good experience creating databases, tables, and views in HiveQL, Impala, and Pig Latin.
  • Hands-on experience with in-memory data processing using Apache Spark.
  • Proficient in Spark with Scala: loading data from local file systems, HDFS, Amazon S3, and relational and NoSQL databases such as Cassandra using Spark SQL, importing data into RDDs, and ingesting data from a range of sources with Spark Streaming (see the Spark SQL sketch after this list).
  • Hands-on experience writing MapReduce jobs and building efficient Hive, Pig, and MapReduce scripts.
  • Developed Apache Spark jobs using Scala in test environment for faster data processing and used Spark SQL for querying.
  • Strong knowledge of Hadoop and Hive analytical functions.
  • Implemented proofs of concept on the Hadoop stack and various big data analytics tools, including migrations from databases such as Oracle and MySQL to a Hadoop data lake.
  • Hands-on experience using Oozie to build and schedule Hadoop workflows.
  • Developed ETL processes to load data from multiple sources into HDFS; used Flume to handle large volumes of streaming data ingestion and Sqoop to transfer large data sets between HDFS and relational databases.
  • Working knowledge of Spark and NoSQL databases such as HBase and Cassandra.
  • Hands-on experience with RDBMS, including writing complex SQL queries, stored procedures, and triggers.
  • Experience with CI/CD tools such as Git, Gerrit, and Jenkins.
  • Worked with Agile tools such as Jira and Confluence.
  • Experience in SQL and PL/SQL.
  • Knowledge of the Java Virtual Machine (JVM) and multithreaded processing.
  • Experience in Java programming.
  • Strong communication, interpersonal, and presentation skills.
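A minimal sketch of the Spark SQL ingestion and analytic querying described above, assuming Spark 2.x with Hive support. The connection string, table, column names, and paths are illustrative placeholders, not details from the actual projects; a production pipeline would also handle credentials and partitioning properly.

```scala
import org.apache.spark.sql.SparkSession

object JdbcIngestSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("jdbc-ingest-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Pull a relational table into Spark over JDBC (the Spark analogue
    // of a Sqoop import); all connection details are placeholders.
    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
      .option("dbtable", "SALES.ORDERS")
      .option("user", sys.env("DB_USER"))
      .option("password", sys.env("DB_PASS"))
      .load()

    // Persist to HDFS as Parquet for downstream Hive/Impala access.
    orders.write.mode("overwrite").parquet("hdfs:///data/raw/orders")

    // Query with an analytical (window) function via Spark SQL.
    orders.createOrReplaceTempView("orders")
    spark.sql(
      """SELECT customer_id, order_date, amount,
        |       RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk
        |FROM orders""".stripMargin).show()

    spark.stop()
  }
}
```

Such a job would be run with spark-submit, with the Oracle JDBC driver on the classpath.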

TECHNICAL SKILLS:

Hadoop Framework: HDFS, MapReduce, Apache Spark, Spark SQL, Hive, Pig, HBase, Sqoop, Flume, Storm, Kafka, Impala, Cloudera CDH.

Databases: Hive, Impala, Oracle 11g/10g/9i.

Languages: SQL, PL/SQL, Core Java.

Tools: TOAD, Oracle SQL Developer.

Operating Systems: UNIX, Linux, Microsoft Windows 95/98/2000/NT/XP, MS-DOS.

IDE Tools: Eclipse.

Web Technologies: HTML, DHTML, JavaScript, JSP, CSS and XML.

Servers: Apache Tomcat

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Hadoop and Spark Developer

Responsibilities:

  • Contributed to the architecture and the initial framework for the EDS Data Lake project.
  • Worked on data integration and ingestion from SAP, Oracle, and SQL Server source systems and the EDW into Hadoop.
  • Participated in an Agile project development lifecycle using Git, Gerrit, and Jenkins for the CI/CD process.
  • Worked on setting up key components for the project like Kerberos authentication renewals, password encryption mechanism in Hadoop and creation of environment profiles for ease of code deployments to higher environments.
  • Worked on data modeling and design of Hive and HBase Table structures based on the project reporting and analytic needs.
  • Developed shell scripts and Spark SQL jobs to handle large volumes of ETL workloads.
  • Worked on the development and implementation of incremental (CDC) data loads from source systems into Hadoop using Apache Spark SQL (see the sketch after this list).
  • Worked extensively with Hive and HBase for data validation and analysis.
  • Designed Oozie workflows and coordinators to enable scheduling and automation of ETL jobs.
  • Worked with utilities such as TDCH (Teradata Connector for Hadoop) to load data from Teradata into Hadoop.
  • Worked with Sqoop, Flume and Pig.
  • Worked on projects involving both on-prem and cloud data integration.
  • Developed processes to integrate event data from Deep.io, RabbitMQ, and NiFi (transformations) and load it into AWS S3 buckets.
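A simplified sketch of the watermark-driven incremental (CDC) load pattern mentioned above, assuming Spark 2.x with Hive support. The table and column names (lake.orders, staging.orders_extract, updated_at) are hypothetical; a real pipeline would also reconcile updates and deletes rather than only appending.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object IncrementalLoadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cdc-load-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Highest change timestamp already landed in the lake (the watermark).
    val lastLoaded = spark
      .sql("SELECT MAX(updated_at) FROM lake.orders")
      .first()
      .getTimestamp(0)

    // Keep only the rows changed since the last load from the staged extract.
    val delta = spark.table("staging.orders_extract")
      .filter(col("updated_at") > lastLoaded)

    // Append the delta to the lake table.
    delta.write.mode("append").insertInto("lake.orders")

    spark.stop()
  }
}
```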

Environment: Apache Hadoop, Pig, Hive, Sqoop, Spark, Spark Streaming, Spark SQL, Kafka, MapReduce, HDFS, Linux, Oozie, Hue, AWS, NiFi, RabbitMQ, RHEL.

Confidential, San Francisco, CA

Hadoop and Spark Developer

Responsibilities:

  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Analyzed Hadoop clusters and other big data analytical tools, including Hive, Pig, and databases such as HBase.
  • Worked on analyzing/transforming the data with Hive and Pig.
  • Developed Spark scripts using Scala shell commands as per requirements.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDFs using both DataFrames/SQL and RDD/MapReduce in Spark 1.5 for data aggregation and queries, writing data back into the OLTP system directly or through Sqoop (see the Spark UDF sketch after this list).
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Performed real-time analysis on the incoming data.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate output responses.
  • Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports (see the streaming sketch after this list).
  • Involved in loading data from the Linux file system into HDFS.
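A minimal sketch of registering and using a UDF with DataFrames/SQL in the Spark 1.5 style referenced above. The UDF logic, table name, and output path are hypothetical placeholders, not the actual project code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SparkUdfSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("spark-udf-sketch"))
    val sqlContext = new HiveContext(sc)

    // Register a simple UDF usable from Spark SQL (placeholder logic).
    sqlContext.udf.register("normalize_state",
      (s: String) => if (s == null) null else s.trim.toUpperCase)

    // Aggregate a Hive table with the UDF and persist the result,
    // e.g. for a later Sqoop export back to the OLTP system.
    val byState = sqlContext.sql(
      """SELECT normalize_state(state) AS state, COUNT(*) AS cnt
        |FROM db.customers
        |GROUP BY normalize_state(state)""".stripMargin)

    byState.write.mode("overwrite").parquet("hdfs:///data/agg/customers_by_state")
    sc.stop()
  }
}
```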
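And a sketch of the Kafka-to-Hadoop real-time ingestion, using the direct-stream API from the Spark 1.x spark-streaming-kafka module. The broker address, topic, batch interval, and output path are assumptions for illustration; the daily Oozie job would then process the landed files.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaIngestSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-ingest-sketch")
    val ssc = new StreamingContext(conf, Seconds(30)) // 30s micro-batches

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // placeholder
    val topics = Set("events")                                      // placeholder

    val stream = KafkaUtils
      .createDirectStream[String, String, StringDecoder, StringDecoder](
        ssc, kafkaParams, topics)

    // Land each non-empty micro-batch of message values in HDFS.
    stream.map(_._2).foreachRDD { (rdd, time) =>
      if (!rdd.isEmpty())
        rdd.saveAsTextFile(s"hdfs:///data/raw/events/batch-${time.milliseconds}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```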

Environment: Cloudera CDH5, Spark, Hive, Pig, Oozie, Spark Streaming, Spark SQL, Sqoop.

Confidential, San Ramon, CA

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs for data cleaning and preprocessing.
  • Involved in importing data from MySQL into HDFS using Sqoop.
  • Worked on analyzing/ transforming the data with Hive and Pig.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Successfully loaded files to Hive and HDFS from traditional databases.
  • Gained good experience with NoSQL databases such as HBase.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Experienced in managing and reviewing Hadoop log files.
  • Involved in writing Hive queries to load and process data in Hadoop File System.
  • Experience writing custom UDFs to extend Hive and Pig functionality (see the Hive UDF sketch after this list).
  • Exported data from Impala to the Tableau reporting tool and created dashboards on a live connection.
  • Gained very good business knowledge of health insurance, claims processing, fraud suspect identification, and the appeals process.
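A minimal sketch of a custom Hive UDF of the kind mentioned above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API. The masking logic and class name are hypothetical, not the actual project code.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Masks all but the last four characters of an identifier.
class MaskMemberId extends UDF {
  // Hive discovers `evaluate` by reflection at query time.
  def evaluate(input: Text): Text = {
    if (input == null) return null
    val s = input.toString
    new Text(("*" * math.max(0, s.length - 4)) + s.takeRight(4))
  }
}
```

After packaging into a JAR, the function would be wired up in Hive with `ADD JAR` and `CREATE TEMPORARY FUNCTION mask_member_id AS 'MaskMemberId'`.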

Environment: Cloudera CDH4.3, Hadoop, MapReduce, HDFS, Hive, Pig, Java (JDK 1.6), Impala, Tableau

Confidential

Data Warehousing Developer

Responsibilities:

  • Joined the project as a Software Trainee.
  • Created various SQL queries for the application.
  • Created PL/SQL packages, procedures, functions, and triggers required for the system.
  • Developed, debugged, and maintained the packages and procedures required for the system.
  • Involved in requirement analysis and user documentation for the application, and created the BRD for the project.
  • Involved in maintaining the quality system during the life cycle of the project.
  • Involved in manual testing to verify successful data processing.
  • Involved in creating technical documentation.
  • Involved in support and maintenance of the system.

Environment: Windows XP, Oracle 9i, PL/SQL, HTML, Java, Unix, Linux, SQL*Plus, Reports, SQL*Loader, TOAD.
