
Hadoop Developer Resume


MI

SUMMARY

  • 10 years of total experience across IT sectors such as banking and healthcare.
  • Experience in installing, configuring, and using Apache Hadoop ecosystem components such as HDFS, MapReduce, ZooKeeper, Oozie, HBase, Hive, Sqoop, Impala, Pig, and Flume.
  • 3+ years of experience as a Hadoop Developer in all phases of Hadoop and HDFS development.
  • Experience in Hadoop Administration and Linux
  • Experience on Hadoop Administration like Cluster configuration, Single Node Configuration, Multi Node Configuration, Data Node Commissioning and Decommissioning, Name Node Backup and Recovery, HBase, HDFS and Hive Configuration, Monitoring clusters, Access control List.
  • Well versed in installing, managing, and upgrading distributions of Hadoop (Hortonworks, Cloudera, etc.).
  • Experience in Hadoop stack, cluster architecture and monitoring the cluster.
  • Experience with HDFS, MapReduce, and the Hadoop ecosystem (Pig, Hive, Oozie, HBase, ZooKeeper, Flume, and Sqoop).
  • Well versed with developing and implementing MapReduce jobs using Hadoop to work with Big Data.
  • Experience with the Spark processing framework, including Spark Core and Spark SQL.
  • Experience in NoSQL databases like HBase.
  • Procedural knowledge in cleansing and analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Experienced in writing custom UDFs and UDAFs for extending Hive and Pig core functionalities.
  • Ability to develop Pig UDFs to pre-process data for analysis.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems (RDBMS) such as Teradata.
  • Skilled in creating Oozie workflows for scheduled (cron-style) jobs.
  • Experienced with Java API and REST to access HBase data.
  • Experience with SQL, PL/SQL and database concepts.
  • Experience with the Hortonworks and Cloudera distributions.
  • Experience in all stages of SDLC (Agile, Waterfall), writing Technical Design document, Development, Testing and Implementation of Enterprise level Data mart and Data warehouses.
  • Experience in core Java for Hadoop Ecosystem.
  • Experience in Tableau Reporting.
  • Have worked in the health insurance and banking domains.
  • Have delivered projects in both Agile and Waterfall models.
  • Extensive experience in the onsite-offshore delivery model.
  • Conducted multiple training sessions on Big Data.
  • Up to date with the latest technologies in the Big Data ecosystem, such as Spark.
  • Good understanding of Data Mining and Machine Learning techniques.
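
The MapReduce development mentioned above can be illustrated with a minimal sketch. This is a hypothetical Hadoop Streaming-style word count written as plain Python functions (not code from any of the projects described); in a real streaming job the mapper and reducer would read stdin and write tab-separated key/value pairs to stdout.

```python
# Hypothetical Hadoop Streaming-style word count, modeled as plain
# functions so the map/shuffle/reduce logic is easy to follow.
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Emit a (word, 1) pair for every word in the input lines.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    # Hadoop delivers pairs grouped/sorted by key; sum counts per word.
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    data = ["big data on Hadoop", "data pipelines on Hadoop"]
    print(dict(reduce_phase(map_phase(data))))
```

The `sorted()` call stands in for Hadoop's shuffle phase, which guarantees that all values for a key arrive at the same reducer in key order.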

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, Map Reduce, Pig, Hive, Sqoop, Flume, Zookeeper, Spark

IDE Tools: Eclipse

Programming Languages: Java, Python, Scala

Databases: Oracle, MySQL, Cassandra, HBase

Web Technologies: HTML, XML, JavaScript

Operating Systems: Windows, UNIX

Analytics & Reporting: Tableau

Messaging Tools: Kafka, Storm

Analytics Tools: Alteryx, R

PROFESSIONAL EXPERIENCE

Confidential, MI

Hadoop Developer

Responsibilities:

  • Responsible for Installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Developed Sqoop scripts to import and export data from relational sources, and handled incremental loading of customer and transaction data by date.
  • Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Worked on partitioning HIVE tables and running the scripts in parallel to reduce run-time of the scripts.
  • Worked on data serialization formats for converting complex objects into byte sequences, using Avro, Parquet, JSON, and CSV formats.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
  • Installed, upgraded, and managed Hadoop clusters and distributions of Hadoop, Hive, and HBase.
  • Advanced knowledge in performance troubleshooting and tuning Hadoop clusters.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Used OOZIE Operational Services for batch processing and scheduling workflows dynamically.
  • Extensively worked on creating End-End data pipeline orchestration using Oozie.
  • Designed and developed a Java API (Commerce API) that provides connectivity to Cassandra through Java services.
  • Responsible for continuous monitoring and managing Elastic MapReduce cluster through AWS console.
  • Evaluated the suitability of Hadoop and its ecosystem for the above project, implementing and validating various proof-of-concept (POC) applications to eventually adopt them and benefit from the Big Data Hadoop initiative.
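
The date-based incremental loading described above can be sketched as follows. This is a simplified Python model of the idea behind Sqoop's `--incremental lastmodified --check-column <col> --last-value <ts>` mode; the record schema and field names are hypothetical, not the actual project code.

```python
# Simplified model of date-based incremental (delta) loading,
# mirroring Sqoop's lastmodified incremental import. The "updated"
# field name is a hypothetical check column.
from datetime import date

def incremental_load(rows, last_value):
    """Return only rows updated after the stored watermark, plus the
    new watermark to persist for the next run."""
    new_rows = [r for r in rows if r["updated"] > last_value]
    new_watermark = max((r["updated"] for r in new_rows), default=last_value)
    return new_rows, new_watermark

if __name__ == "__main__":
    rows = [{"id": 1, "updated": date(2015, 1, 1)},
            {"id": 2, "updated": date(2015, 2, 1)}]
    loaded, watermark = incremental_load(rows, date(2015, 1, 15))
    print(loaded, watermark)
```

Persisting the returned watermark between runs is what makes each load pick up only records changed since the previous cycle.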

Environment: Map Reduce, HDFS, Hive, Pig, HBase, SQL, Sqoop, Flume, Oozie, Apache Kafka, Zookeeper, J2EE, Eclipse, Cassandra.

Confidential, Mclean - VA

Hadoop Developer

Responsibilities:

  • Requirement analysis and Design.
  • Identifying segments of Data as per business values and requirement.
  • Importing and exporting data into HDFS and Hive using Sqoop
  • Created Generic scripts to load into Hive External Table.
  • Wrote HiveQL scripts to create, load, and query tables in Hive.
  • Developed parameterized complex PIG Scripts for change data capture / delta record processing between newly arrived data and already existing data in HDFS.
  • Developed Hive DDLs to create, alter, and drop Hive tables.
  • Experienced in defining job flows in Control M.
  • Experienced in managing and reviewing Hadoop log files
  • Load and transform large sets of structured, semi structured and unstructured data
  • Responsible to manage data coming from different sources
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
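
The change-data-capture / delta record processing mentioned above can be sketched as a key-based comparison between the existing data and a newly arrived batch. This is an illustrative Python stand-in with a hypothetical schema; the project's Pig scripts performed the equivalent join in HDFS.

```python
# Sketch of CDC/delta processing: classify newly arrived records as
# inserts, updates, or unchanged by comparing against existing data
# keyed on a record id (hypothetical schema).
def delta(existing, arrived, key="id"):
    old = {r[key]: r for r in existing}
    inserts = [r for r in arrived if r[key] not in old]
    updates = [r for r in arrived if r[key] in old and r != old[r[key]]]
    unchanged = [r for r in arrived if old.get(r[key]) == r]
    return {"insert": inserts, "update": updates, "unchanged": unchanged}
```

In Pig this classification would typically be a full outer join on the key followed by filters on null/changed fields.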

Environment: Core Java, Python, Apache Hadoop (Cloudera), HDFS, Pig, Hive, Sqoop, Flume, Shell Scripting, MySQL, LINUX, UNIX

Confidential, Columbus, GA

Hadoop Developer

Responsibilities:

  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Understood complex data structures of different types (structured, semi-structured) and de-normalized them for storage in Hadoop.
  • Develop solutions to ingest data into HDFS (Hadoop Distributed File System), process within Hadoop and emit the summary results from Hadoop to downstream analytical systems.
  • Wrote HiveQL scripts to analyze customer data and produce customer call/chat reports.
  • Wrote HiveQL scripts to create, load, and query tables in Hive.
  • Wrote HiveQL scripts to perform sentiment analysis (analyzing customers' comments and product ratings).
  • Developed Pig UDFs for needed functionality that is not available out of the box in Apache Pig.
  • Used SQL Server to access Hive table metadata.
  • Installed and configured Hive and wrote Hive UDFs.
  • Utilized Apache Hadoop environment by Cloudera.
  • Experienced in defining job flows
  • Experienced in managing and reviewing Hadoop log files
  • Load and transform large sets of structured, semi structured and unstructured data
  • Responsible to manage data coming from different sources
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
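
The sentiment analysis mentioned above can be illustrated with a minimal lexicon-based scorer. This Python sketch is a simplified stand-in for the HiveQL-based analysis; the word lists are hypothetical, and the real analysis also weighed product ratings.

```python
# Minimal lexicon-based sentiment scoring (hypothetical word lists),
# a simplified stand-in for HiveQL sentiment analysis on comments.
POSITIVE = {"good", "great", "helpful", "excellent"}
NEGATIVE = {"bad", "slow", "poor", "terrible"}

def sentiment(comment):
    words = comment.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

The same per-word lookup can be expressed in HiveQL by exploding comment text into words and joining against a sentiment dictionary table.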

Environment: Core Java, Apache Hadoop (Cloudera), HDFS, Pig, Hive, HBase, Sqoop, Flume, Shell Scripting, MySQL, LINUX, UNIX

Confidential

ETL Engineer

Responsibilities:

  • Involved in client discussions for requirement gatherings, source to target mappings, high level designs, low level designs, design and finalization of end to end ETL process flow.
  • Created a TDD (Technical Design Document) to document the end-to-end ETL process flow for the hybrid dimension and profile dimension.
  • Created Low level design/ graph templates with overall data flow and components.
  • Led client discussions and presentations over WebEx on the ETL architecture/framework, process flow, and data flow, including an IDM case-study presentation.
  • Worked on performance improvement of individual graphs as well as reduction in overall cycle execution time.
  • Set up and reviewed job dependencies, TWS spreadsheets, and JSDLs.
  • Provided UAT support during warranty period.
  • Worked with PMO office to update/ create Change Control (CC) requests.
  • Performed performance reviews/appraisals of Ap1/Ap2/Ap3/PL grade employees.
  • Interview panelist for Level 1 & Level 2 interviews.
  • Managed WSR and MSR reporting to the customer.
  • Involved in creation of CC (change control) and pricing sheet.
  • Conducted PQI Audit and Security audit for IDM project.

Environment: Ab Initio, Unix, Oracle, Teradata

Confidential

ETL Developer

Responsibilities:

  • Impact analysis of existing warehouse and the existing audit control system.
  • Creation of design documents, and development of Ab Initio graphs and Unix scripts
  • Unit Testing and Integration Testing.

Environment: Ab Initio, Unix, Oracle, Tivoli

Confidential

Automation Developer

Responsibilities:

  • Participated in Framework design Plan
  • Defining Criteria for Selecting Test Cases for Automation
  • Participated in understanding the requirements of business flow
  • Developed Hybrid Framework
  • Creation of Automation Test Scripts
  • Involved in Reviewing Test scripts
  • Created automation test sets and involved in Batch execution.
  • Involved in uploading automation scripts to Quality Center.
  • Involved in KT Sessions in the Project.
  • Involved in preparing User Manual.
  • Being a SME of the automation team, actively participated in all automation KTs and business meetings.

Confidential

Test Engineer

Responsibilities:

  • Resource planning and resource utilization based on work forecast.
  • Preparation of top-down and bottom-up estimates for the testing effort
  • Preparing the test Entry-Exit criteria for different phases of testing
  • Analysis of business requirements and interact/coordinate with development team to convert business requirements to test cases/test scenarios
  • Test Planning and preparation of testing schedules based on project timelines
  • Assigning resources to various testing activities and preparing status reports on those activities
  • Conducted test case peer reviews, test case TL reviews, facilitated test cases review with the client IT and business process teams
