Hadoop Engineer Resume

Ann Arbor, MI

SUMMARY:

  • 13 years of experience in software design and development using the Hadoop Big Data ecosystem and Core Java/J2EE across the Telecommunications, Banking and Finance, and Healthcare domains.
  • 5 of those 13 years analyzing, designing, developing, implementing and testing software applications in Big Data analytics using HDFS, MapReduce, Pig, Hive, Sqoop, Spark Core, Spark Streaming, Spark SQL, HBase, Kafka, NiFi, ZooKeeper and Kerberos security.
  • 8 of those 13 years in software design and development using Core Java, Spring Core, Spring AOP, Hibernate, JDBC and XML on UNIX.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
  • Excellent knowledge of the Spark ecosystem (Spark executors, cores and jobs).
  • Involved in data ingestion to HDFS from various data sources.
  • Strong experience with Hadoop distributions such as Hortonworks.
  • Experience in designing and developing Spark applications using the Java API.
  • Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase.
  • Experienced in developing NiFi data flow processors that work with file formats such as text, JSON, Parquet and Avro.
  • Extensive experience importing and exporting data using stream-processing platforms such as Kafka.
  • Developed multiple Kafka producers and consumers from scratch per business requirements (see the producer/consumer sketch after this list).
  • Executed Hive commands for reading, writing and managing large datasets residing in distributed storage.
  • Analyzed large data sets by running Hive queries.
  • Imported and exported data from relational and NoSQL databases using Sqoop.
  • Managed large sets of hosts and coordinated services in a distributed environment using ZooKeeper.
  • Used the column-oriented HBase NoSQL database for flexibility, performance and horizontal scaling of big data.
  • Able to analyze different file formats such as Avro and Parquet.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Strong experience in Core Java, using concurrent collections and thread executors to implement high-performance, complex logic.
  • Very strong experience applying object-oriented design principles to achieve highly cohesive, loosely coupled architectures.
  • Excellent hands-on experience applying Core Java design patterns to the appropriate problems.
  • Good experience with REST web services.
  • Used Spring Core, Spring AOP and HibernateTemplate to gain the benefits of dependency injection and other features of the Spring Framework.
  • Extensive exposure to all aspects of the Software Development Life Cycle (SDLC): requirements definition, prototyping, coding, testing and delivery.
  • Experienced in development approaches including Waterfall, Test-Driven Development and Agile Scrum.
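
A minimal sketch of the kind of Kafka producer and consumer referenced above, written in Java. The broker address, topic name and group id are illustrative placeholders, not values from the original projects.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaExample {

    // Publishes one record to a hypothetical "events" topic.
    public static void produce() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Fire-and-forget send; a callback could be supplied for asynchronous error handling.
            producer.send(new ProducerRecord<>("events", "key-1", "sample payload"));
        }
    }

    // Polls the same topic once and prints whatever records arrive.
    public static void consume() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "sample-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}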

TECHNICAL SKILLS:

Programming Languages: Core Java (Collections, Multi-Threading, Concurrency API, Strings, Memory Management, Serialization, Thread Executors, Design Principles and Design Patterns), Scala, Servlets and JSP

Web Services/Servers: REST Services, WebLogic and Tomcat

Frameworks: Spring Core, Spring IoC, Spring AOP and HibernateTemplate

Architecture: Object Oriented Design Principles, Object Oriented Design Patterns, UML Notations, UMLet, IBM Exceed, HLD and LLD.

Big Data / Hadoop Ecosystem: Hadoop, HDFS, MapReduce, Pig, Hive, HBase, Sqoop, ZooKeeper, Spark, Oozie, Azkaban

Scripting Languages: Shell Scripting

Databases: SQL/MySQL, HBase and MongoDB; Hibernate (ORM)

Build Tools: Ant, Maven, Jenkins

Development Tools: Eclipse, Notepad++, NetBeans

Repository Tools: GitHub, SVN and ClearCase; Confluence (for project information)

Data Formats: XML, JSON, AVRO and Parquet

Methodologies: Agile Scrum (Grooming, Sprint Planning, Daily Stand-up, Review, Demo), Waterfall

Domains: Telecommunications, Banking and Finance, and Healthcare

PROFESSIONAL EXPERIENCE:

Confidential, Ann Arbor, MI

Hadoop Engineer

Environment: Java 8, REST services, Hortonworks Ambari, Hadoop, Spark Core, Spark SQL, Spark Streaming, Kafka, NiFi, Hive, HBase and Kerberos security.

Responsibilities:

  • Implemented Spark Core jobs to load data, apply transformations and actions, and store results into HDFS and Hive.
  • Implemented Spark jobs for structured streaming data using Datasets, applying filters, joins and SQL queries.
  • Implemented Spark jobs for Kafka streaming data, converting it into Datasets and applying the necessary Spark transformations and actions (see the streaming sketch after this list).
  • Ingested data from ActiveMQ into Apache NiFi for further processing.
  • Ingested store transactional data from an NFS file system, extracted the required columns from CSV files, and applied Avro and ORC formats for efficient memory usage and improved processing time.
  • Ingested conversational messages from a REST endpoint into NiFi and stored them in Hive.
  • Created Hive tables with partitions and buckets for conversational messages.
  • Created HBase tables, column families and cells to persist data while ingesting from the NFS system.
  • Applied Kerberos security to the Spark cluster through Hortonworks Ambari.
  • Involved in Spark ecosystem design and contributed to executor and core memory calculations.
  • Simulated JSON data using JMeter to load-test the cluster.
  • Implemented Kafka producers and consumers in Java.
  • Implemented synchronous and asynchronous Kafka producers with and without callback functions.
  • Involved in installing and configuring Zeppelin and other ecosystem components using Hortonworks Ambari.
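
A minimal sketch, in Java, of a Spark Structured Streaming job along the lines of the Kafka streaming work above: it reads a Kafka topic as a streaming Dataset, filters it, and writes Parquet to HDFS. The broker address, topic name and paths are illustrative assumptions.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaStreamToParquet {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("KafkaStructuredStreaming")
                .getOrCreate();

        // Read a Kafka topic as an unbounded Dataset of key/value strings.
        Dataset<Row> messages = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092")
                .option("subscribe", "transactions")
                .load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

        // Drop empty payloads and continuously write the stream to HDFS as Parquet.
        StreamingQuery query = messages
                .filter("value IS NOT NULL")
                .writeStream()
                .format("parquet")
                .option("path", "/data/streams/transactions")
                .option("checkpointLocation", "/data/checkpoints/transactions")
                .start();

        query.awaitTermination();
    }
}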

Confidential, CA

Hadoop Solution Architect/ Java Developer

Environment: Spark, MapReduce, Pig scripts, Azkaban, HBase, Java 8, REST services and shell scripting

Responsibilities:

  • Involved in writing MapReduce jobs to apply proprietary algorithms and process data (a minimal job sketch follows this list).
  • Involved in configuring the Hadoop cluster environment.
  • Created, configured and executed Azkaban workflows.
  • Created Azkaban projects and executed their workflows.
  • Used the Azkaban Ajax API to invoke workflows.
  • Created, configured and executed Jenkins jobs for Azkaban projects.
  • Created and ran Pig scripts to identify malicious activity.
  • Analyzed large data sets by running Hive queries and Pig scripts to identify card-theft rates.
  • Involved in creating Hive tables, loading data and analyzing it with Hive queries.
  • Involved in loading data from the Linux file system into HDFS.
  • Involved in running Hadoop jobs to process millions of records of network data.
  • Loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
  • Loaded processed data into HBase for further action.
  • Involved in preparing high-level and low-level design documents for the RESTful web services used in this application.
  • Worked on implementing Spark to ingest data in real time and apply transformations through the Java API.
  • Involved in creating Hive tables, loading data and writing Hive queries that run internally as MapReduce jobs.
  • Used Sqoop to import data from Oracle into HDFS on a regular basis.
  • Created HBase tables to store variable data formats coming from different portfolios.
  • Used the HBase Thrift API to implement real-time analysis on the HDFS system.
  • Developed dataset-join scripts using Hive join operations.
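
A minimal sketch of a MapReduce job in Java of the kind referenced above. It counts records per event type; the CSV layout, column position and class names are illustrative assumptions, and the proprietary algorithms themselves are not reproduced.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventCountJob {

    // Mapper: emits (eventType, 1) for each CSV line, taking the type from the third column.
    public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text eventType = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 2) {
                eventType.set(fields[2]);
                context.write(eventType, ONE);
            }
        }
    }

    // Reducer: sums the counts for each event type.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "event-count");
        job.setJarByClass(EventCountJob.class);
        job.setMapperClass(EventMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}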

Confidential

Java Developer/ Big Data Engineer

Environment: JDK 1.7, Solaris OS, UNIX shell scripts, XML, Eclipse, ClearCase, HDFS, MapReduce, Pig, Flume, MongoDB, Spark Core

Technical Responsibilities:

  • Involved in requirements gathering, understanding and grooming with the team.
  • Involved in preparing high-level and low-level design documents for the data connect application.
  • Designed and recommended object-oriented design principles for application use cases.
  • Applied Core Java design patterns wherever required by the user stories.
  • Implemented user stories in Core Java using multi-threading and concurrent collections.
  • Created immutable classes for user stories whose objects traverse the network (a minimal sketch follows this list).
  • Designed and implemented a REST API for data connect users.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and data already in HDFS.
  • Involved in implementing POCs using Spark shell commands to process large data sets and compare processing times.
  • Implemented the data ingestion process using Flume sources, sinks and interceptors.
  • Validated the performance of Hive queries on Spark against running them traditionally on Hadoop.
  • Involved in testing and coordinated with the business on user testing.
  • Wrote Spark shell commands to parse network data and structure it in tabular format to facilitate effective querying of the entities.
  • Involved in creating Hive tables, loading data and writing queries that run internally as MapReduce jobs.
  • Used Pig to perform transformations, event joins, filters and some pre-aggregations.
  • Involved in processing ingested raw data using MapReduce, Apache Pig and HBase.
  • Involved in scheduling the Azkaban workflow engine to run multiple workflows and Pig jobs.
  • Used Hive to analyze the partitioned and bucketed data to compute various metrics for reporting.
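
A minimal sketch of the kind of immutable class mentioned above; the class name and fields are hypothetical, chosen only to illustrate the pattern.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Immutable value object: final class, final fields, no setters, defensive copies.
public final class NetworkRecord {

    private final String recordId;
    private final long timestamp;
    private final List<String> tags;

    public NetworkRecord(String recordId, long timestamp, List<String> tags) {
        this.recordId = recordId;
        this.timestamp = timestamp;
        // Defensive copy wrapped as unmodifiable so later changes to the caller's list cannot leak in.
        this.tags = Collections.unmodifiableList(new ArrayList<>(tags));
    }

    public String getRecordId() {
        return recordId;
    }

    public long getTimestamp() {
        return timestamp;
    }

    public List<String> getTags() {
        return tags;
    }
}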
