
Business Analytics Platform Hadoop Developer Resume

SUMMARY:

  • 4 years of overall IT experience, all of it comprehensive, hands-on work with Hadoop ecosystem tools and related technologies.
  • Excellent understanding of Hadoop architecture, its components, and the MapReduce programming paradigm.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Working experience designing and implementing complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, and Oozie, with good knowledge of HBase and ZooKeeper.
  • Worked on a proof of concept to migrate MapReduce and Pig code to Spark.
  • Experience in working with different data sources such as CSV, XML, and JSON files and Oracle/MySQL to load data into Hive tables.
  • Experience in implementing User Defined Functions for Pig and Hive (see the sketch after this list).
  • Involved in writing UNIX shell scripts for application deployments to the production region.
  • Involved in maintaining Hadoop clusters in development and test environments.
  • Good knowledge of mining data in the Hadoop file system for business insights using Hive and Pig.
  • Expertise in relational database design and in data extraction and transformation from source systems using MySQL and Oracle.
  • Good working knowledge of the Eclipse IDE for developing and debugging Java applications.
  • Involved in code performance improvement and query tuning activities.
  • Working experience with the ETL tool Pentaho Data Integration.
  • Responsible for designing and developing reports using Pentaho User Console.
  • Resolved production application and database issues.
  • Interacted with clients to clarify issues and requirements.
  • Leadership skills include the ability to lead and motivate co-workers from all backgrounds, creative problem solving, and in-depth familiarity with new technology trends.
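A minimal sketch of the kind of Pig UDF referenced above, assuming a simple string-normalization use case; the package, class name, and field semantics are hypothetical:

```java
// Hypothetical Pig UDF: normalizes a raw string field, returning null for
// malformed input so a downstream FILTER can discard bad records.
package com.example.pig;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```

In a Pig script, such a UDF is registered with REGISTER and aliased with DEFINE before being applied in a FOREACH ... GENERATE statement.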

TECHNOLOGIES:

Hadoop/Big Data Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Spark, Kafka

NoSQL Databases: Apache HBase, Cassandra

Databases: Oracle, MySQL

IDEs: Eclipse

Programming Languages: Java, C++, C

Web Technologies: HTML, JavaScript, XML

Scripting Languages: Shell Scripting, JavaScript

Logging API: Log4j

Reporting Tools: Pentaho, Talend

Tools: JUnit, PuTTY, WinSCP, FileZilla

Operating Systems: Windows, UNIX, Linux (Ubuntu)

Version Control Systems: SVN

Build Tools: Maven, Jenkins

PROFESSIONAL EXPERIENCE:

Business Analytics Platform Hadoop Developer

Confidential

Project Skills: Hadoop, MapReduce, Hive, Pig, Oozie, Sqoop, Hue, Spark, Cloudera, Java, JSON, XML, MySQL, UNIX Shell Scripting, Log4j, Maven, Jenkins, SVN

Responsibilities:

  • Made extensive use of the Cloudera CDH 4 and CDH 5 distributions.
  • Designed and developed several use cases.
  • Created Pig scripts to load nested JSON data to HDFS.
  • Created several User Defined Functions in Pig and Hive.
  • Created Oozie workflows to streamline the data flow.
  • Created shell scripts to load raw data to HDFS.
  • Created Pig scripts and MapReduce programs to filter the log files and aggregate the data.
  • Loaded log file data into Hive.
  • Used Sqoop to move data between HDFS and MySQL.
  • Developed Spark code to migrate existing MapReduce code and Pig scripts as part of a proof of concept (see the sketch after this list).
  • Unit tested the application.
  • Made major enhancements to existing MapReduce programs and Pig scripts.
  • Used SVN (Subversion) for version control and maintained separate versions per release.
  • Produced technical documentation of the design and development details of each use case.
  • Administered and maintained the Hadoop clusters in development and test environments.
  • Provided support and maintenance.
  • Provided business insights on purchase patterns during promotional periods by mining data in Hive.
  • Worked on a proof of concept to extract petrol prices along with latitude and longitude values received from mobile GPS.
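A minimal sketch of the MapReduce-to-Spark migration mentioned in the list above, using the Spark Java API; the HDFS paths, the "ERROR" filter, and the assumption that the error code is the third whitespace-separated field are all hypothetical:

```java
// Hypothetical Spark rewrite of a log-filtering/aggregation MapReduce job:
// keep ERROR lines and count occurrences per error code.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class LogAggregation {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LogAggregation");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> logs = sc.textFile("hdfs:///data/raw/logs/"); // assumed path
            JavaPairRDD<String, Integer> counts = logs
                    .filter(line -> line.contains("ERROR"))
                    .map(line -> line.split("\\s+"))
                    .filter(fields -> fields.length > 2)             // skip short lines
                    .mapToPair(fields -> new Tuple2<>(fields[2], 1)) // assumed code field
                    .reduceByKey(Integer::sum);
            counts.saveAsTextFile("hdfs:///data/aggregated/error-counts");
        }
    }
}
```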

Hadoop Developer

Confidential

Project Skills: Hadoop, MapReduce, HDFS, Hive, Sqoop, MapR, Java (jdk1.6), XML, Oracle 11g/ 10g, UNIX Shell Scripting, Pentaho Data Integration, Pentaho User Console, Pentaho Schema Workbench

Responsibilities:

  • Contributed to the design of the application and database architecture.
  • Authored the design document and conducted various POCs to validate the design.
  • Developed a MapReduce program to validate raw data against the required columns and data formats before loading it into the database for analysis (see the sketch after this list).
  • Loaded data from the Linux file system to HDFS using Pentaho Data Integration.
  • Created data flows in Pentaho Data Integration to aggregate data and load it into Hive tables.
  • Moved data from Hive to Oracle tables using Sqoop for reporting.
  • Created bar graphs, heat maps, and geo maps from aggregated data using Pentaho User Console.
  • Conducted sessions for clients and the testing team on using the application.
  • Completed unit testing and integration testing.
  • Documented the user manual and troubleshooting guides.
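A minimal sketch of the raw-data validation job mentioned in the list above, written as a map-only MapReduce pass; the delimiter, expected column count, and amount-field index are assumptions:

```java
// Hypothetical map-only validation pass: emit only rows with the expected
// arity and a numeric amount column; tally rejects in counters.
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RowValidationMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final int EXPECTED_COLUMNS = 8; // assumed schema width

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",", -1);
        if (fields.length != EXPECTED_COLUMNS) {
            context.getCounter("validation", "BAD_COLUMN_COUNT").increment(1);
            return;
        }
        if (!fields[3].matches("-?\\d+(\\.\\d+)?")) { // assumed amount column
            context.getCounter("validation", "BAD_AMOUNT").increment(1);
            return;
        }
        context.write(value, NullWritable.get()); // pass valid rows through
    }
}
```

Rejected rows are tallied in Hadoop counters so a downstream check can decide whether the load proceeds.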

Hadoop Developer

Confidential

Project Skills: Hadoop, Hive, Sqoop, Oracle, Java, Shell script, Pentaho Data Integration

Responsibilities:

  • Worked on the MapR distribution.
  • Migrated the required data from Oracle into HDFS using Sqoop and imported flat files of various formats into HDFS.
  • Used shell scripts to pull data from different structured folders into a single staging folder for processing by Pig.
  • Created workflows in Pentaho Data Integration.
  • Loaded the data into Hive and used HiveQL to derive the required metrics based on the client's requirements (see the sketch after this list).
  • Moved the data from aggregated Hive tables to Oracle tables.
  • Produced an Excel report from the resulting Oracle table.
  • Created the user manual and troubleshooting guides.
  • Involved in User Acceptance Testing.
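A minimal sketch of deriving a client metric over a Hive table from Java via the HiveServer2 JDBC driver; the host, credentials, table, and column names are assumptions:

```java
// Hypothetical metric query against Hive over JDBC (HiveServer2).
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveMetricQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hive-host:10000/default", "hadoop", ""); // assumed host
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT region, SUM(amount) AS total " // assumed table/columns
                   + "FROM sales GROUP BY region")) {
            while (rs.next()) {
                System.out.println(rs.getString("region") + "\t" + rs.getLong("total"));
            }
        }
    }
}
```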

Systems Engineer, Intern

Confidential

Project Skills: Cassandra, Java, JSF, HTML

Responsibilities:

  • Loaded data into Cassandra and applied updates.
  • Used Java APIs to communicate with Cassandra.
  • Created a simple web application using Java and JSF.
  • Serialized and deserialized objects to load data as byte arrays into Cassandra (see the sketch after this list).
  • Documented the design and development details.
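A minimal sketch of the serialize-and-store flow described in the list above. It uses the DataStax Java driver for illustration; the actual client API used at the time may have differed, and the contact point, keyspace, table, and key are assumptions:

```java
// Hypothetical serialize-and-store flow: a Serializable object is written
// to a Cassandra blob column as a byte array (DataStax Java driver shown).
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class CassandraBlobWriter {

    // Java serialization to a ByteBuffer, the driver's type for blob columns.
    static ByteBuffer toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return ByteBuffer.wrap(bos.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1").build();       // assumed node
             Session session = cluster.connect("demo_ks")) { // assumed keyspace
            PreparedStatement insert = session.prepare(
                    "INSERT INTO objects (id, payload) VALUES (?, ?)");
            session.execute(insert.bind("order-42", toBytes(new java.util.Date())));
        }
    }
}
```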
