
Hadoop Developer Resume


Newport Beach, CA

SUMMARY

  • Qualified IT professional with over 3 years of experience as a Hadoop Developer.
  • Extensive experience working with Hadoop MapReduce, Hive, HDFS, HBase, Sqoop, Oozie, Pig, Cloudera, ZooKeeper, Flume, and Cassandra.
  • Proficient with the Apache Spark ecosystem, including Spark Core and Spark Streaming, using Scala and Python.
  • Hands-on experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Expert at leveraging Hadoop ecosystem components: Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling, and HBase as a NoSQL data store.
  • Well versed in Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
  • Experience importing and exporting data with Sqoop between HDFS and relational database systems (a sketch follows this section).
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
  • Good understanding of HDFS design, daemons, federation, and high availability (HA).
  • Experienced in developing MapReduce programs with Apache Hadoop for working with Big Data.
  • Experience developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Good experience establishing standards and processes for Hadoop-based application design and implementation.
  • Knowledge of running Hive queries through Spark SQL integrated with the Spark environment.
  • Experience with the NoSQL databases MongoDB and Cassandra.
  • Experience managing Hadoop clusters with Cloudera Manager.
  • Very good experience across the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
  • Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
  • Hands-on experience with VPN, PuTTY, WinSCP, VNC Viewer, and similar tools.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Ability to adapt to evolving technology, with a strong sense of responsibility and accomplishment.
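
As a concrete illustration of the Sqoop import/export experience noted above, the commands below are a minimal sketch; the JDBC connection strings, credentials, table names, and HDFS paths are placeholder assumptions rather than details from any actual engagement.

    # Pull a relational table into HDFS (all identifiers are placeholders).
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user -P \
      --table CLAIMS \
      --target-dir /user/etl/claims \
      --num-mappers 4

    # Push aggregated results back out to a relational table.
    sqoop export \
      --connect jdbc:mysql://dbhost/reports \
      --username etl_user -P \
      --table CLAIM_SUMMARY \
      --export-dir /user/etl/claim_summary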

TECHNICAL SKILLS

Hadoop Technologies: HDFS, MapReduce, Hive, Impala, Pig, Sqoop, Flume, Oozie, Zookeeper, Ambari, Hue, Spark

Programming: Java, Scala, PySpark

Operating System: Windows, Linux

Languages: SQL, PL/SQL, Shell Script

Project Management / Tools: MS Project, MS Office, TFS, HP Quality Center Tool

Databases: MySQL, Oracle 11g/10g/9i, SQL Server

NoSQL Databases: HBase, Cassandra

File System: HDFS

Reporting Tools: Tableau

IDE Tools: Eclipse, NetBeans

PROFESSIONAL EXPERIENCE

Confidential, Newport Beach, CA

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Worked with HiveQL on large volumes of log data to perform trend analysis of user behavior across various online modules.
  • Worked with current, rapidly evolving Big Data technologies across the Hadoop ecosystem.
  • Developed Pig scripts for analyzing large data sets in HDFS.
  • Collected logs from physical machines and the OpenStack controller and ingested them into HDFS using Flume.
  • Designed and presented a plan for a proof of concept (POC) on Impala.
  • Involved in migrating HiveQL to Impala to minimize query response time.
  • Created Hive tables, loaded the structured data produced by MapReduce jobs into them, and wrote Hive queries to analyze the logs for issues and behavioral patterns.
  • Worked with SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning to improve Hive performance and storage efficiency.
  • Imported data from mainframe datasets to HDFS using Sqoop; also handled imports from various data sources (Oracle, DB2, Cassandra, and MongoDB) into Hadoop and performed transformations using Hive and MapReduce.
  • Implemented daily Oozie coordinator jobs that automated parallel loading of data into HDFS.
  • Responsible for performing extensive data validation using Hive.
  • Created Sqoop jobs and Pig and Hive scripts to ingest data from relational databases and compare it with historical data.
  • Loaded data from a Teradata database into HDFS using Sqoop.
  • Submitted and tracked MapReduce jobs using the JobTracker.
  • Created Oozie workflow and coordinator jobs to launch jobs on schedule as data became available.
  • Used visualization tools such as Power View for Excel and Tableau for visualizing and generating reports.
  • Exported data to Tableau and to Excel with Power View for presentation and refinement.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Implemented Hive generic UDFs to encapsulate business logic (a sketch follows this list).
  • Implemented test scripts to support test driven development and continuous integration.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
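
The snippet below sketches the kind of Hive UDF referenced in the list above, using Hive's simple UDF base class for brevity (the generic UDF API follows the same registration pattern); the package, class name, and normalization rule are illustrative assumptions, not project code.

    package com.example.hive.udf; // hypothetical package

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Normalizes a module name; Hive calls evaluate() once per row.
    public final class NormalizeModule extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // null in, null out, per Hive convention
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Once packaged into a JAR, such a function would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_module AS 'com.example.hive.udf.NormalizeModule'.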

Environment: Apache Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Java, Linux, Maven, Teradata, ZooKeeper, Tableau.

Confidential, San Diego, CA

Hadoop Developer

Responsibilities:

  • Developed data pipeline using Sqoop, Hive, Pig and Java MapReduce to ingest claim and policy histories into HDFS for analysis.
  • Implemented the workflows using Apache Oozie framework to automate tasks.
  • Developed MapReduce framework jobs in Java for data processing, and installed and configured Hadoop and HDFS.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (a sketch follows this list).
  • Created Hive external tables, loaded data into them, and queried the data using HiveQL.
  • Configured the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Responsible for architecting Hadoop clusters with CDH3.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Worked on NoSQL databases including HBase and Elasticsearch.
  • Performed cluster coordination through ZooKeeper.
  • Installed and configured Hive and wrote Hive UDFs.
  • Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Analyzed the Hadoop cluster and various Big Data analytic tools, including Pig, the HBase NoSQL database, and Sqoop.
  • Developed shell scripts to pull data from third-party systems into HDFS.
  • Helped set up the QA environment and updated configurations for running Pig scripts.
  • Loaded log data into HDFS using Flume and worked extensively on creating MapReduce jobs to power data for search and aggregation.
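
A minimal sketch of the daemon health-check scripting described above, assuming a cron-driven setup on a classic MapReduce v1 cluster; the daemon list and alert address are illustrative assumptions.

    #!/bin/bash
    # Alert if an expected Hadoop daemon is missing from jps output.
    DAEMONS="NameNode DataNode JobTracker TaskTracker"
    for daemon in $DAEMONS; do
        if ! jps | grep -qw "$daemon"; then
            echo "$(date): $daemon is not running on $(hostname)" \
                | mail -s "Hadoop daemon alert: $daemon" ops@example.com
        fi
    done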

Environment: Hadoop, MapReduce, HDFS, Flume, Cassandra, Sqoop, Pig, HBase, Hive, ZooKeeper, Cloudera, Oozie, Elasticsearch, NoSQL, UNIX/Linux.

Confidential

Hadoop Developer

Responsibilities:

  • Obtained requirement specifications from SMEs and business analysts in BR and SR meetings for the corporate workplace project, and interacted with business users to build sample report layouts.
  • Wrote HLDs along with RTMs tracing back to the corresponding BRs and SRs, and reviewed them with the business.
  • Installed and configured Apache Hadoop and the Hive and Pig ecosystems.
  • Installed and configured Cloudera Hadoop CDH4 via Cloudera Manager in pseudo-distributed mode and in cluster mode as a proof of concept.
  • Created MapReduce jobs using Hive and Pig queries.
  • Extensively used Pig for data cleansing (a sketch follows this list).
  • Developed Pig UDFs to pre-process the data for analysis.
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig and HiveQL.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Configured Sqoop to map SQL types to appropriate Java classes.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
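
A minimal Pig Latin sketch of the data-cleansing step referenced in the list above; the input schema, paths, and filter rules are illustrative assumptions.

    -- Load raw tab-delimited records (the schema is an assumption).
    raw = LOAD '/data/incoming/records' USING PigStorage('\t')
          AS (id:chararray, amount:chararray, ts:chararray);
    -- Drop malformed rows: null ids and non-numeric amounts.
    clean = FILTER raw BY id IS NOT NULL AND amount MATCHES '[0-9]+(\\.[0-9]+)?';
    -- Cast cleansed amounts to doubles for downstream analysis.
    typed = FOREACH clean GENERATE id, (double)amount AS amount, ts;
    STORE typed INTO '/data/cleansed/records' USING PigStorage('\t');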

Environment: Hadoop, Oracle, Cloudera Hadoop CDH4, HiveQL, Pig Latin, MapReduce, HDFS, HBase, ZooKeeper, Oozie, PL/SQL, SQL*Plus, Windows, UNIX, Shell Scripting.
