
Hadoop Developer Resume

Bentonville, AR


  • Over 7 years of experience in Information Technology involving analysis, design, testing, implementation, and training. Excellent skills in state-of-the-art client-server computing, desktop applications, and website development.
  • Around 3 years of work experience in Big Data Analytics, with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig.
  • Good working experience with Hadoop architecture and its components, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm, primarily on the Cloudera Hadoop distribution.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, Hive, Sqoop, Pig, ZooKeeper, and Flume.
  • Good working experience with Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
  • Expertise in Hadoop Big Data technologies: Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, ZooKeeper, and Sqoop.
  • Good working experience with Hadoop cluster architecture and cluster monitoring. In-depth understanding of data structures and algorithms.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in implementing standards and processes for Hadoop based application design and implementation.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice versa.
  • Experience in managing Hadoop clusters using Cloudera Manager Tool.
  • In-depth knowledge of databases such as MySQL, with extensive experience writing SQL queries, stored procedures, triggers, cursors, functions, and packages.
  • Excellent knowledge of HTML, CSS, JavaScript, PHP.
  • Good working experience installing and maintaining Linux servers.
  • Experience in Data Sharing and backup through NFS.
  • Experience in monitoring system metrics and logs for problems; adding, removing, or updating user account information; resetting passwords, etc.
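As one concrete illustration of the Sqoop import/export experience listed above, the sketch below builds typical transfer commands. The JDBC URL, credentials, table names, and HDFS paths are hypothetical placeholders, not details from any actual project; the commands are printed rather than executed, since running them requires a live cluster and database.

```shell
#!/bin/sh
# Hedged sketch: typical Sqoop commands for moving a table between a
# relational database and HDFS. All names below are placeholders.

sqoop_import_cmd() {
  # Build the import command for table $2 from database $1 (4 map tasks);
  # printed rather than run, since execution needs a live cluster.
  printf 'sqoop import --connect %s --table %s --target-dir /user/etl/%s -m 4' "$1" "$2" "$2"
}

sqoop_export_cmd() {
  # Build the matching export command, pushing HDFS results back to the RDBMS.
  printf 'sqoop export --connect %s --table %s --export-dir /user/etl/%s' "$1" "$2" "$2"
}

sqoop_import_cmd "jdbc:mysql://dbhost:3306/sales" "orders"
echo
sqoop_export_cmd "jdbc:mysql://dbhost:3306/sales" "orders_summary"
echo
```

On a real cluster the printed strings would be run directly (after replacing the placeholder connection details).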


Programming Languages: C, C++, Java

Scripting Languages: HTML, CSS, UNIX shell scripting

Database: MySQL


Operating Systems: Windows, Linux, Mac OS X

Network administration: TCP/IP fundamentals, wireless networks, LAN and WAN

Hadoop: MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Oozie, ZooKeeper



Confidential, BENTONVILLE, AR


  • An analytical sandbox with tools such as SAS, SPSS, Tableau, and OPL/CPLEX, with access to a Big Data environment (Hadoop/Greenplum) to perform customer-level analytics. The project created an analytical dataset with variables drawn from Market Basket-Item and Customer Demographic data, processing hundreds of terabytes of data every day.
  • Roles & Responsibilities:
  • Migrated the Oozie workflows to the new version during the upgrade of the Hadoop cluster from CDH3u1 to CDH4.1.2
  • Developed Oozie workflows for loading full and partial tables into Hadoop using Sqoop and Hive
  • Implemented the new household data fix in Hadoop tables using the data from Experian and Teradata tables
  • Maintained the data in Teradata and Hadoop tables. Developed workflows to automate the row count differences between Hadoop and Teradata tables
  • Developed Shell scripts to report the disk usage by users on Hadoop clusters and automate the data clean up activity
  • Developed Oozie dashboard in Hue browser for business users to directly code the Oozie workflows as necessary
  • Performed data analytics using Hive HQL
  • Provided cluster coordination services through ZooKeeper
  • Developed MapReduce programs to perform analysis
  • Analyzed data with Hive and Pig
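A minimal sketch of the kind of disk-usage reporting script described above. On a real cluster the input would come from `hadoop fs -du -s /user/*`; the parsing function below only assumes that output format (`<bytes> <path>`), and everything else is hypothetical.

```shell
#!/bin/sh
# Minimal sketch: summarize per-user disk usage from `hadoop fs -du -s` output.
# Assumes input lines of the form "<bytes> /user/<name>".

report_usage() {
  # Sort by descending byte count, then print "user bytes" per line.
  sort -rn | awk '{ n = split($2, parts, "/"); print parts[n], $1 }'
}

# Hypothetical invocation on a cluster:
#   hadoop fs -du -s /user/* | report_usage
```

Users exceeding a quota could then be flagged for the clean-up job mentioned above.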

Environment: Oozie, Hive, Sqoop, Cloudera, SAS, SPSS, UNIX shell scripts, ZooKeeper, SQL, MapReduce, Pig


Confidential, BOSTON, MA


  • Confidential is an advertising network company that links advertisers with online advertisement inventory. Large volumes of data from bidders and other networks were streamed daily to the Google Storage platform, from which the data was loaded into the Hadoop cluster. This data was retained for 60 days for analysis; 600-700 GB of data was loaded into the cluster daily.
  • Roles & Responsibilities:
  • Imported data from bidder loggers into the Hadoop cluster via Google Storage
  • Pre-processed data and created fact tables using Hive
  • Exported the resulting data set to MySQL for further analysis
  • Generated reports using Pentaho report designer
  • Automated all jobs, from pulling data out of Google Storage to loading data into MySQL, using shell scripts
  • Developed several advanced MapReduce programs to process incoming data files.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Extracted the data from Teradata into HDFS using Sqoop.
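The fact-table step above can be sketched as a script that emits the HiveQL for one daily load. The table name, columns, and input path are illustrative assumptions, not taken from the project; on a cluster the output would be piped to the `hive` CLI.

```shell
#!/bin/sh
# Hedged sketch: create a date-partitioned Hive fact table and load one
# day's pre-processed data. All names and paths below are hypothetical.

fact_table_hql() {
  # Emit the HiveQL for one daily load; pass the partition date (YYYY-MM-DD).
  cat <<EOF
CREATE TABLE IF NOT EXISTS fact_impressions (
  bidder_id STRING,
  ad_id     STRING,
  bid_price DOUBLE
)
PARTITIONED BY (dt STRING)
STORED AS TEXTFILE;

LOAD DATA INPATH '/data/raw/$1'
INTO TABLE fact_impressions PARTITION (dt = '$1');
EOF
}

# Hypothetical cluster invocation:
#   fact_table_hql 2013-06-01 | hive
```

Partitioning by load date also makes the 60-day retention policy a matter of dropping old partitions.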

Environment: Hadoop, MapReduce, HDFS, Hive, Flume, Sqoop, Cloudera, Oozie, SQL, Unix Shell Scripts


Confidential, SAN FRANCISCO, CA


  • Provided system support for MySQL servers as part of 24x7 teams
  • Performed Linux administration functions like installation, configuration, upgrading and ongoing management using automated tools
  • Set up system monitoring via Nagios (NRPE) to report on performance and utilization of MySQL database systems
  • Completed analysis of client, server, and infrastructure performance
  • Designed databases and tuned queries for MySQL
  • Reviewed queries and optimized indexes
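The query-review and index-optimization workflow above can be sketched as follows. The table and column names are hypothetical; the function emits the SQL, which on a real server would be piped to the `mysql` client.

```shell
#!/bin/sh
# Hedged sketch of a MySQL query review: EXPLAIN a suspect query, then
# add an index on the filter column. All names below are hypothetical.

index_review_sql() {
  cat <<'EOF'
-- Inspect the access path of a suspect query.
EXPLAIN SELECT customer_id, total
FROM orders
WHERE order_date >= '2013-01-01';

-- If EXPLAIN shows a full table scan, index the filter column.
CREATE INDEX idx_orders_order_date ON orders (order_date);
EOF
}

# Hypothetical server invocation:
#   index_review_sql | mysql -u dba -p sales
```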

Environment: MySQL, Nagios




  • Performed data analytics using Hive HQL
  • Managed the Linux servers in a hosted environment
  • Provided monitoring, backup and other systems related tasks
  • Installed and maintained the Linux servers.
  • Installed Linux using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers.
  • Shared and backed up data through NFS.
  • Monitored system metrics and logs for problems.
  • Added, removed, and updated user account information; reset passwords, etc.
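The NFS data-sharing task above boils down to publishing an entry in `/etc/exports`. The sketch below builds such an entry; the share path and client subnet are placeholders, and actually applying it requires root on the server.

```shell
#!/bin/sh
# Hedged sketch: share a directory over NFS. Path and subnet are placeholders.

nfs_export_line() {
  # Build one /etc/exports entry: share path $1 read-write with clients $2.
  printf '%s %s(rw,sync,no_subtree_check)\n' "$1" "$2"
}

# Hypothetical application on the server (as root):
#   nfs_export_line /srv/backup 192.168.1.0/24 >> /etc/exports
#   exportfs -ra
```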

Environment: Hive, Linux, NFS
