Hadoop Admin Resume

San Ramon, CA


  • Professional with 7+ years of IT experience in industries including Banking, Healthcare and Education.
  • Cloudera Certified Hadoop administrator with 5+ years of experience in activities such as installation and configuration of clusters using Apache Hadoop, Cloudera (CDH), Hortonworks (HDP) and AWS Elastic MapReduce (EMR).
  • Worked in complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) using Agile Methodologies.
  • Deployed Hadoop clusters in on-premises data centers and private cloud environments.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera (CDH3, CDH4) and YARN distributions (CDH 5.x).
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, YARN, MapReduce, Spark, HBase, Oozie, Hive, Sqoop, Flume, Storm, Kafka and Sentry.
  • Set up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
  • Experienced in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.
  • Experienced in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup and realm/domain creation.
  • Experienced in creating service principals and user principals and establishing cross-realm trust.
  • Strong knowledge of enforcing RBAC on Hive data and metadata on a Hadoop cluster using Sentry.
  • Experienced in setting up the High-Availability Hadoop Clusters and BDR clusters.
  • Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Good experience in configuring and managing backup and disaster recovery for Hadoop data.
  • Hands-on experience in managing, reviewing and analyzing log files for Hadoop and ecosystem services to perform root cause analysis.
  • Analyzed system failures, identified root causes, and recommended courses of action. Documented system processes and procedures for future reference.
  • Experienced in HDFS data storage and support for running MapReduce and Spark jobs.
  • Experienced in importing and exporting data between HDFS and RDBMS using Sqoop, and in scheduling Hadoop/Hive/Sqoop/HBase jobs using Oozie.
  • Experienced in rack awareness configuration for quick availability and processing of data.
  • Involved in balancing server loads and tuning servers for optimal cluster performance.
  • Hands-on experience in Linux admin activities such as user management, scheduling cron jobs, setting up Linux environments, passwordless SSH, creating file systems, disabling firewalls, configuring swappiness and SELinux, and installing Java.
  • Experienced in configuration management tools like Puppet, Ansible and Chef.
  • Good understanding of deploying Hadoop clusters using automated Puppet scripts
  • Experienced in hardware recommendations, performance tuning and benchmarking
  • Experienced in IP Management (IP Addressing, Sub-netting, Ethernet Bonding, Static IP)
  • Flexible with Unix/Linux and Windows environments, working with operating systems such as CentOS 5/6, Ubuntu 10/11 and Red Hat 6/7.
  • Experienced in Linux storage management, including configuring RAID levels and logical volumes.



Scripting Language: Shell, Bash

Hadoop Ecosystem: Hadoop 2.7.x, Spark 2.1.0, MapReduce, Hive 2.1.1, Sqoop 1.99.7, Flume 1.7.0, Kafka 2.1.0, Oozie 3.1.3, Yarn 0.21.3, Pig 0.14, Zookeeper 3.4.6

Database: MySQL 5.x, Oracle 11g, HBase 1.3.0, Cassandra 3.10, PL/SQL 11g, MS SQL Server.

Hadoop Distributions: Cloudera, Hortonworks, AWS EMR, Apache Hadoop

IDE Application: Eclipse 4.6, NetBeans

Collaboration: Git 2.12.0, ScalaTest 3.0.1

Operating Systems: Windows 10, Mac OS, Ubuntu, CentOS, Red Hat

Data Analysis & Viz: Tableau

Cloud Environment: AWS EC2, S3, IAM, VPC, Route 53, CloudWatch, CloudFormation

Web Services: RESTful, SOAP


Confidential, San Ramon, CA

Hadoop Admin


  • Provided infrastructure support for multiple clusters, including Production (Prod), Pre-Production (Pre-prod), Quality (QA) and Disaster Recovery (DR).
  • Installed and configured Hadoop clusters across various environments through Cloudera Manager.
  • Installed and configured MySQL and enabled high availability.
  • Installed and configured Sentry server to enable schema level Security.
  • Installed and configured Hadoop services HDFS, Yarn, MapReduce, Spark, HBase, Oozie, Hive, Sqoop, Flume, Kafka and Sentry.
  • Configured Fair Schedulers in the cluster, created resource pools, and set up dynamic resource allocation while regularly monitoring resource-intensive jobs.
  • Involved in implementing high availability and automatic failover infrastructure to eliminate the NameNode single point of failure using ZooKeeper services.
  • Day-to-day responsibilities included resolving Hadoop developer issues, providing prompt solutions to reduce impact, documenting the fixes, and preventing future issues.
  • Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them as per the recommendations.
  • Experienced in upgrades, patching, and rolling upgrade activities without any data loss and with proper backup plans.
  • Integrated external components like TIBCO and Tableau with Hadoop using HiveServer2.
  • Implemented the HDFS snapshot feature and migrated data across clusters using DistCp.
  • Performed both major and minor upgrades to the existing Cloudera Hadoop cluster.
  • Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
  • Built a new sandbox cluster for testing purposes and moved data from the secure cluster to the insecure sandbox cluster using DistCp (distributed copy).
  • Installed Kafka cluster with separate nodes for brokers.
  • Performed Kafka operations on a regular basis.
  • Tuned and optimized Hadoop clusters to achieve high performance.
  • Implemented schedulers on the Resource Manager to share the resources of the cluster.
  • Monitored Hadoop clusters using Cloudera Manager and provided 24x7 on-call support.
  • Expertise in designing and implementing disaster recovery plans for Hadoop clusters.
  • Extensive hands-on experience with Hadoop file system commands for file handling operations.
  • Provided user support and application support on Hadoop infrastructure.
  • Prepared System Design document with all functional implementations.
  • Worked with Sqoop import and export functionalities to handle large data set transfers between traditional databases and HDFS.

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Cloudera Manager, Storm, AWS S3, EC2, IAM, ZooKeeper, Spark

Confidential, Valley Forge, PA

Hadoop Admin


  • Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
  • Conducted RCA to find data issues and resolve production problems.
  • Worked on Hive optimization techniques to improve the performance of long-running jobs.
  • Experienced in managing and reviewing Hadoop log files.
  • Worked with Sqoop to import and export data between different RDBMS and HDFS and Hive.
  • Worked on setting up HA for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Worked on HBase high availability and tested it manually using failover tests.
  • Created queues and allocated cluster resources to prioritize jobs.
  • Upgraded the cluster to newer versions: CDH 5.8.2 and CM 5.9.1.
  • Provisioned, installed, configured, monitored, and maintained HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Falcon, SmartSense, Storm, Kafka and Spark.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Created user accounts and gave users access to the Hadoop cluster.

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Spark, Cloudera Manager.

Confidential, Westbury, NY

Hadoop Admin


  • Involved in deploying a Hadoop cluster using Hortonworks Ambari HDP 2.2, integrated with SiteScope for monitoring and alerting.
  • Launched and set up a Hadoop cluster on physical servers, including configuring the different Hadoop components.
  • Created a local YUM repository for installing and updating packages.
  • Developed data pipelines that ingest data from multiple data sources and process it.
  • Expertise in using Sqoop to connect to Oracle, MySQL, SQL Server and Teradata and move the pivoted data to Hive tables or HBase tables.
  • Implemented Kerberos authentication: KDC server setup, creating the realm/domain, managing principals, generating a keytab file for each service, and managing keytabs using keytab tools.
  • Configured Knox for perimeter security and Ranger for granular access in the cluster.
  • Configured and installed several Hadoop clusters in both on-premises and AWS cloud for POCs.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries.
  • Extensively used Sqoop to move the data from relational databases to HDFS.
  • Used Flume to move the data from web logs onto HDFS.
  • Used Pig to apply transformations, validations, cleaning and deduplication of data from sources.
  • Actively monitored the Hadoop cluster running the Hortonworks HDP 2.4 distribution.
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and SSH keyless login.
  • Performed a minor upgrade from HDP 2.2.2 to HDP 2.2.4.
  • Upgraded the Hadoop cluster from HDP 2.2 to HDP 2.4 and HDP 2.4 to HDP 2.5
  • Integrated BI tool Tableau to run visualizations over the data.
  • Resolved hardware-related issues and assessed tickets on a daily basis.
  • Automated administration tasks using scripts and scheduled jobs using cron.
  • Provided 24x7 on-call support as part of a scheduled rotation with other team members.




Roles & Responsibilities:

  • Maintained SQL Server DB; ensured accuracy of information and automated numerous functions.
  • Fine-tuned database objects and the server to ensure efficient data retrieval.
  • Monitored and optimized system performance using SQL Profiler and Database Engine Tuning Advisor.
  • Designed and implemented incremental and full backup policies and procedures.
  • Created and implemented database design solutions in collaboration with programming team.
  • Developed user defined functions and triggers to implement the requirements of the business.
  • Performed database logical and physical design, maintenance, tuning, archiving, backups, replication, recovery, software upgrades, capacity planning and optimization for SQL Server databases.
  • Ran database consistency checks using DBCC utilities; performed performance baselining and performance tuning.
  • Production level support for onsite and offshore Clients.
  • SQL Server Performance Dashboard Reports for Monitoring.
  • Database Security management


SQL Developer

Roles & Responsibilities:

  • Developed stored procedures, functions and database triggers. Maintained referential integrity and implemented complex business logic.
  • Involved in installation and configuration of SQL Server 2005 with the latest service packs.
  • Created and executed SSIS packages to populate data from various data sources.
  • Created SSIS packages using SSIS Designer to export heterogeneous data from an OLE DB source (Oracle) and Excel spreadsheets to SQL Server 2005.
  • Migrated DTS packages to SSIS packages and modified those packages.
  • Designed ETL packages dealing with different data sources and loaded the data into target data sources by performing different kinds of transformations using SSIS.
  • Experience in creating multiple SSRS reports in drill mode using tables, crosstabs, and charts. Designed, deployed and maintained various SSRS reports in SQL Server 2005.
  • Designed and implemented parameterized and cascading parameterized reports using SSRS.
  • Managed the security of servers, creating the new logins and users, changing roles of users.
  • Involved in developing logical and physical model of database using Erwin.