Hadoop Administrator Resume

Malvern, PA


  • 8 years of experience in the design and implementation of robust technology systems, with specialized expertise in Hadoop and Linux administration.
  • 4+ years of experience in Hadoop administration and Big Data technologies, and 4 years of experience in Linux administration.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Hortonworks and Cloudera.
  • Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Design Big Data solutions for traditional enterprise businesses.
  • Excellent command of creating backup, recovery and disaster recovery procedures and implementing backup and recovery strategies for offline and online backups.
  • Involved in benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads.
  • Making Hadoop cluster ready for development team working on POCs.
  • Experience in minor and major upgrades of Hadoop and the Hadoop ecosystem.
  • Experience monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network.
  • Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
  • Experience on Commissioning, Decommissioning, Balancing and Managing Nodes and tuning server for optimal performance of the cluster.
  • Experience in integrating shell scripts with Jenkins.
  • As an admin, involved in cluster maintenance, troubleshooting and monitoring, and followed proper backup and recovery strategies.
  • Experience with configuration management tools (Puppet, Ansible, Chef).
  • Good experience in setting up Linux environments: passwordless SSH, creating file systems, disabling firewalls, swappiness and SELinux settings, and installing Java.
  • Good Experience in Planning, Installing and Configuring Hadoop Cluster in Cloudera and Hortonworks Distributions.
  • Hands-on experience in installing, configuring and managing Hue and HCatalog.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems/mainframes.
  • Production experience in large environments using Configuration Management Tools like Puppet and Ansible.
  • Experience in importing and exporting the logs using Flume.
  • Expertise in designing Python scripts to interact with middleware/back end services.
  • Optimizing performance of HBase/Hive/Pig jobs.
  • Hands-on experience in configuring and managing ZooKeeper and ZKFC for NameNode failure scenarios.
  • Hands-on experience in Linux admin activities on RHEL and CentOS.
  • Experience in deploying Hadoop 2.0 (YARN).
  • Familiar with writing Oozie workflows and Job Controllers for job automation.
  • Hands-on experience in provisioning and managing multi-tenant Hadoop clusters on public cloud (Amazon Web Services EC2) and on private cloud infrastructure (OpenStack).
  • Hands-on experience in configuration and management of security for Hadoop cluster using Kerberos.
  • Experience in setting up and managing the High-Availability to avoid a single point of failure on large Hadoop Clusters.
  • Knowledge of Spark and Scala, mainly from framework exploration for the transition from Hadoop/MapReduce to Spark.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Good knowledge of AWS services such as EMR, S3 and EC2, which provide fast and efficient processing for Hadoop workloads.
  • Experience in writing shell scripts for purposes such as file validation, automation and job scheduling with crontab.


Big Data Technologies: Hortonworks, HDFS, Hive, MapReduce, Cassandra, Pig, HCatalog, Phoenix, Falcon, Apache NiFi, Sqoop, ZooKeeper, Mahout, Flume, Oozie, Avro, HBase, Storm, Cloudera.

Scripting Languages: Shell Scripting, Puppet, Chef, Python, Bash, CSH, Ruby, PHP

Databases: Oracle 11g, MySQL, MS SQL Server, HBase, Cassandra, MongoDB


Monitoring Tools: Cloudera Manager, Solr, Ambari, Nagios, Ganglia

Application Servers: Apache Tomcat, WebLogic Server, WebSphere

Security: Kerberos, Knox.

Reporting Tools: Cognos, Hyperion Analyzer, OBIEE & BI+

Analytic Tools: Elasticsearch-Logstash-Kibana


Hadoop Administrator

Confidential, Malvern, PA


  • Administered the Hortonworks (HDP 2.5.3) distribution across 5 clusters ranging from POC to PROD.
  • Performed cluster capacity planning based on data usage.
  • Designed and configured the bastion/edge node setup.
  • Designed and configured HA for Hive and HBase services.
  • Identified the root cause of issues from ZooKeeper and Spark logs (Spark and ZooKeeper logs are .out only), collected all log files and integrated them with CloudWatch (AWS EC2).
  • Changed the ZooKeeper and JournalNode edit directories (ZooKeeper and JournalNodes have multiple directories).
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
  • Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, delivering quick fixes to reduce impact, and documenting issues to prevent recurrence.
  • Added, installed and removed components through Ambari.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Used Ansible to push updates across the nodes.
  • Automated workflows using shell scripts that pull data from various databases into Hadoop.
  • Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Hands-on experience with cluster upgrades and patch upgrades without any data loss, backed by proper backup plans.
  • Changed configurations based on user requirements to improve job performance.
  • Experienced in configuring Ambari alerts (critical and warning) for various components and managing the alerts.
  • Provided security and authentication with Ranger, where Ranger Admin provides administration and Usersync adds new users to the cluster.
  • Developed Shell and Python scripts to automate and provide Control flow to Pig scripts.
  • Good troubleshooting skills on Hue, which provides a GUI for developers and business users for day-to-day activities.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources, making it suitable for ingestion into Hive schemas for analysis.
  • Implemented complex MapReduce programs to perform map-side joins using the distributed cache.
  • Implemented NameNode HA in all environments to provide high availability of clusters.
  • Used snapshots and mirroring to maintain backups of cluster data, including remote copies.
  • Experienced in managing and reviewing log files (identifying the max backup index and max backup size settings in the log4j properties of all Hadoop services).
  • Helped users with production deployments throughout the process.
  • Experienced in production support, resolving user incidents ranging from sev1 to sev5.
  • Managed and reviewed log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Involved in analyzing system failures, identifying root causes and recommending courses of action.
  • Documented system processes and procedures for future reference.
  • Worked with systems engineering team to plan and deploy new environments and expand existing clusters.
  • Monitored multiple cluster environments using Ambari Alerts and Metrics.

Environment: HDFS, YARN, MapReduce, Pig, ZooKeeper, Spark, Kafka, Shell Scripting, Python Scripting, Ansible, Hortonworks and Ambari.

Hadoop Administrator

Confidential, Oak Brook, IL


  • Worked as admin on the Cloudera (CDH 5.5.1) distribution for 4 clusters ranging from POC to PROD.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
  • Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, delivering quick fixes to reduce impact, and documenting issues to prevent recurrence.
  • Added, installed and removed components through Cloudera Manager.
  • Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades.
  • Involved in developing custom scripts using Shell (bash, ksh) to automate jobs.
  • Experience with AWS EMR and Cloudera Manager, as well as Hadoop deployed directly on EC2 (non-EMR).
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Involved in analyzing system failures, identifying root causes and recommending courses of action.
  • Worked with Cloudera support, logged issues in the Cloudera portal and fixed them per the recommendations.
  • Worked on rack-aware configuration and gained familiarity with how AWS operates.
  • Working experience in supporting and deploying in an AWS environment
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Loaded data from the local file system into HDFS using Flume with a spooling directory.
  • Exported data from HDFS to relational databases with Sqoop. Parsed, cleansed and mined meaningful data in HDFS using MapReduce for further analysis.
  • Fine-tuned Hive jobs for optimized performance.
  • Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
  • Scheduled several time-based Oozie workflows by developing Python scripts.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase and Sqoop.
  • Created and truncated HBase tables in Hue and took backups of submitter ID(s).
  • Configured and managed user permissions in Hue.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Commissioned and decommissioned nodes on a CDH5 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from the Linux file system to HDFS.
  • Created and managed cron jobs.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Experience in configuring Storm to load data from MySQL into HBase using JMS.
  • Responsible for managing data coming from different sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Experience in managing and reviewing Hadoop log files.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.

Environment: HDFS, MapReduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Oozie, Sqoop, CDH5, Apache Hadoop 2.6, Spark, Solr, Storm, Cloudera Manager, Red Hat, MySQL and Oracle.

Linux/Hadoop Administrator



  • Managed UNIX infrastructure, including day-to-day maintenance of servers and troubleshooting.
  • Provisioned Red Hat Enterprise Linux servers using PXE boot according to requirements.
  • Performed Red Hat Linux Kickstart installations on Red Hat 4.x/5.x, Red Hat Linux kernel tuning and memory upgrades.
  • Worked with Logical Volume Manager, creating volume groups and logical volumes.
  • Checked and cleaned file systems whenever they filled up. Used Logwatch 7.3, which reports server information on a schedule.
  • Hands-on experience in installation, configuration, maintenance, monitoring, performance tuning and troubleshooting of Hadoop clusters in different environments such as development, test and production clusters.
  • Configured JobTracker to assign MapReduce tasks to TaskTrackers across the cluster nodes.
  • Implemented Kerberos security in all environments.
  • Defined file system layout and data set permissions.
  • Implemented the Capacity Scheduler to share cluster resources among the MapReduce jobs submitted by users.
  • Worked on importing data from Oracle databases into the Hadoop cluster.
  • Managed and reviewed data backups and log files and worked on deploying Java applications on cluster.
  • Commissioning and Decommissioning Nodes from time to time.

Environment: Red Hat Enterprise Linux 3.x/4.x/5.x, Sun Solaris 10 on Dell PowerEdge servers, Hive, HDFS, MapReduce, Sqoop, HBase

Linux System Engineer



  • Installation and configuration of Linux for new build environment.
  • Created virtual servers on Citrix XenServer-based hosts and installed operating systems on guest servers.
  • Configuring NFS, DNS.
  • Updating YUM Repository and Red Hat Package Manager (RPM).
  • Created RPM packages using rpmbuild, verified the newly built packages and distributed them.
  • Configured distributed file systems, administered NFS servers and clients, and edited automount maps per system and user requirements.
  • Installed, configured and maintained FTP servers, NFS, RPM and Samba.
  • Configured Samba to provide access to Linux shared resources from Windows.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Activities included user administration, startup and shutdown scripts, crontabs, file system maintenance, and backup scripting and automation using shell scripting (Bash, KSH) and Perl for Red Hat Linux systems.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Experience with Linux internals, virtual machines, and open source tools/platforms.
  • Improved system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Ensured data recoverability by implementing system and application level backups.
  • Performed various configurations, including networking and iptables, hostname resolution and SSH keyless login.
  • Managed disk file systems, server performance, user creation, file access permissions and RAID configurations.
  • Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
  • Automated administration tasks through scripting and job scheduling with cron.

Environment: Linux, Citrix XenServer 5.0, Veritas Volume Manager and NetBackup.
