
Hadoop Administrator Resume



  • Around 6 years of experience in IT, including about 4 years of hands-on experience as a Hadoop Administrator.
  • Hands-on experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, HBase) using Cloudera Manager and Hortonworks Ambari.
  • Hands-on experience in Big Data technologies and frameworks like Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, NoSQL, Flume, Oozie.
  • Proficiency with the application servers like WebSphere, WebLogic, JBOSS and Tomcat.
  • Hadoop Cluster capacity planning, performance tuning, cluster Monitoring, Troubleshooting.
  • Experience on Commissioning, Decommissioning, Balancing and Managing Nodes and tuning server for optimal performance of the cluster.
  • As an admin, involved in cluster maintenance, troubleshooting and monitoring, and followed proper backup and recovery strategies.
  • Used Namespace support to map Phoenix schemas to HBase namespaces.
  • Designed and implemented database software migration procedures, and guidelines.
  • Performed administrative tasks on Hadoop clusters using Hortonworks.
  • Expertise with the tools in Hadoop Ecosystem including Pig, Hive, HDFS, Map Reduce, Sqoop, Apache NiFi, Spark, Kafka, Yarn, Oozie, and Zookeeper.
  • Installing and monitoring the Hadoop cluster resources using Ganglia and Nagios.
  • Experience in designing and implementation of secure Hadoop cluster using Kerberos.
  • Experience working with Hadoop architecture and Big Data users to implement new Hadoop ecosystem technologies supporting a multi-tenant cluster.
  • Skilled in monitoring servers using Nagios, Datadog and CloudWatch, and using the EFK stack (Elasticsearch, Fluentd, Kibana).
  • Implemented DB2/LUW replication, federation, and partitioning (DPF).
  • Areas of expertise include: Database Installation/Upgrade, Backup/Recovery.
  • Hands on experience in installing, configuring, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4), Yarn distributions (CDH 5.X).
  • Experience on capacity planning, hdfs management and yarn resource management.
  • Hands-on experience configuring Hadoop clusters in an enterprise environment, on VMware, and on Amazon Web Services (AWS) using EC2 instances.
  • Installed and configured a Hortonworks HDP 2.3.0 using AMBARI 2.1.1 manager.
  • Hands on experience in upgrading the cluster from HDP 2.0 to HDP 2.3.
  • Strong experience with data warehouse tools, including ETL tools like DataStage and BI tools like SSRS and Tableau.
  • Expertise in interactive data visualization and analyzing with BI tools like Tableau.
  • Worked with relational database systems such as Oracle (PL/SQL). Used Unix shell scripting and Python, and worked on AWS EMR instances.
  • Used NoSQL databases such as Cassandra and MongoDB, and designed tables.
  • Worked on setting up Name Node High Availability for major production cluster and designed automatic failover control using Zookeeper and Quorum Journal Nodes.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Familiar with writing Oozie workflows and Job Controllers for job automation.
  • Experience in dealing with structured, semi-structured and unstructured data in HADOOP ecosystem.
  • Importing data from various data sources, transformation using Hive, Pig, and loaded data into HBase.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
  • Analysed the client's existing Hadoop infrastructure to understand the performance bottlenecks and provided performance tuning accordingly.
  • Good understanding in Deployment of Hadoop Clusters using Automated Puppet scripts.
  • Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring on Linux systems.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.


Hadoop Ecosystem: Hadoop 2.2, HDFS, MapReduce, Hive, Pig, Zookeeper, Sqoop, Oozie, Yarn, Apache NiFi, Apache Phoenix, Spark, Kafka, Storm, Ambari1.0-2.1.1, Kerberos, Flume.

Hadoop Management & Security: Hortonworks, Cloudera Manager.

Web Technologies: HTML, XHTML, XML, XSL, CSS, JavaScript

Server Side Scripting: Shell, Perl, Python.

Database: Oracle 10g, Microsoft SQL Server, MySQL, DB2, SQL, RDBMS.

Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1

Programming Languages: C, Java, PL/SQL

NoSQL Databases: HBase, MongoDB

Virtualization: VMware, ESXi, vSphere, vCenter Server.

SDLC Methodology: Agile (SCRUM), Waterfall.

Operating Systems: Windows 2000 Server, Windows 2000 Advanced Server, Windows Server 2003, Windows 98/XP, CentOS, UNIX, Linux (RHEL)


Hadoop Administrator

Confidential, Texas


  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, Hbase, Zookeeper and Sqoop.
  • Experience in cluster coordination using Zookeeper.
  • Expertise with NoSQL databases like HBase, Cassandra, DynamoDB (AWS) and MongoDB.
  • Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
  • Worked on security components like Kerberos, Ranger, Sentry and HDFS encryption.
  • Created Oozie workflows to automate data ingestion using Sqoop and process incremental log data ingested by Flume using Pig.
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
  • Involved in migrating ETL processes from Oracle to Hive to evaluate ease of data manipulation.
  • Extracted data from Teradata into HDFS using Sqoop.
  • Worked with core HDP components such as YARN and HDFS to architect the platform.
  • Completed end-to-end design and integration of Apache NiFi.
  • Involved in setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
  • Used Phoenix-Hive integration to join very large tables.
  • Executed aggregate queries with Phoenix through server-side hooks (coprocessors).
  • Installed and configured Hortonworks Distribution Platform (HDP 2.3) on Amazon EC2 instances.
  • Used Python for instantiating multi-threaded application and running with other applications.
  • Worked on installing cluster, commissioning & decommissioning of Data Nodes, Name Node recovery, capacity planning, Cassandra and slots configuration.
  • Created an HDInsight cluster in Azure as part of the deployment and performed component unit testing using the Azure Emulator.
  • Designed the Elasticsearch configuration files based on the number of hosts available, naming the cluster and nodes accordingly.
  • Worked on troubleshooting for LDAP and SiteMinder issues with Support Teams for newer initiatives at organization level.
  • Created interactive dashboards utilizing parameters, actions, and calculated fields utilizing Tableau Desktop.
  • Provided support on Kerberos-related issues and coordinated Hadoop installations, upgrades and patch installations in the environment.

Environment: Hive, Pig, HBase, Zookeeper, Sqoop, ETL, Azure, Hortonworks, Apache Phoenix, Ambari 2.0, Apache NiFi, Linux CentOS, Splunk, MongoDB, Elasticsearch, Teradata, Puppet, Kerberos, Kafka, Cassandra, Linux/Unix, Python, Agile/Scrum.

Hadoop Administrator

Confidential, MN


  • Involved in capacity planning with respect to the growing data size and the existing cluster size.
  • Worked on analysing the Hadoop cluster and different big data analytic tools including Pig, HBase, NoSQL databases, Flume, Oozie and Sqoop.
  • Experience in designing, implementing and maintaining of high performing Bigdata, Hadoop clusters and integrating them with existing infrastructure.
  • Deployed the application and tested on Websphere Application Servers.
  • Configured SSL for Ambari, Ranger, Hive and Knox.
  • Experience in methodologies such as Agile, Scrum, and Test driven development.
  • Created principals for new users in Kerberos; implemented and maintained a Kerberos-secured cluster integrated with Active Directory (AD).
  • Worked with a data pipeline using Kafka and Storm to store data into HDFS.
  • Creating event processing data pipelines and handling messaging services using Apache Kafka.
  • Involved in migrating a Java test framework to Python Flask.
  • Shell scripting for Linux/Unix Systems Administration and related tasks. Point of Contact for Vendor escalation.
  • Monitored and analysed MapReduce jobs, looking out for potential issues and addressing them.
  • Collected the logs data from web servers and integrated into HDFS using Flume.
  • Moving the data from Oracle, Teradata, MySQL into HDFS using Sqoop and importing various formats of flat files into HDFS.
  • Assisted in discussions of redesigning LDAP architecture for older environments.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.
  • Used Phoenix Query Server, which creates a new ZooKeeper connection for each client session.
  • Used Phoenix support for updatable views to extend the primary key of a base table.
  • Commissioning and Decommissioning Hadoop Cluster Nodes Including Load Balancing HDFS block data.
  • Used Agile/scrum Environment and used Jenkins, GitHub for Continuous Integration and Deployment.
  • Good knowledge of implementing NameNode Federation and High Availability of the NameNode and Hadoop cluster using ZooKeeper and the Quorum Journal Manager.
  • Good knowledge in adding security to the cluster using Kerberos and Sentry.
  • Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Exported the patterns analyzed back to Teradata using Sqoop.
  • Hands-On experience in setting up ACL (Access Control Lists) to secure access to the HDFS file system.
  • Analyzed escalated incidents within the Azure SQL database.
  • Captured the data logs from web server into HDFS using Flume & Splunk for analysis.
  • Fine-tuned the Hadoop cluster by setting the proper number of map and reduce slots for the TaskTrackers.
  • Experience in tuning the heap size to avoid any disk spills and to avoid OOM issues.
  • Familiar with job scheduling using Fair Scheduler so that CPU time is well distributed amongst all the jobs.
  • Experience managing users and permissions on the cluster, using different authentication methods.
  • Involved in regular Hadoop Cluster maintenance such as updating system packages.
  • Experience in managing and analysing Hadoop log files to troubleshoot issues.
  • Good knowledge in NoSQL databases, like HBase, MongoDB, etc.
  • Worked on the Hortonworks Hadoop distribution, which managed services such as HDFS and MapReduce2.

Environment: Hadoop, YARN, Hive, HBase, Flume, Hortonworks, Apache Phoenix, Kafka, Zookeeper, Oozie, Sqoop, MapReduce, Ambari, HDFS, Teradata, Splunk, Elasticsearch, Jenkins, GitHub, Kerberos, MySQL, Apache NiFi, NoSQL, MongoDB, Java, Shell Script, Python, Linux/Unix.

Hadoop Administrator

Confidential, Greenville, South Carolina


  • Responsible for architecting Hadoop cluster.
  • Involved in source system analysis, data analysis, data modelling to ETL (Extract, Transform and Load) and HiveQL
  • Strong Experience in Installation and configuration of Hadoop ecosystem like Yarn, HBase, Flume, Hive, Pig, Sqoop.
  • Expertise in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
  • Managed and reviewed Hadoop log files.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Worked extensively with Sqoop for importing data.
  • Designed a data warehouse using Hive.
  • Created partitioned tables in Hive.
  • Mentored analyst and test team for writing Hive Queries.
  • Extensively used Pig for data cleansing.
  • Scheduled Oozie workflow engine to run multiple Hive and Pig jobs, which independently run with time and data availability.
  • Worked on Oozie workflows for daily incremental loads, which pull data from Teradata and import it into Hive tables.
  • Built and published customized interactive reports and dashboards using Tableau Server.
  • Developed Pig scripts to transform the data into a structured format and automated them through Oozie coordinators.
  • Pulled data from relational databases into Hive on the Hadoop cluster using Sqoop import for visualization and analysis.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.

Environment: Hadoop, HDFS, MapReduce, Hive, Hortonworks, Pig, Kafka, Oozie, Zookeeper, Sqoop, Nagios, Cloudera Manager, MySQL, NoSQL, MongoDB, Java, Linux/Unix.

Linux Administrator



  • Administration of RHEL, which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
  • Creating, cloning Linux Virtual Machines.
  • Installing Red Hat Linux using Kickstart and applying security policies for hardening the server based on company policies.
  • RPM and YUM package installations, patch and other server management.
  • Managing systems routine backup, scheduling jobs like disabling and enabling cronjobs, enabling system logging, network logging of servers for maintenance, performance tuning, testing.
  • Tech and non-tech refresh of Linux servers, which includes new hardware, OS, upgrade, application installation, testing.
  • Set up user and group login IDs, printing parameters, network configuration and passwords; resolved permissions issues and managed user and group quotas.
  • Installing MySQL on Linux and customizing the MySQL DB parameters.
  • Working with the ServiceNow incident tool.
  • Creating physical volumes, volume groups and logical volumes.
  • Samba Server configuration with Samba Clients.
  • Knowledge of iptables and SELinux.
  • Modified existing Linux file systems to a Standard EXT3.
  • Configuration and administration of NFS, FTP, Samba and NIS.
  • Maintenance of DNS, DHCP and APACHE services on Linux machines.
  • Installing and configuring Apache and supporting them on Linux production servers.

Environment: Red-Hat Linux Enterprise servers, VERITAS Cluster Server 5.0, Windows 2003 server, Shell programming, Unix/Linux.
