
Hadoop Administrator Resume

Oaks, PA


  • Over 7 years of experience in large-scale Information Technology projects.
  • Around 3 years of experience implementing Big Data analytics on the Hadoop ecosystem.
  • Maintained Hadoop clusters for Development/Quality/Production environments.
  • Experience with Hortonworks & Cloudera Manager Administration.
  • Excellent knowledge of Hadoop architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, YARN, Resource Manager, Node Manager & MapReduce.
  • Hands on experience on Installation and configuration of Apache Hadoop cluster on Linux platform.
  • Adept in installing, configuring and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Pig, Hive, Sqoop, Zookeeper, Flume and Kafka.
  • Good experience in implementing Kerberos & Ranger in Hadoop Ecosystem.
  • Provided level 1, 2 & 3 technical support.
  • Experience in importing and exporting data using Sqoop from HDFS to Relational Database systems and vice-versa.
  • Strong knowledge of Hadoop platforms and other distributed data processing platforms.
  • Monitor Hadoop cluster using tools like Nagios, Ganglia, Ambari & Cloudera Manager.
  • Investigated new technologies such as Spark to keep pace with industry developments.
  • Good Knowledge of Informatica and SAS.
  • Experience creating reports using Tableau.
  • Strong knowledge in writing Shell Scripts.
  • Demonstrated leadership capability and ability to effectively communicate with all levels of management.
  • Worked with business users to extract clear requirements to create business value.
  • Worked with big data teams to move ETL tasks to Hadoop.
  • Exceptionally well organized, self-motivated, creative, and dedicated, with a demonstrated ability to learn new technologies in a short span of time.
  • Achieved SLA/SLO standards.
  • Strong experience in 24x7 environments.
  • Good experience in both Waterfall & Agile methodologies.


Hadoop: HDFS, MapReduce, Yarn, Hive, HBase, Impala, Hue, HCatalog, Pig, Sqoop, Flume, Oozie, Zookeeper, Ranger, Kerberos, Knox, Cassandra

Operating System: UNIX, Linux, Windows 95/98/2000/XP/Vista/7/8, MS-DOS

Database Languages: SQL, PL/SQL

NoSQL Databases: HBase, Cassandra

Hadoop Cloud: AWS, Rackspace, S3

Hadoop Monitoring & Management: Ganglia, Ambari, Nagios, Cloudera Manager

Hadoop Distributions: Hortonworks HDP, AWS, Cloudera CDH

Other Tools: MS Office (Excel, Word, PowerPoint), Adobe Photoshop, Macromedia Flash


Confidential, Oaks, PA

Hadoop Administrator


  • Helped the team to increase cluster size from 35 nodes to 80+ nodes. The configuration for additional data nodes was managed using Puppet.
  • Performed both major and minor upgrades to the existing Hortonworks Hadoop cluster.
  • Integrated Hadoop with Active Directory and enabled Kerberos for Authentication.
  • Applied patches and bug fixes on Hadoop Clusters.
  • Worked on setting up High Availability for major production cluster and designed automatic failover.
  • Performance-tuned the Hadoop cluster to achieve higher throughput.
  • Set up a new cluster from an existing cluster using Ambari blueprints.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Configured Hive metastore with MySQL, which stores the metadata for Hive tables.
  • Set up YARN queues so that application teams' SLAs are met, ensuring robust capacity scheduling.
  • Worked on Apache Ranger to manage HDFS, HBase, and Hive access and permissions for users through Active Directory.
  • Worked on different compression codecs and ORC file format.
  • Worked on integration of Hiveserver2 with Tableau.
  • Benchmarking Hadoop cluster using DFSIO, Teragen and Terasort.
  • Installed, configured Disaster Recovery using Falcon.
  • Enabled Kerberos for Hadoop cluster authentication and integrated it with Active Directory for managing users and application groups.
  • Moved data from HDFS to RDBMS and vice-versa using Sqoop.
  • Performed Analyzing/Transforming data with Hive and Pig.
  • Mentored and supported team members in managing Hadoop clusters.
  • Reviewing and updating all configuration & documentation of Hadoop clusters as part of continuous improvement processes.
  • Performed data completeness, correctness, data transformation & data quality testing using available tools & techniques
  • Wrote Shell Scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
  • Developed MapReduce jobs in Java for data cleaning and preprocessing.
  • Provided rotational on call support and educated users on new implementations as per maintenance.
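The daemon health checks mentioned above can be sketched as a small shell function; the function name and the list of expected daemons are illustrative assumptions, not the original scripts.

```shell
#!/usr/bin/env bash
# Sketch of a Hadoop daemon health check (hypothetical names; the original
# scripts are not shown here). Takes the output of `jps` as an argument and
# reports any expected daemon that is missing from it.
check_daemons() {
  local jps_output="$1"
  local expected=("NameNode" "DataNode" "ResourceManager" "NodeManager")
  local missing=()
  for daemon in "${expected[@]}"; do
    # -w matches whole words, so "NodeManager" will not match "NameNode"
    if ! grep -qw "$daemon" <<<"$jps_output"; then
      missing+=("$daemon")
    fi
  done
  if ((${#missing[@]})); then
    echo "WARNING: missing daemons: ${missing[*]}"
    return 1
  fi
  echo "OK: all expected daemons running"
}
```

In practice a check like this would run from cron and raise an alert or page the on-call rotation on a non-zero exit status.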

Environment: Hortonworks Data Platform, Linux, HBase, Hive, Pig, Kerberos, Ranger, Knox, Sqoop, Oozie, Zookeeper, Capacity Scheduler, Fair Scheduler, Shell Scripting, MySQL, Ganglia, Nagios, Git, Jenkins, RHEL, CentOS.

Confidential, Sacramento, CA

Hadoop Technical Consultant


  • Installation and Administration of a Hadoop cluster using Cloudera Distribution.
  • Debugging and troubleshooting the issues in development and Test environments.
  • Commissioned data nodes as per the cluster capacity and decommissioned when the hardware degraded or failed.
  • Designed, configured and managed the backup and disaster recovery for HDFS data.
  • Designed workflow by scheduling Hive processes for file data, which is streamed into HDFS using Flume.
  • Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from MySQL into HDFS using Sqoop scripts.
  • Performed systems monitoring, upgrades, performance tuning.
  • Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
  • Performed migration of data across clusters using DISTCP.
  • Implemented schedulers on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Used Cloudera Navigator for Hadoop audit files and data lineage.
  • Monitored Hadoop clusters using the Cloudera Manager UI and Ganglia, and provided 24x7 on-call support.
  • Involved in implementing High Availability and automatic failover infrastructure to overcome single point of failure for Name node utilizing zookeeper services.
  • Worked on Impala performance tuning with different workloads and file formats.
  • Analyzed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
  • Provided user support and application support on the Hadoop infrastructure.
  • Worked with big data developers, designers and scientists in troubleshooting map reduce job failures and issues with Hive, Pig and Flume.
  • Performed Linux server administration of the server hardware and operating system.
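The cross-cluster data migration mentioned above is typically a single DistCp invocation; the NameNode hostnames and paths below are placeholders, not details from this engagement.

```shell
# Sketch of a DistCp-based migration between clusters (hostnames and paths
# are placeholders). -update copies only files that differ at the target;
# -skipcrccheck avoids CRC comparison, useful when block sizes differ.
hadoop distcp -update -skipcrccheck \
  hdfs://source-nn:8020/data/warehouse \
  hdfs://dest-nn:8020/data/warehouse
```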

Environment: CDH, Cloudera Manager, RHEL, Hive, Pig Latin, Oozie, Sqoop, HBase, Cassandra, AWS, S3, Ganglia.

Confidential, Dallas, TX

Hadoop Consultant


  • Installed, configured and deployed a 30 node Cloudera Hadoop cluster for development and production.
  • Performance tune Hadoop cluster to achieve higher performance.
  • Loaded data from different sources (Teradata and DB2) into HDFS using Sqoop and loaded it into Hive tables.
  • Developed Oozie workflows for daily incremental loads, which pull data from Teradata and import it into Hive tables.
  • Developed Pig scripts to transform data into a structured format, automated through Oozie coordinators.
  • Developed Hive queries for Analysis across different banners.
  • Pulled data from relational databases into Hive on the Hadoop cluster using Sqoop imports for visualization and analysis.
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
  • Used Ganglia and Nagios for monitoring the cluster around the clock.
  • Used Nagios plugins to monitor NameNode health status and the number of TaskTrackers and DataNodes running.
  • Designed and implemented a distributed network monitoring solution based on Nagios and Ganglia using Puppet.
  • Automated backups and recurring jobs for other applications using cron.
  • Provided 24x7 on-call support on a rotational basis.
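The cron-driven backup automation above can be sketched as a small shell function plus a crontab entry; the paths, schedule, and script name are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a cron-driven backup job (hypothetical paths; the original
# scripts are not part of this resume). Archives a source directory into
# a date-stamped tarball under a backup directory.
backup_dir() {
  local src="$1" dest="$2"
  local stamp
  stamp=$(date +%Y%m%d)
  mkdir -p "$dest"
  # -C switches into the parent directory so the archive stores a clean
  # relative path rather than the full absolute path
  tar -czf "$dest/backup-$stamp.tar.gz" \
      -C "$(dirname "$src")" "$(basename "$src")"
}

# A matching crontab entry might look like (run nightly at 02:00):
#   0 2 * * * /usr/local/bin/backup_dir.sh /var/lib/app /backups
```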

Confidential, New York

LINUX System Administrator


  • Installation and configuration of Red Hat Linux, Solaris, Fedora, and CentOS on new server builds as well as during upgrades.
  • Log management, including monitoring and cleaning old log files.
  • Produced system audit reports covering number of logins, successes and failures, and running cron jobs.
  • Tracked system performance on an hourly or daily basis.
  • Remotely copied files using sftp, ftp, scp, WinSCP, and FileZilla.
  • Created user roles and groups for securing the resources using local OS authentication.
  • Configured and monitored the DHCP server.
  • Took backups using tar and restored them after data loss.
  • Experience in writing bash scripts for job automation.
  • Maintained relationships with project managers, DBAs, developers, application support teams, and operational support teams to facilitate effective project deployment.
  • Manage system installation, troubleshooting, maintenance, performance tuning, managing storage resources, network configuration to fit application and database requirements.
  • Responsible for modifying and optimizing backup schedules and developing shell scripts for it.
  • Performed regular installation of patches using RPM and YUM.
  • Experience managing various file systems using LVM and SVM, configuring file systems over the network using NFS, NAS, and SAN, and installing RAID devices.
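The login audit reporting above could be sketched as a short script; the log format assumed here is the standard OpenSSH syslog format, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a login audit summary (assumes OpenSSH-style auth log lines;
# the original reports are not shown in the resume). Reads an auth log on
# stdin and prints counts of accepted and failed logins.
audit_logins() {
  awk '
    /Accepted password|Accepted publickey/ { ok++ }
    /Failed password/                      { fail++ }
    END { printf "accepted=%d failed=%d\n", ok, fail }
  '
}

# Typical usage:
#   audit_logins < /var/log/secure
```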

Confidential, Boston, MA

Linux Administrator


  • Installed and maintained Linux servers.
  • Good knowledge on Apache Web Server, Mail Servers, Database Servers, DNS Servers, Backup Management, Server Security Management, Live migration between servers remotely, Virtualization (Open VZ) and RAID devices.
  • MySQL Database - Tuning MySQL server parameters, repairing tables if crashed, error logging, database Maintenance, ensuring remote MySQL availability.
  • Monitoring System Metrics and logs for any problems.
  • Ran cron and AutoSys jobs to back up MySQL data.
  • Installing and updating packages using YUM.
  • Performance Monitoring - DDOS monitoring, Load Balancing, Remote Desktops, VNC, etc.
  • Application Protocols - Excellent understanding of SMTP, HTTP, FTP, PPTP, and OpenVPN. Systems/Hardware - IPMI, KVMs.
  • Security - iptables, CSF, SSH. Prevented DDoS attacks using iptables and other security tools.
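The iptables-based DDoS mitigation above typically relies on connection rate limiting; the rules below are an illustrative configuration fragment (thresholds and port are assumptions, and applying them requires root), not the exact rules from this role.

```shell
# Sketch of iptables rate-limiting rules for basic DoS mitigation using the
# "recent" match module. Tracks new SSH connections per source address and
# drops a source that opens more than 5 new connections within 60 seconds.
iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
         -m recent --set --name SSH
iptables -A INPUT -p tcp --dport 22 -m state --state NEW \
         -m recent --update --seconds 60 --hitcount 5 --name SSH -j DROP
```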
