Hadoop Administrator Resume

McLean, VA

SUMMARY

  • 8+ years of IT industry experience in administering Linux, database management, developing MapReduce applications, and designing, building, and administering large-scale Hadoop production clusters.
  • 3 years of experience in big data technologies: Hadoop HDFS, MapReduce, Tez, Pig, YARN, Hive, Spark, Oozie, Flume, Kafka, Sqoop, ZooKeeper, and the NoSQL databases Cassandra and HBase.
  • Experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Spark, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Hortonworks Ambari.
  • Strong knowledge of the Hadoop HDFS architecture and the MapReduce framework.
  • Experience in administering Linux systems to deploy Hadoop clusters and monitoring the clusters using Ambari Metrics.
  • Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
  • Experience in performing minor and major upgrades and in commissioning and decommissioning data nodes on Hadoop clusters.
  • Strong knowledge of configuring NameNode High Availability and NameNode Federation.
  • Familiar with writing Oozie workflows and job controllers for automating shell, Hive, and Sqoop jobs.
  • Familiar with importing and exporting data using Sqoop from RDBMSs such as MySQL, Oracle, and Teradata, including fast loaders and connectors (see the Sqoop sketch after this list).
  • Experience in using Flume to stream data into HDFS from various sources.
  • Hands-on experience in provisioning and managing multi-tenant Hadoop clusters on a public cloud environment, Amazon Web Services (AWS) EC2.
  • Experience in installing and administering a PXE server with Kickstart, and in setting up FTP, DHCP, and DNS servers and Logical Volume Management.
  • Experience in configuring and managing storage devices: NAS (file-level access, NFS) and SAN (block-level access, iSCSI).
  • Experience in storage management, including JBOD, RAID levels 1, 5, 6, and 10, logical volumes, volume groups, and partitioning.
  • Exposure to Maven/Ant and Git, along with shell scripting, for the build and deployment process.
  • Experience in maintaining and distributing configuration files using Puppet.
  • Experience in understanding the security requirements for Hadoop and integrating with a Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service, and managing keytabs using keytab tools (see the Kerberos sketch after this list).
  • Experience in handling multiple relational databases: MySQL and SQL Server.
  • Familiar with Agile methodology (Scrum) and software testing.
  • Effective problem-solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines. Able to learn and use new technologies quickly.
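
A minimal Sqoop sketch of the kind of import/export described above; the connection string, credentials, table, and HDFS paths are hypothetical placeholders:

    # Import a table from MySQL into HDFS (illustrative host/database/table)
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export processed results back to the RDBMS
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /data/curated/order_summary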
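
A sketch of typical kadmin steps for the Kerberos keytab work mentioned above; the realm, host, and keytab path are hypothetical:

    # Create a service principal and export its keytab (illustrative realm/host)
    kadmin.local -q "addprinc -randkey nn/master01.example.com@EXAMPLE.COM"
    kadmin.local -q "xst -k /etc/security/keytabs/nn.service.keytab nn/master01.example.com@EXAMPLE.COM"

    # Verify the keytab and test authentication with it
    klist -kt /etc/security/keytabs/nn.service.keytab
    kinit -kt /etc/security/keytabs/nn.service.keytab nn/master01.example.com@EXAMPLE.COM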

TECHNICAL SKILLS

Hadoop Ecosystem: HDFS, Sqoop, Flume, MapReduce, Hive, Pig, Oozie, ZooKeeper

NoSQL Databases: HBase, Cassandra

Security: Kerberos

Database: MySQL, SQL Server

Cluster Management Tools: Cloudera Manager, Ambari

OS: Linux (CentOS, RHEL), Windows, macOS

PROFESSIONAL EXPERIENCE:

Confidential

Hadoop Administrator

Responsibilities:

  • Built and maintained a >14 PB production environment.
  • Installed and maintained 300+ node Hadoop clusters using Hortonworks Data Platform (HDP 2.2, 2.3, 2.5, and 2.6).
  • Monitored workload, job performance, and capacity planning using Ambari.
  • Performed both major and minor upgrades to the existing Ambari-managed Hadoop cluster.
  • Integrated Hadoop with Active Directory and enabled Kerberos and Knox for authentication.
  • Applied patches and bug fixes on Hadoop clusters.
  • Performance-tuned and optimized Hadoop clusters to achieve high performance.
  • Implemented the Capacity Scheduler on YARN to share cluster resources among the MapReduce/Tez jobs submitted by users.
  • Monitored Hadoop clusters using the Ambari UI.
  • Designed and implemented a disaster recovery plan for the Hadoop cluster.
  • Extensive hands-on experience with Hadoop file system commands for file handling operations.
  • Worked on integration of HiveServer2 with Tableau and Informatica.
  • Provided user and application support on the Hadoop infrastructure.
  • Involved in business requirements gathering and analysis of business use cases.
  • Reviewed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, and Sqoop.
  • Added new hosts and balanced data onto the new data nodes by running the HDFS Balancer (see the balancer sketch after this list).
  • Ran shell scripts through cron jobs to monitor the cluster and alert admins about bad jobs (see the cron monitoring sketch after this list).
  • Added non-Ambari-managed ETL hosts to the Hadoop environment and deployed their configs using Puppet.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Deployed and maintained Ambari Views.
  • Fine-tuned the HBase service and HBase jobs.
  • Fine-tuned Hive jobs for better performance.
  • Extracted data from various sources into HDFS for processing.
  • Fine-tuned Spark jobs for better utilization of cluster resources.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Created Hive tables as per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency (see the Hive DDL sketch after this list).
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
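
A sketch of the HDFS Balancer step referenced above, using the standard HDFS CLI; the threshold value is only an example:

    # Rebalance block placement after adding data nodes; threshold is the allowed
    # percent deviation from average cluster utilization
    hdfs balancer -threshold 10

    # Check per-node utilization afterwards
    hdfs dfsadmin -report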
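
A minimal version of the cron-driven job monitoring described above; the script path, alert address, and failure criterion are hypothetical:

    # Crontab entry (illustrative path): run the check every 15 minutes
    */15 * * * * /opt/hadoop-ops/check_failed_jobs.sh

    # check_failed_jobs.sh (sketch): alert admins when failed YARN applications are present
    #!/bin/bash
    FAILED=$(yarn application -list -appStates FAILED | grep -c application_)
    if [ "$FAILED" -gt 0 ]; then
      echo "$FAILED failed YARN applications detected on $(hostname)" \
        | mail -s "Hadoop job alert" hadoop-admins@example.com
    fi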
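
An illustrative Hive DDL for the partitioned internal/external tables mentioned above, run through the Hive CLI; the database, table, columns, and paths are made up:

    # Create a partitioned external table over raw data and add a static partition
    hive -e "
      CREATE DATABASE IF NOT EXISTS raw;
      CREATE EXTERNAL TABLE IF NOT EXISTS raw.web_events (
        event_id   STRING,
        user_id    STRING,
        event_type STRING
      )
      PARTITIONED BY (load_date STRING)
      STORED AS ORC
      LOCATION '/data/raw/web_events';
      ALTER TABLE raw.web_events ADD IF NOT EXISTS PARTITION (load_date='2017-06-01');
    "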

Environment: HDFS, Sqoop, Flume, MapReduce, Hive, Pig, Oozie, ZooKeeper, HBase, Ambari

Confidential, McLean, VA

Hadoop Administrator

Responsibilities:

  • Installed and maintained 100+ node Hadoop clusters using HDP.
  • Performed both major and minor upgrades to the existing Ambari-managed Hadoop cluster.
  • Applied patches and bug fixes on Hadoop Clusters.
  • Performance tuned and optimized Hadoop clusters to achieve high performance.
  • Monitored Hadoop clusters using the Ambari UI.
  • Designed and implemented a disaster recovery plan for the Hadoop cluster.
  • Extensive hands-on experience with Hadoop file system commands for file handling operations.
  • Involved in business requirements gathering and analysis of business use cases.
  • Reviewed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, and Sqoop.
  • Added new hosts and balanced data onto the new data nodes by running the HDFS Balancer.
  • Ran shell scripts through cron jobs to monitor the cluster and alert admins about bad jobs.
  • Worked with developers to fine-tune Hive jobs for better performance.
  • Extracted data from various sources into HDFS for processing.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Created Hive tables as per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency.

Environment: Hive, HBase, Map Reduce, Oozie, Sqoop, MySQL, PL/SQL, Linux, HDP.

Confidential, San Jose, CA

Hadoop Administrator

Responsibilities:

  • Involved in the end-to-end process of Hadoop cluster setup: installation, configuration, and monitoring of the Hadoop cluster.
  • Responsible for cluster maintenance, commissioning and decommissioning data nodes, cluster monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Monitored systems and services; handled architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Configured various property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on job requirements.
  • Imported and exported data into HDFS using Sqoop.
  • Experienced in defining job flows with Oozie.
  • Loaded log data directly into HDFS using Flume (see the Flume agent sketch after this list).
  • Experienced in managing and reviewing Hadoop log files.
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Installed and configured Sqoop, Flume, and HBase.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Analyzed system failures, identified root causes, and recommended courses of action. Documented system processes and procedures for future reference.
  • Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload, job performance, and capacity planning using Cloudera Manager.
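
A minimal Flume agent definition of the sort used for the log loading above; the agent name, source log path, and HDFS target directory are placeholders:

    # /etc/flume/conf/weblog-agent.conf (sketch): tail a web server log and land events in HDFS
    a1.sources  = r1
    a1.channels = c1
    a1.sinks    = k1
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /var/log/httpd/access_log
    a1.sources.r1.channels = c1
    a1.channels.c1.type = memory
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.hdfs.path = /data/raw/weblogs/%Y-%m-%d
    a1.sinks.k1.hdfs.fileType = DataStream
    a1.sinks.k1.hdfs.useLocalTimeStamp = true
    a1.sinks.k1.channel = c1

    # Start the agent; --name must match the property prefix used above
    flume-ng agent --name a1 --conf /etc/flume/conf --conf-file /etc/flume/conf/weblog-agent.conf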

Environment: HDFS, Pig, Hive, HBase, Sqoop, Spark, Oozie, Flume, Kafka, AWS, Linux shell scripting.

Confidential, Salt Lake City

Linux Admin/ Hadoop Admin

Responsibilities:

  • Worked with the Linux administration team to prepare and configure the systems to support the Hadoop deployment.
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and passwordless (key-based) SSH login.
  • Implemented an authentication service using the Kerberos authentication protocol.
  • Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems on the created partitions (see the LVM sketch after this list).
  • Configured master node disks with RAID 1+0.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Tuned the cluster by commissioning and decommissioning the data nodes.
  • Upgraded the Hadoop cluster.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Deployed high availability on the Hadoop cluster using quorum journal nodes (see the HA sketch after this list).
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Configured Ganglia, including installing the gmond and gmetad daemons, which collect the metrics from across the distributed cluster and present them in real-time dynamic web pages that help with debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Deployed a Network File System (NFS) mount for NameNode metadata backup.
  • Performed cluster backups using DistCp, Cloudera Manager BDR, and parallel ingestion.
  • Designed and allocated HDFS quotas for multiple groups (see the backup and quota sketch after this list).
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Used the Hive schema to create relations in Pig using HCatalog.
  • Developed Pig scripts for handling the raw data for analysis.
  • Deployed the Sqoop server to perform imports from heterogeneous data sources into HDFS.
  • Deployed and configured Flume agents to stream log events into HDFS for analysis.
  • Deployed YARN, which allows multiple applications to run on the cluster.
  • Configured Oozie for workflow automation and coordination.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
  • Wrote custom shell scripts for automating repetitive tasks on the cluster.
  • Worked with BI teams to generate reports and design ETL workflows on Pentaho.
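
An LVM sketch for the volume work described above; device names, sizes, and mount points are illustrative:

    # Create a physical volume, a volume group, and a logical volume (hypothetical device/size)
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb
    lvcreate -L 200G -n lv_hdfs vg_data

    # Make a filesystem, mount it, and persist the mount in /etc/fstab
    mkfs.ext4 /dev/vg_data/lv_hdfs
    mkdir -p /grid/0
    mount /dev/vg_data/lv_hdfs /grid/0
    echo '/dev/vg_data/lv_hdfs /grid/0 ext4 defaults,noatime 0 0' >> /etc/fstab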
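
The HA and automatic-failover bullets above typically come down to commands along these lines once hdfs-site.xml defines the nameservice and journal nodes; the NameNode IDs nn1/nn2 are assumptions:

    # On the first NameNode: initialize the failover state in ZooKeeper
    hdfs zkfc -formatZK

    # On the second NameNode: copy the current namespace from the active NameNode
    hdfs namenode -bootstrapStandby

    # After starting both NameNodes and the ZKFC daemons, verify which one is active
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2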
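
A sketch of the backup and quota commands implied above; cluster addresses, paths, and limits are placeholders:

    # Copy a directory tree to a backup cluster with DistCp (illustrative clusters/paths)
    hadoop distcp -update hdfs://prodcluster/data/curated hdfs://drcluster/data/curated

    # Cap a group's project directory at 10 TB of space and one million names, then verify
    hdfs dfsadmin -setSpaceQuota 10t /projects/groupA
    hdfs dfsadmin -setQuota 1000000 /projects/groupA
    hdfs dfs -count -q /projects/groupA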

Environment: Linux, HDFS, Sqoop, Flume, MapReduce, Hive, Pig, Oozie, ZooKeeper

Confidential

Linux System Administrator

Responsibilities:

  • Handled day-to-day user access and permissions; installed and maintained Linux servers.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method; performed remote installation of Linux using PXE boot.
  • Created groups, added user IDs to groups as primary or secondary groups, removed user IDs from groups, and added users to the sudoers file (see the user management sketch after this list).
  • Monitored system activity, performance, and resource utilization.
  • Used RPM to install, update, verify, query, and erase packages on Linux servers.
  • Extensive use of LVM, creating volume groups and logical volumes.
  • Worked on mounting file systems using AutoFS and configuring the fstab file.
  • Performed RPM and YUM package installations, patching, and other server management.
  • Performed scheduled backups and necessary restorations.
  • Configured NFS.
  • Developed shell scripts for automation of daily tasks.
  • Set up cron schedules for backups and monitoring processes (see the cron sketch after this list).
  • Configured Domain Name System (DNS) for hostname-to-IP resolution.
  • Troubleshot and fixed issues at the user, system, and network levels using various tools and utilities. Scheduled backup jobs by implementing cron schedules during non-business hours.
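
A sketch of the day-to-day user and group management above; all usernames and group names are made up:

    # Create a group and a user whose primary group is that group
    groupadd hadoopops
    useradd -g hadoopops -G wheel -m jdoe

    # Adjust secondary group membership and grant a narrowly scoped sudo rule
    usermod -aG backupops jdoe
    gpasswd -d jdoe wheel
    echo 'jdoe ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart httpd' > /etc/sudoers.d/jdoe
    visudo -cf /etc/sudoers.d/jdoe   # validate syntax before relying on it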
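
Minimal cron entries for the off-hours backup scheduling mentioned above; script paths and times are illustrative:

    # /etc/crontab-style entries: nightly backup at 2:30 AM, weekly pruning early Sunday
    30 2 * * *  root  /opt/scripts/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1
    0  4 * * 0  root  /opt/scripts/prune_old_backups.sh >> /var/log/prune_backups.log 2>&1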

Environment: Linux (CentOS/RHEL)
