
Sr. Hadoop Administrator Resume


Atlanta, GA

SUMMARY:

  • 8+ years of administration experience, including 3 years with the Hadoop ecosystem, installing and configuring Hadoop ecosystem components in existing clusters.
  • Experience in Hadoop administration (HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, HBase) and NoSQL administration.
  • Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS, Rackspace, and OpenStack.
  • Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Ganglia.
  • Experience in installing Hadoop clusters using different distributions: Apache Hadoop, Cloudera, and Hortonworks.
  • Good experience in understanding clients' big data business requirements and translating them into Hadoop-centric solutions.
  • Analyzed clients' existing Hadoop infrastructure to identify performance bottlenecks and provided performance tuning accordingly.
  • Installed, configured, and maintained HBase.
  • Worked with Sqoop to import and export data between HDFS/Hive and databases such as MySQL and Oracle.
  • Defining job flows in a Hadoop environment using tools like Oozie for data scrubbing and processing.
  • Experience in configuring ZooKeeper to provide cluster coordination services.
  • Loading logs from multiple sources directly into HDFS using tools like Flume.
  • Good experience in performing minor and major upgrades.
  • Experience in benchmarking and in performing backup and recovery of NameNode metadata and of data residing in the cluster.
  • Familiar with commissioning and decommissioning of nodes on a Hadoop cluster.
  • Adept at configuring NameNode High Availability.
  • Worked on disaster recovery for Hadoop clusters.
  • Worked with Puppet for application deployment.
  • Fair understanding of Splunk.
  • Good understanding of Mesos.
  • Well experienced in building DHCP, PXE with Kickstart, DNS, and NFS servers and using them to build infrastructure in a Linux environment.
  • Experienced in Linux Administration tasks like IP Management (IP Addressing, Subnetting, Ethernet Bonding and Static IP).
  • Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
  • Experience in deploying and managing multi-node development, testing, and production clusters.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service, and managing keytabs with keytab tools (see the kadmin sketch after this list).
  • Worked on setting up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Good understanding of Big data and distributed technologies.
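
A minimal sketch of the kadmin workflow behind the Kerberos keytab work above; the realm, host, and file paths are illustrative:

    # Create a service principal for a DataNode (realm and host are illustrative)
    kadmin -q "addprinc -randkey dn/dn1.example.com@EXAMPLE.COM"

    # Export the principal's keys into a keytab file for the service to use
    kadmin -q "xst -k /etc/security/keytabs/dn.service.keytab dn/dn1.example.com@EXAMPLE.COM"

    # Verify the keytab contents
    klist -kt /etc/security/keytabs/dn.service.keytab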

TECHNICAL SKILLS:

Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Cloudera Manager, Ambari

Security: Kerberos

Scripting Languages: Shell Scripting, Puppet

Monitoring Tools: Nagios, Ganglia, Cloudera Manager, Ambari

Operating Systems: Linux RHEL/Ubuntu/CentOS, Windows (XP/7/8)

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Sr. Hadoop Administrator

Responsibilities:

  • Handled data ingestion from source files and databases across multiple data sources.
  • Handled structured data ingestion through GoldenGate and a Beam architecture.
  • Handled file-based ingestion through Connect:Enterprise and mount points.
  • Handled database ingestion from sources such as Oracle and MySQL.
  • Performed incremental and full-load copies of source databases (see the Sqoop sketch after this list).
  • Worked with Sqoop and Flume.
  • Exported data to Teradata systems.
  • Supported an average of 900 jobs per day.
  • Worked on Hortonworks HDP 2.2.
  • Administered a 220-node Hadoop cluster holding 1.5 PB of data.
  • Worked on a 20-node QA cluster and a 20-node development (non-production) cluster.
  • Configured Kerberos on the cluster.
  • Worked on multiple sprints.
  • Was assigned individual projects.
  • Deployed more than 100 jobs.
  • Debugged jobs as and when necessary.
  • Worked on Hive, YARN, and MapReduce.
  • Set up Oozie workflows.
  • Performed OS patching.
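
A minimal sketch of the Sqoop commands behind the full and incremental database loads above; the connection string, credentials, table, and check column are illustrative:

    # Full load of a source table into HDFS (connection details are illustrative)
    sqoop import \
        --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
        --username etl_user -P \
        --table ORDERS \
        --target-dir /data/raw/orders

    # Incremental load: append only rows whose ORDER_ID exceeds the last imported value
    sqoop import \
        --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
        --username etl_user -P \
        --table ORDERS \
        --incremental append --check-column ORDER_ID --last-value 1000000 \
        --target-dir /data/raw/orders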

Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, HORTONWORKS.

Confidential, Redwood City, CA

Hadoop Administrator

Responsibilities:
  • Handled Hadoop operations across 9 data centers.
  • Managed 2500+ nodes.
  • Good understanding of the production environment.
  • Used GitHub for code management.
  • Used LDAP for authentication.
  • Developed a tool for importing, and scheduling imports of, tables and views (calculation, analytic, and attribute views) from SAP HANA into Hadoop.
  • Used JIRA for task completion.
  • Used Nagios for alerts.
  • Used OpenTSDB for monitoring.
  • Used Puppet for code deployment.
  • Performed Disk repairs and replacements.
  • Monitored and managed HBase clusters.
  • Optimized at the OS and hardware level as needed.
  • Performed OS-level upgrades.
  • Performed hardware benchmarking (CPU, disk I/O).
  • Configured RAID and kickstarted servers through the IPMI interface.
  • Scheduled jobs in Oozie.
  • Managed MapReduce and YARN jobs across multiple data centers.
  • Monitored master nodes and debugged as needed.
  • Commissioned and decommissioned nodes.
  • Worked closely with the development team to get jobs delivered.
  • Fine-tuned the cluster as needed.
  • Worked on Cloudera and Apache distributions.
  • Good understanding of the overall Hadoop architecture.
  • Worked on setting up high availability and designed an automatic failover controller using ZooKeeper and quorum journal nodes (see the sketch after this list).
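
A minimal sketch of the commands involved in enabling automatic failover once HA and the JournalNodes are configured in hdfs-site.xml; the NameNode IDs nn1 and nn2 are illustrative:

    # One-time setup: create the znode in ZooKeeper that the failover controllers coordinate through
    hdfs zkfc -formatZK

    # Start a ZooKeeper Failover Controller (ZKFC) on each NameNode host
    hadoop-daemon.sh start zkfc

    # Confirm which NameNode is active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2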

Environment: HADOOP HDFS, MAPREDUCE, SAP HANA, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA, APACHE HADOOP.

Confidential, San Diego, CA

Hadoop Administrator

Responsibilities:
  • Responsible for architecting Hadoop clusters: translated functional and technical requirements into detailed architecture and design.
  • Installed and configured a fully distributed, multi-node Hadoop cluster of 150+ nodes.
  • Provided Hadoop, OS, and hardware optimizations.
  • Set up machines with network controls, static IPs, disabled firewalls, and swap memory.
  • Installed and configured Hortonworks Ambari for easier management of the existing Hadoop cluster.
  • Upgraded the cluster from HDP 1.3 to HDP 2.2.
  • Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Performed operating system installation and Hadoop version updates using automation tools.
  • Configured Oozie for workflow automation and coordination.
  • Implemented rack awareness topology on the Hadoop cluster.
  • Identified performance bottlenecks by analyzing the existing Hadoop cluster and provided performance tuning accordingly.
  • Imported and exported structured data between different relational databases and HDFS/Hive using Sqoop.
  • Configured ZooKeeper to implement node coordination and clustering support.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Developed scripts for benchmarking with TeraSort/TeraGen (see the sketch after this list).
  • Implemented Kerberos Security Authentication protocol for existing cluster.
  • Good experience troubleshooting production-level issues in the cluster and its functionality.
  • Backed up data on a regular basis to a remote cluster using DistCp.
  • Experience in storage management, including JBOD, RAID levels 1, 5, 6, and 10, logical volumes, volume groups, and partitioning.
  • Familiar with Java virtual machine (JVM) and multi-threaded processing
  • Regular Commissioning and Decommissioning of nodes depending upon the amount of data.
  • Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
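
A minimal sketch of the TeraGen/TeraSort benchmarking script mentioned above; the examples-jar path, row count, and HDFS paths are illustrative:

    #!/bin/bash
    # Path to the MapReduce examples jar (varies by distribution and version)
    JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar

    # Generate 1 TB of input: 10 billion rows of 100 bytes each
    hadoop jar "$JAR" teragen 10000000000 /benchmarks/terasort-input

    # Sort the generated data
    hadoop jar "$JAR" terasort /benchmarks/terasort-input /benchmarks/terasort-output

    # Validate that the output is globally sorted
    hadoop jar "$JAR" teravalidate /benchmarks/terasort-output /benchmarks/terasort-validate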

Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, HORTONWORKS AMBARI

Confidential, Inc, OH

Hadoop Administrator

Responsibilities:

  • Deployed a Hadoop cluster using CDH3, integrated with Nagios and Ganglia.
  • Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
  • Installed and configured a 90-node Hadoop cluster with the Cloudera distribution (CDH3).
  • Installed the NameNode, Secondary NameNode, JobTracker, DataNodes, and TaskTrackers.
  • Performed benchmarking and analysis using TestDFSIO and TeraSort.
  • Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers, and dealt with blacklisted TaskTrackers.
  • Implemented Rack Awareness for data locality optimization.
  • Dumped data from a MySQL database to HDFS and vice versa using Sqoop.
  • Used Ganglia and Nagios to monitor the cluster around the clock.
  • Created a local YUM repository for installing and updating packages.
  • Dumped data from one cluster to another using DistCp, and automated the dumping procedure with shell scripts (see the sketch after this list).
  • Implemented NameNode backup using NFS.
  • Performed various configurations, including networking and iptables, resolving hostnames, user accounts and file permissions, HTTP, FTP, and SSH keyless login.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Implemented the Capacity Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Worked on importing and exporting data into HDFS and Hive using Sqoop.
  • Worked on analyzing data with Hive and Pig.
  • Helped in setting up Rack topology in the cluster.
  • Helped with day-to-day operational support.
  • Performed a minor upgrade from CDH3-u4 to CDH3-u6.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Deployed a Network File System (NFS) mount for NameNode metadata backup.
  • Designed and allocated HDFS quotas for multiple groups.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
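
A minimal sketch of the kind of shell wrapper used to automate the DistCp dumps between clusters; the NameNode addresses, paths, and log file are illustrative:

    #!/bin/bash
    # Nightly copy of the warehouse directory from the primary cluster to the backup cluster
    SRC="hdfs://nn-primary:8020/data/warehouse"
    DST="hdfs://nn-backup:8020/backups/warehouse"
    LOG=/var/log/hdfs-backup.log

    # -update copies only files that are new or changed since the last run
    if hadoop distcp -update "$SRC" "$DST"; then
        echo "$(date): distcp $SRC -> $DST succeeded" >> "$LOG"
    else
        echo "$(date): distcp $SRC -> $DST FAILED" >> "$LOG"
        exit 1
    fi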

Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA MANAGER

Confidential

Linux/MySQL Administrator

Responsibilities:

  • Installation and configuration of Linux for the new build environment.
  • Installed Linux remotely on multiple servers using PXE (Preboot Execution Environment) boot and the Kickstart method.
  • Created virtual servers on a Citrix XenServer host and installed operating systems on the guest servers.
  • Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems on the created partitions.
  • Deep understanding of monitoring and troubleshooting mission-critical Linux machines.
  • Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Ensured data recovery by implementing system and application level backups.
  • Performed various configurations which include networking and IPTable, resolving host names and SSH keyless login.
  • Managed disk file systems, server performance, user creation, file access permissions, and RAID configurations.
  • Automate administration tasks through the use of scripting and Job Scheduling using CRON.
  • Installing and maintaining the Linux servers.
  • Monitoring system metrics and logs for any problems.
  • Running cron jobs to back up data (see the sketch after this list).
  • Adding, removing, or updating user account information, resetting passwords, etc.
  • Used Java JDBC to load data into MySQL.
  • Maintained the MySQL server and managed authentication for users requiring database access.
  • Creating and managing logical volumes.
  • Installing and updating packages using YUM.
  • Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
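
A minimal sketch of the cron-driven backup mentioned above; the schedule, paths, and retention window are illustrative, and credentials are assumed to come from ~/.my.cnf:

    # Crontab entry (crontab -e): run the backup script at 01:30 every night
    30 1 * * * /usr/local/bin/mysql_backup.sh

    # /usr/local/bin/mysql_backup.sh
    #!/bin/bash
    # Dump all databases (credentials read from ~/.my.cnf), compress, and keep 7 days of dumps
    BACKUP_DIR=/var/backups/mysql
    mkdir -p "$BACKUP_DIR"
    mysqldump --all-databases --single-transaction \
        | gzip > "$BACKUP_DIR/all-databases-$(date +%F).sql.gz"
    find "$BACKUP_DIR" -name '*.sql.gz' -mtime +7 -delete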

ENVIRONMENT: MYSQL, PHP, SHELL SCRIPT, APACHE, LINUX.
