Sr. Hadoop Administrator Resume
Atlanta, GA
SUMMARY:
- Over 8 years of administration experience, including 3 years with the Hadoop ecosystem, installing and configuring Hadoop ecosystem components in existing clusters.
- Experience in Hadoop administration (HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, HBase) and NoSQL administration.
- Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS, Rackspace, and OpenStack.
- Set up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
- Experience in installing Hadoop clusters using different distributions: Apache Hadoop, Cloudera, and Hortonworks.
- Good experience in understanding clients' Big Data business requirements and translating them into Hadoop-centric solutions.
- Analyzed clients' existing Hadoop infrastructure to identify performance bottlenecks and provided performance tuning accordingly.
- Installed, configured, and maintained HBase.
- Worked with Sqoop to import and export data between databases such as MySQL and Oracle and HDFS/Hive.
- Defined job flows in the Hadoop environment using tools like Oozie for data scrubbing and processing.
- Experience in configuring ZooKeeper to provide cluster coordination services.
- Loaded logs from multiple sources directly into HDFS using tools like Flume.
- Good experience in performing minor and major upgrades.
- Experience in benchmarking and in performing backup and recovery of NameNode metadata and data residing in the cluster.
- Familiar with commissioning and decommissioning of nodes on a Hadoop cluster.
- Adept at configuring NameNode High Availability.
- Worked on disaster recovery with Hadoop clusters.
- Worked with Puppet for application deployment.
- Fair understanding of Splunk.
- Good understanding of Mesos.
- Well experienced in building DHCP, PXE with Kickstart, DNS, and NFS servers and using them to build infrastructure in a Linux environment.
- Experienced in Linux Administration tasks like IP Management (IP Addressing, Subnetting, Ethernet Bonding and Static IP).
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Experience in deploying and managing multi-node development, testing, and production environments.
- Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, and generating a keytab file for each service and managing keytabs using keytab tools (see the sketch after this list).
- Worked on setting up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Good understanding of Big data and distributed technologies.
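A minimal sketch of the keytab workflow referenced above, assuming an MIT Kerberos KDC; the principal, realm, and paths are illustrative:

```sh
# Create a service principal for the NameNode (hostname and realm are hypothetical)
kadmin.local -q "addprinc -randkey nn/namenode01.example.com@EXAMPLE.COM"

# Export the principal's keys into a keytab file for the service to use
kadmin.local -q "xst -k /etc/security/keytabs/nn.service.keytab nn/namenode01.example.com@EXAMPLE.COM"

# Verify the keytab contents
klist -kt /etc/security/keytabs/nn.service.keytab
```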
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Cloudera Manager, Ambari
Security: Kerberos
Scripting Languages: Shell Scripting, Puppet
Monitoring Tools: Nagios, Ganglia, Cloudera Manager, Ambari
Operating Systems: Linux RHEL/Ubuntu/CentOS, Windows (XP/7/8)
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, GA
Sr. Hadoop Administrator
Responsibilities:
- Handled Ingestion from source files and databases.
- Handled data from Oracle GoldenGate (for structured data) and the Beam architecture.
- Handled data ingestion from multiple sources, including databases such as Oracle and MySQL.
- Performed incremental and full-load copies of source databases (see the Sqoop sketch after this list).
- Handled data ingestion from file copies via Connect:Enterprise and mount points.
- Worked on Sqoop and Flume.
- Exported data to Teradata systems.
- Supported an average of 900 jobs per day.
- Worked on Hortonworks HDP 2.2.
- Worked on a Hadoop cluster of 220 nodes.
- Handled 1.5 PB of data.
- Worked on a 20-node QA cluster.
- Worked on a 20-node Dev cluster (non-production environment).
- Configured Kerberos on the cluster.
- Worked on multiple sprints.
- Was assigned individual projects.
- Deployed more than 100 jobs.
- Debugged jobs as and when necessary.
- Worked on Hive, YARN, and MapReduce.
- Set up Oozie workflows (see the sketch after this list).
- Performed OS patching.
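A minimal sketch of the kind of incremental Sqoop import used for the database copies above; the connection string, credentials, table, and check column are illustrative assumptions:

```sh
# Incremental append import from an Oracle source into HDFS
# (hostnames, credentials, and table/column names are hypothetical)
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user --password-file /user/etl/.db_password \
  --table ORDERS \
  --target-dir /data/raw/orders \
  --incremental append \
  --check-column ORDER_ID \
  --last-value 1000000
```

For a full load, the same command without the three `--incremental` options copies the whole table.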
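The Oozie workflows above were typically driven from the command line; a minimal sketch, assuming a hypothetical workflow application already deployed to HDFS:

```sh
# job.properties for a hypothetical workflow (hosts and paths are illustrative)
cat > job.properties <<'EOF'
nameNode=hdfs://namenode01.example.com:8020
jobTracker=rm.example.com:8050
oozie.wf.application.path=${nameNode}/user/etl/workflows/ingest
EOF

# Submit and start the workflow, then check its status by job ID
oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
oozie job -oozie http://oozie.example.com:11000/oozie -info <job-id>
```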
Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, HORTONWORKS.
Confidential, Redwood City, CA
Hadoop Administrator
Responsibilities:
- Handled Hadoop operations across 9 data centers.
- Managed 2500+ nodes.
- Good understanding of the production environment.
- Used GitHub for code management.
- Used LDAP for authentication.
- Developed a tool for importing and scheduling imports of tables and views (calculation views, analytic views, and attribute views) from SAP HANA systems to Hadoop.
- Used JIRA for task completion.
- Used Nagios for alerts.
- Used OpenTSDB for monitoring.
- Used Puppet for code deployment.
- Performed Disk repairs and replacements.
- Monitored and managed HBase clusters.
- Provided optimization at the OS and hardware level as and when necessary.
- Performed OS-level upgrades.
- Performed hardware benchmarking (CPU, disk I/O); see the sketch after this list.
- Configured RAID and kickstarted servers through the IPMI interface.
- Scheduled jobs in Oozie.
- Managed MapReduce and YARN jobs across multiple data centers.
- Monitored master nodes and debugged issues as and when necessary.
- Commissioned and decommissioned nodes.
- Worked closely with the development team to get jobs done.
- Fine-tuned the cluster as and when necessary.
- Worked on Cloudera and Apache distributions.
- Good understanding of the overall Hadoop architecture.
- Worked on setting up high availability and designed automatic failover control using ZooKeeper and quorum journal nodes (see the sketch after this list).
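A minimal sketch of the kind of ad hoc checks used for the hardware benchmarking above; device names and sizes are illustrative:

```sh
# Sequential write throughput on a data disk (writes a 4 GB test file)
dd if=/dev/zero of=/data01/ddtest bs=1M count=4096 conv=fdatasync

# Buffered and cached read speeds for a disk
hdparm -tT /dev/sdb

# Quick CPU check: hashing throughput across all cores
openssl speed -multi "$(nproc)" sha256
```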
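A minimal sketch of the ZooKeeper-based automatic failover setup referenced above, assuming HA is already defined in hdfs-site.xml with illustrative NameNode IDs nn1 and nn2:

```sh
# Initialize the ZooKeeper znode used by the failover controllers (run once)
hdfs zkfc -formatZK

# Start a ZooKeeper failover controller on each NameNode host
hadoop-daemon.sh start zkfc

# Check which NameNode is active, then exercise a manual failover
hdfs haadmin -getServiceState nn1
hdfs haadmin -failover nn1 nn2
```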
Environment: HADOOP HDFS, MAPREDUCE, SAP HANA, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA, APACHE HADOOP.
Confidential, San Diego, CA
Hadoop Administrator
Responsibilities:
- Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
- Worked on 150+ nodes.
- Provided Hadoop, OS, Hardware optimizations.
- Set up the machines with network control, static IPs, disabled firewalls, and swap memory.
- Installed and configured Hortonworks Ambari for easy management of the existing Hadoop cluster.
- Performed an upgrade from HDP 1.3 to HDP 2.2.
- Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
- Performed operating system installation and Hadoop version updates using automation tools.
- Configured Oozie for workflow automation and coordination.
- Implemented rack awareness topology on the Hadoop cluster.
- Analyzed the existing Hadoop cluster to identify performance bottlenecks and provided performance tuning accordingly.
- Imported and exported structured data between different relational databases and HDFS/Hive using Sqoop.
- Configured ZooKeeper to implement node coordination and clustering support.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
- Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
- Worked on developing scripts for benchmarking with TeraSort/TeraGen (see the sketch after this list).
- Implemented Kerberos Security Authentication protocol for existing cluster.
- Good experience in troubleshooting production-level issues in the cluster and its functionality.
- Backed up data on a regular basis to a remote cluster using DistCp (see the sketch after this list).
- Experience in storage management, including JBOD, RAID levels 1, 5, 6, and 10, logical volumes, volume groups, and partitioning.
- Familiar with the Java Virtual Machine (JVM) and multi-threaded processing.
- Regular Commissioning and Decommissioning of nodes depending upon the amount of data.
- Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
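A minimal sketch of the TeraGen/TeraSort benchmarking scripts mentioned above; the row count, paths, and jar location (an HDP default) are illustrative:

```sh
#!/bin/bash
# Generate ~100 GB of input (1e9 rows x 100 bytes), sort it, then validate the output
EXAMPLES_JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar

hadoop jar "$EXAMPLES_JAR" teragen 1000000000 /benchmarks/terasort-input
hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/terasort-input /benchmarks/terasort-output
hadoop jar "$EXAMPLES_JAR" teravalidate /benchmarks/terasort-output /benchmarks/terasort-validate
```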
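A minimal sketch of the DistCp-based remote backup above, with illustrative cluster names and paths:

```sh
# Copy /data from the production cluster to a remote backup cluster,
# updating changed files and deleting files removed from the source
hadoop distcp -update -delete \
  hdfs://prod-nn.example.com:8020/data \
  hdfs://backup-nn.example.com:8020/backups/data
```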
Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, HORTONWORKS AMBARI
Confidential, Inc, OH
Hadoop Administrator
Responsibilities:
- Deployed a Hadoop cluster using CDH3, integrated with Nagios and Ganglia.
- Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
- Performed installation and configuration of a 90-node Hadoop cluster with the Cloudera distribution (CDH3).
- Installed the NameNode, Secondary NameNode, JobTracker, DataNodes, and TaskTrackers.
- Performed benchmarking and analysis using TestDFSIO and TeraSort.
- Implemented commissioning and decommissioning of DataNodes, killing unresponsive TaskTrackers, and dealing with blacklisted TaskTrackers.
- Implemented Rack Awareness for data locality optimization.
- Dumped data from a MySQL database to HDFS and vice versa using Sqoop.
- Used Ganglia and Nagios to monitor the cluster around the clock.
- Created a local YUM repository for installing and updating packages (see the sketch after this list).
- Dumped data from one cluster to another using DistCp and automated the dumping procedure with shell scripts.
- Implemented NameNode backup using NFS.
- Performed various configurations, including networking and iptables, resolving hostnames, user accounts and file permissions, HTTP, FTP, and SSH keyless login.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
- Implemented the Capacity Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Worked on analyzing data with Hive and Pig.
- Helped in setting up rack topology in the cluster.
- Helped with day-to-day operational support.
- Performed a minor upgrade from CDH3u4 to CDH3u6.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented Kerberos for authenticating all the services in Hadoop Cluster.
- Deployed a Network File System for NameNode metadata backup.
- Designed and allocated HDFS quotas for multiple groups (see the sketch after this list).
- Configured and deployed the Hive metastore using MySQL and a Thrift server.
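A minimal sketch of the local YUM repository setup above; the package directory, host, and repo name are illustrative:

```sh
# Build repo metadata over a directory of downloaded RPMs and serve it over HTTP
yum install -y createrepo httpd
createrepo /var/www/html/repo/cdh3

# Point client machines at the local repository (hypothetical host)
cat > /etc/yum.repos.d/local-cdh.repo <<'EOF'
[local-cdh]
name=Local CDH repository
baseurl=http://repohost.example.com/repo/cdh3
enabled=1
gpgcheck=0
EOF
```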
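A minimal sketch of the per-group HDFS quota allocation above, with illustrative paths and limits:

```sh
# Cap the number of names (files + directories) under a group's directory
hdfs dfsadmin -setQuota 1000000 /user/analytics

# Cap raw space at 10 TB (the limit counts all block replicas)
hdfs dfsadmin -setSpaceQuota 10t /user/analytics

# Review current quota usage
hdfs dfs -count -q /user/analytics
```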
Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA MANAGER
Confidential
Linux/MySQL Administrator
Responsibilities:
- Installation and configuration of Linux for new build environment.
- Installed Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers; performed remote installation of Linux using PXE boot.
- Created virtual servers on Citrix XenServer-based hosts and installed operating systems on guest servers.
- Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems on the created partitions.
- Deep understanding of monitoring and troubleshooting mission critical Linux machines.
- Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
- Ensured data recovery by implementing system and application level backups.
- Performed various configurations, including networking and iptables, hostname resolution, and SSH keyless login.
- Managed disk file systems, server performance, user creation and file access permissions, and RAID configurations.
- Automated administration tasks through scripting and job scheduling using cron.
- Installed and maintained the Linux servers.
- Monitored system metrics and logs for any problems.
- Ran crontab jobs to back up data (see the sketch after this list).
- Added, removed, and updated user account information, reset passwords, etc.
- Used Java JDBC to load data into MySQL.
- Maintained the MySQL server and managed authentication for the required database users.
- Created and managed logical volumes (see the LVM sketch after this list).
- Installed and updated packages using YUM.
- Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
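A minimal sketch of the cron-driven backup above; the schedule, script, and paths are hypothetical:

```sh
# /etc/cron.d entry: run the backup script at 02:30 every night as root
echo '30 2 * * * root /usr/local/bin/nightly_backup.sh' > /etc/cron.d/nightly-backup

# nightly_backup.sh: archive application data, then prune copies older than 14 days
tar -czf /backup/app-data-$(date +%F).tar.gz /var/lib/appdata
find /backup -name 'app-data-*.tar.gz' -mtime +14 -delete
```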
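A minimal sketch of the LVM workflow above; device, volume group, and mount names are illustrative:

```sh
# Initialize a disk for LVM, build a volume group, and carve out a logical volume
pvcreate /dev/sdb
vgcreate vg_data /dev/sdb
lvcreate -n lv_mysql -L 100G vg_data

# Create a filesystem on the logical volume and mount it
mkfs.ext4 /dev/vg_data/lv_mysql
mkdir -p /var/lib/mysql
mount /dev/vg_data/lv_mysql /var/lib/mysql
```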
Environment: MYSQL, PHP, SHELL SCRIPT, APACHE, LINUX.