Hadoop Administrator Resume
Chicago, IL
PROFESSIONAL SUMMARY:
- Highly skilled Hadoop Administrator with strong abilities in administering large data clusters in big data environments; extremely analytical, with excellent problem-solving skills.
- Over 8 years of experience in the IT industry and a Bachelor's Degree in Computer Science.
- Experience in deploying, configuring, supporting and managing Hadoop clusters of the Cloudera and Hortonworks distributions on the Linux platform.
- Proficient in working with the Hadoop 1.x (HDFS and MapReduce) and Hadoop 2.x (HDFS, MapReduce, YARN) architectures.
- Experience in installing and managing a wide range of big data components such as HDFS, Hive, HBase, Pig, Oozie, Toad, Impala, Hue, Zookeeper, Flume, Kafka, Spark and Storm.
- Experience in implementing NameNode High Availability (HA) and ResourceManager High Availability (HA) for Hadoop clusters, and in designing automatic failover control using Zookeeper and Quorum Journal Nodes.
- Hands-on practice in implementing Hadoop security solutions such as Ranger, Sentry and Kerberos for securing Hadoop clusters and data.
- Thorough knowledge of distributed scheduling to interact with data in multiple ways using YARN.
- Hands-on experience in tuning YARN components to achieve high performance in big data clusters.
- Experienced in monitoring Hadoop cluster job performance and in capacity planning.
- Diligently team with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Collaborate with application teams to install operating system and Hadoop updates, patches and version upgrades when required.
- Align with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
- Experience in building an Apache Kafka cluster and integrating it with Apache Storm for real-time data analysis.
- Hands-on experience in administering Linux systems to deploy Hadoop clusters, and in monitoring them using Nagios, Ganglia, Ambari, Cloudera Manager and Zookeeper.
- Proficient in administering the Hue Web UI to create and manage user spaces in HDFS.
- Solid experience in Pig and Hive administration and development.
- Well versed in importing structured data from RDBMSs such as MySQL, Oracle and Postgres into HDFS and Hive using Sqoop.
- Hands-on experience in deploying and administering large-scale distributed data stores using HBase.
- Familiar with writing Oozie workflows and job controllers for job automation.
- Sound knowledge of NameNode Federation.
- Expertise in rack topology and the HDFS NFS gateway.
- Extensive knowledge of Hadoop application frameworks such as Tez and Spark.
- Thorough knowledge of system administration of RedHat Enterprise Linux and SUSE Linux.
- Good shell scripting ability and solid experience in remote administration, hardware installation and maintenance, software integration and packaging, TCP/IP network administration and web administration.
- Expertise in performance analysis, kernel tuning, troubleshooting and debugging.
PROFESSIONAL EXPERIENCE:
Confidential - Chicago, IL
Hadoop Administrator
Responsibilities:
- Analyzed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to the Hadoop ecosystem.
- Deployed a Hadoop cluster of the Hortonworks distribution and installed the ecosystem components HDFS, YARN, Zookeeper, HBase, Hive, MapReduce, Pig, Kafka, Storm and Spark on Linux servers using Ambari.
- Set up automated 24x7x365 monitoring and escalation infrastructure for the Hadoop cluster using Nagios Core and Ambari.
- Designed and implemented a disaster recovery plan for Hadoop clusters.
- Implemented High Availability and automatic failover infrastructure to overcome the NameNode single point of failure, utilizing Zookeeper services.
- Integrated the Hadoop cluster with Active Directory and enabled Kerberos for authentication.
- Implemented the Capacity Scheduler on the YARN ResourceManager to share cluster resources among users' MapReduce jobs.
- Set up Linux users and tested HDFS, Hive, Pig and MapReduce access for the new users.
- Monitored Hadoop jobs and reviewed the logs of failed jobs to debug issues based on the errors.
- Optimized the Hadoop cluster components HDFS, YARN, Hive and Kafka to achieve high performance.
- Worked with the Linux server admin team in administering the server hardware and operating system.
- Interacted with the networking team to improve bandwidth.
- Provided user, platform and application support on the Hadoop infrastructure.
- Applied patches and bug fixes on the Hadoop cluster.
- Proactively involved in ongoing maintenance, support and improvement of Hadoop clusters.
- Conducted Root Cause Analysis and resolved production problems and data issues.
- Performed disk space management for the users and groups in the cluster.
- Added nodes to the cluster and decommissioned nodes from the cluster whenever required.
- Managed cluster operations remotely.
- Performed backup and recovery processes in order to upgrade the Hadoop stack.
- Used the Sqoop and DistCp utilities for data copying and migration.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs such as MapReduce, Pig, Hive and Sqoop as well as system-specific jobs such as Java programs and shell scripts.
- Installed a Kafka cluster with separate nodes for brokers.
- Performed Kafka operations on a regular basis.
- Monitored cluster stability, used tools to gather statistics and improved performance.
- Used Apache Tez, an extensible framework for building high-performance batch and interactive data processing applications, for Pig and Hive jobs.
- Identified disk space bottlenecks; installed Nagios Log Server and integrated it with the PRD cluster to aggregate service logs from multiple nodes, and created dashboards for important service logs to enable better analysis of historical log data.
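The data-copy utilities mentioned above follow a common pattern; a minimal sketch of the kind of commands involved (NameNode addresses, JDBC URL, table and paths are all hypothetical placeholders, not taken from any actual engagement):

```shell
# Copy a directory between clusters with DistCp
# (source/target NameNode addresses are placeholders)
hadoop distcp hdfs://nn-old:8020/data/events hdfs://nn-new:8020/data/events

# Import an RDBMS table into HDFS with Sqoop
# (connection string, credentials and table are illustrative only)
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/sales/orders \
  --num-mappers 4
```

Both commands require a live Hadoop cluster and a reachable database, so they are shown here only as a sketch of the workflow.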
Environment: RedHat OS 6.7, HDP 2.3, HDFS, MapReduce, Tez, YARN, Pig, Hive, HBase, Sqoop, Flume, Oozie, Zookeeper, Ambari, Nagios Core, Nagios Log Server, Kafka, Spark, Storm, Kerberos, Ranger, Teradata, Oracle, Tidal, Toad.
Confidential, Oaks, PA
Hadoop Admin
Responsibilities:
- Planned hardware and software installation on the production cluster and communicated with multiple teams to get it done.
- Installed and configured a Hadoop cluster of the Cloudera distribution (CDH 4.x) using Cloudera Manager and maintained its integrity.
- Integrated the Hadoop cluster with a Zookeeper cluster and achieved NameNode High Availability.
- Maintained Hadoop ecosystem security by installing and configuring Kerberos.
- Created keytab files on the KDC server for every service in the Cloudera stack.
- Created new user accounts and assigned pools for application usage.
- Provided end-to-end support for DEV and PRD clusters.
- Performed cluster maintenance using Cloudera Manager, adding nodes and decommissioning dead nodes.
- Balanced Hadoop cluster for storage consistency.
- Tuned YARN components to achieve high performance for MapReduce jobs.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Installed and worked on Hadoop ecosystem components Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Managed end-to-end data flow from sources to a NoSQL database (MongoDB) using Oozie.
- Implemented the HDFS snapshot feature.
- Worked with application teams to install Hadoop updates, patches and version upgrades as required.
- Designed, configured and managed the backup and disaster recovery for HDFS data.
- Experienced in production support, solving user incidents ranging from sev1 to sev5.
- Worked with big data developers and designers in troubleshooting MapReduce job failures and issues with Hive, Pig and Flume.
- Performed patching and upgrades of the cluster.
- Migrated structured data from multiple RDBMS servers to the Hadoop platform using Sqoop.
- Familiar with bucketing and partitioning for Hive performance improvement.
- Orchestrated Sqoop scripts, Pig scripts and Hive queries using Oozie workflows and sub-workflows.
- Helped client organization shift more of their IT budgets from capital expense to operational expense.
Environment: RedHat OS 6.x, CDH 4.x, HDFS, MapReduce, YARN, Pig, Hive, HBase, Sqoop, Flume, Impala, Oozie, Zookeeper, MySQL, MongoDB, Cloudera Manager, Nagios, Chef.
Confidential
Linux System Administrator
Responsibilities:
- Installed RedHat Enterprise Linux (RHEL 6) on production servers.
- Provided support to production servers.
- Updated firmware on servers; installed patches and packages for Linux security vulnerabilities.
- Monitored system resources such as network, logs and disk usage.
- Created and maintained user accounts, both local and centralized (LDAP - Sun Identity Manager).
- Performed all duties related to system administration, such as troubleshooting, providing sudo access, modifying DNS entries, NFS, and backup recovery (scripts).
- Set up passwordless login using SSH public-private keys.
- Set up cron jobs for application owners to deploy scripts on production servers.
- Performed sanity checks of the file systems and volume groups.
- Developed shell scripts for internal use to automate regular jobs.
- Completed work requests raised by customers/teams and followed up with them.
- Worked on change requests raised by customers/teams and followed up.
- Performed Root Cause Analysis on problem tickets and frequently occurring incidents.
- Raised cases with vendors when software or hardware needed to be updated, replaced or repaired.
- Raised cases with RedHat and followed up with them as and when required.
- Engaged members of other teams when a ticket required multiple-team support.
- Effectively and efficiently monitored SDM/Remedy queues to ensure no SLA breaches occurred.
- Worked in a 24x7 on-call rotation to support critical production environments.
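Passwordless SSH of the kind described above is typically set up as follows; a minimal sketch (the hostname, account and key path are hypothetical placeholders):

```shell
# Generate an SSH key pair with no passphrase (path is illustrative)
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519_ops

# Install the public key into the remote account's authorized_keys
# ("prodnode01" and "admin" are placeholder values)
ssh-copy-id -i ~/.ssh/id_ed25519_ops.pub admin@prodnode01

# Subsequent logins authenticate with the key instead of a password
ssh -i ~/.ssh/id_ed25519_ops admin@prodnode01
```

The copy and login steps require a reachable remote host, so this is a sketch of the procedure rather than a runnable script.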
Environment: RedHat LINUX Release 5.x, 6.x, SUSE LINUX v10.1, 11, OpenBSD, TCP/IP Wrapper, SSH, SCP, RSYNC, Service Desk Manager, BMC Remedy, Hostinfo, Apache Web Server, Samba Server, Iptables, FTP, DHCP, DNS, NFS, RPM, YUM, LDAP, AutoFS, LAN, WAN, KVM, RedHat Ent Virtualization, Xen, VMware.