Hadoop Admin Resume
Atlanta, GA
SUMMARY
- Around 8 years of IT experience, including 3 years administering the Hadoop ecosystem and 4 years of Linux administration; excellent understanding of distributed systems and parallel processing architecture.
- Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera, Hortonworks and MapR distributions.
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Involved in capacity planning for the Hadoop cluster in production.
- Experience in administering Linux systems to deploy Hadoop clusters and monitoring the clusters using Nagios and Ganglia.
- Experience in performing backup and disaster recovery of NameNode metadata and of sensitive data residing on the cluster.
- Architected and implemented automated server provisioning using Puppet.
- Experience in performing minor and major upgrades.
- Experience in commissioning and decommissioning DataNodes on a Hadoop cluster.
- Strong knowledge of configuring NameNode High Availability and NameNode Federation.
- Familiar with writing Oozie workflows and job controllers for automating shell, Hive and Sqoop jobs.
- Familiar with importing and exporting data between RDBMS systems (MySQL, Oracle, Teradata) and HDFS using Sqoop, including fast loaders and connectors (a command sketch follows this summary).
- Implemented Knox and Ranger in the Hadoop cluster.
- Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
- Worked on disaster recovery for the Hadoop cluster.
- Built a data transformation framework using MapReduce and Pig.
- Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS and OpenStack.
- Experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Cloudera Manager and Hortonworks Ambari.
- Supported MapReduce programs running on the cluster.
- Managed and reviewed Hadoop log files.
- Performed stress and performance testing and benchmarking of the cluster.
- Built an ingestion framework using Flume for streaming logs and aggregating the data into HDFS.
- Worked with application teams via Scrum to provide operational support and install Hadoop updates, patches and version upgrades as required.
- Expertise in importing data into and exporting data out of HDFS.
- Prototyped a proof of concept on Hadoop 2.0 (YARN).
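As a minimal illustration of the Sqoop usage referenced above, the commands below move a table between MySQL and HDFS; the connection string, table names and directories are hypothetical placeholders, not actual project values.

    # Import a MySQL table into HDFS (hypothetical host, table and target directory)
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /user/etl/orders \
      --num-mappers 4

    # Export processed results from HDFS back into MySQL
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /user/etl/order_summary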
TECHNICAL SKILLS
Hadoop Ecosystem Components: HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Flume & Zookeeper.
Cluster management tools: OpsCenter, Cloudera Manager, Ambari, Ganglia, Nagios
UNIX Tools: Apache, YUM, RPM
Languages and Technologies: Core Java, C, C++, data structures and algorithms.
Operating Systems: Windows, Linux & UNIX.
Scripting Languages: Shell scripting, Puppet.
Networking: TCP/IP Protocol, Switches & Routers, OSI Architecture, HTTP, NTP & NFS.
Databases: SQL, NoSQL (Cassandra).
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Hadoop Admin
Responsibilities:
- Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
- Performed commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers and dealt with blacklisted TaskTrackers.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Moved data from HDFS to MySQL and vice versa using Sqoop.
- Implemented MapReduce jobs via Hive by querying the available data.
- Used Ganglia and Nagios to monitor the cluster around the clock.
- Set up NFS, NAS and HTTP services on Linux servers.
- Created a local YUM repository for installing and updating packages.
- Copied data from one cluster to another using DistCp and automated the copy procedure using shell scripts (see the sketch after this list).
- Designed shell scripts for backing up important metadata.
- Implemented NameNode High Availability to avoid a single point of failure.
- Implemented NameNode metadata backup over NFS for high availability.
- Supported data analysts in running MapReduce programs.
- Worked on analyzing data with Hive and Pig.
- Scheduled data backups using crontab.
- Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller.
- Configured Ganglia, including installing the gmond and gmetad daemons, which collect metrics from across the distributed cluster and present them in real-time dynamic web pages used for debugging and maintenance.
- Configured Oozie for workflow automation and coordination.
- Implemented Kerberos for authenticating all the services in the Hadoop cluster.
- Maintained, audited and built new clusters for testing purposes using Hortonworks Ambari.
- Deployed Sqoop server to perform imports from heterogeneous data sources to HDFS.
- Designed and allocated HDFS quotas for multiple groups (a command sketch follows the Environment line below).
- Configured iptables rules to allow application servers to connect to the cluster, set up the NFS exports list and blocked unwanted ports.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources into HDFS.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Configured and monitored a test cluster on Amazon Web Services for further testing and gradual migration.
- Responsible for managing data coming from different sources.
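A minimal sketch, under assumed cluster names and paths, of the kind of shell script used to automate the DistCp copies mentioned above; the NameNode addresses, directories and log file are illustrative, not actual project values.

    #!/bin/bash
    # Nightly DistCp of an HDFS directory to a second cluster (hypothetical NameNodes and paths)
    SRC=hdfs://nn1.prod.example.com:8020/data/events
    DST=hdfs://nn2.dr.example.com:8020/data/events
    LOG=/var/log/distcp_events.log
    # -update copies only new or changed files; -p preserves ownership and permissions
    if hadoop distcp -update -p "$SRC" "$DST" >> "$LOG" 2>&1; then
        echo "$(date) distcp OK" >> "$LOG"
    else
        echo "$(date) distcp FAILED" >> "$LOG"
    fi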
Environment: MapReduce, HDFS, Hive, Pig, Flume, Sqoop, UNIX shell scripting, Nagios, Kerberos.
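A minimal sketch of the quota commands behind the HDFS quota allocation above; the directory name and limits are illustrative values only.

    # Cap a group directory at 1,000,000 names and 10 TB of raw space (illustrative values)
    hdfs dfsadmin -setQuota 1000000 /data/marketing
    hdfs dfsadmin -setSpaceQuota 10t /data/marketing
    # Verify the quotas and current usage
    hdfs dfs -count -q -h /data/marketing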
Confidential, Pittsburgh, PA
Hadoop Admin
Responsibilities:
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Performed various configuration tasks, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and passwordless SSH login.
- Implemented authentication and authorization services using the Kerberos authentication protocol.
- Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
- Tuned the cluster by commissioning and decommissioning DataNodes.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Performed a major upgrade from CDH4 to CDH 5.2.
- Deployed high availability on the Hadoop cluster using Quorum Journal Nodes.
- Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller.
- Configured Ganglia, including installing the gmond and gmetad daemons, which collect metrics from across the distributed cluster and present them in real-time dynamic web pages used for debugging and maintenance.
- Implemented Kerberos for authenticating all the services in the Hadoop cluster.
- Deployed a network file system (NFS) mount for NameNode metadata backup.
- Performed a POC on cluster backup using DistCp, Cloudera Manager BDR and parallel ingestion.
- Configured and deployed the Hive metastore using MySQL and the Thrift server.
- Developed Pig scripts for handling raw data for analysis.
- Maintained, audited and built new clusters for testing purposes using Cloudera Manager.
- Deployed and configured Flume agents to stream log events into HDFS for analysis (see the configuration sketch after this list).
- Configured Oozie for workflow automation and coordination.
- Wrote custom monitoring scripts for Nagios to monitor the daemons and cluster status (a sample check follows the Environment line below).
- Wrote custom shell scripts to automate repetitive tasks on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Defined time-based Oozie workflows to copy data from different sources into Hive as it became available.
- Performed stress and performance testing and benchmarking of the cluster.
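A minimal sketch of a single-agent Flume configuration of the kind referenced above, streaming a log file into HDFS; the agent name, log path and HDFS directory are hypothetical.

    # Write a one-agent Flume configuration: exec source -> memory channel -> HDFS sink
    cat > /etc/flume-ng/conf/weblogs.conf <<'EOF'
    a1.sources  = r1
    a1.channels = c1
    a1.sinks    = k1

    # Tail an application log (hypothetical path)
    a1.sources.r1.type     = exec
    a1.sources.r1.command  = tail -F /var/log/app/access.log
    a1.sources.r1.channels = c1

    a1.channels.c1.type     = memory
    a1.channels.c1.capacity = 10000

    # Land events in HDFS, partitioned by date
    a1.sinks.k1.type                   = hdfs
    a1.sinks.k1.channel                = c1
    a1.sinks.k1.hdfs.path              = /data/weblogs/%Y-%m-%d
    a1.sinks.k1.hdfs.fileType          = DataStream
    a1.sinks.k1.hdfs.useLocalTimeStamp = true
    EOF

    # Start the agent
    flume-ng agent --conf /etc/flume-ng/conf --conf-file /etc/flume-ng/conf/weblogs.conf --name a1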
Environment: Linux, MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Ganglia, Nagios, Kerberos.
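A minimal sketch of the style of custom Nagios check mentioned above; the default daemon name is just an example and the script assumes jps is on the PATH.

    #!/bin/bash
    # Simple Nagios plugin: CRITICAL if the given Hadoop daemon JVM is not running on this host
    DAEMON=${1:-NameNode}
    if jps 2>/dev/null | grep -qw "$DAEMON"; then
        echo "OK - $DAEMON is running"
        exit 0
    else
        echo "CRITICAL - $DAEMON is not running"
        exit 2
    fi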
Confidential
Linux Administrator
Responsibilities:
- Installation and configuration of Linux for new build environment.
- Handled day-to-day user access and permissions, and installed and maintained Linux servers.
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems.
- Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method, including remote installation of Linux over PXE.
- Monitored system activity, performance and resource utilization.
- Developed and optimized the physical design of MySQL database systems.
- Automated administration tasks through scripting and job scheduling using cron.
- Deep understanding of monitoring and troubleshooting mission-critical Linux machines.
- Created virtual servers on Citrix XenServer hosts and installed operating systems on guest servers.
- Responsible for maintaining RAID groups and LUN assignments as per agreed design documents; performed system administration tasks such as cron jobs, package installations and patches.
- Made extensive use of LVM, creating volume groups and logical volumes (see the command sketch after this list).
- Performed RPM and YUM package installations, patching and other server management.
- Performed scheduled backup and necessary restoration.
- Performed configuration and troubleshooting of services such as NFS, NIS, NIS+, DHCP, FTP, LDAP and Apache web servers.
- Managed critical bundles and patches on the production servers after successfully navigating through the testing phase in the test environments.
- Managed disk file systems, server performance, user creation, file access permissions and RAID configurations.
- Updated the YUM repository and Red Hat Package Manager (RPM) packages.
- Configured Domain Name System (DNS) for hostname to IP resolution.
- Prepared operational testing scripts for log checks, backup and recovery, and failover.
- Troubleshot and fixed issues at the user, system and network level using various tools and utilities; scheduled backup jobs by implementing cron schedules during non-business hours.
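A minimal sketch of the LVM and cron usage described above; the device name, sizes, mount point and backup script path are illustrative placeholders.

    # Create a volume group and logical volume, then make and mount a filesystem (illustrative device and size)
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb
    lvcreate -L 50G -n lv_app vg_data
    mkfs.ext4 /dev/vg_data/lv_app
    mkdir -p /app && mount /dev/vg_data/lv_app /app

    # Crontab entry: nightly backup at 02:30, outside business hours (hypothetical script)
    30 2 * * * /usr/local/bin/backup_home.sh >> /var/log/backup_home.log 2>&1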
Environment: YUM, RAID, MySQL 5.1.4, PHP, shell scripting, MySQL Workbench, Linux 5.0/5.1.
Confidential
Linux Administrator
Responsibilities:
- Handled day-to-day user access and permissions, and installed and maintained Linux servers.
- Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method, including remote installation of Linux over PXE.
- Monitored system activity, performance and resource utilization.
- Responsible for maintaining RAID groups and LUN assignments as per agreed design documents; performed system administration tasks such as cron jobs, package installations and patches.
- Made extensive use of LVM, creating volume groups and logical volumes.
- Performed RPM and YUM package installations, patching and other server management.
- Performed scheduled backup and necessary restoration.
- Configured Domain Name System (DNS) for hostname to IP resolution.
- Troubleshot and fixed issues at the user, system and network level using various tools and utilities; scheduled backup jobs by implementing cron schedules during non-business hours.
Environment: YUM, RAID, PHP, shell scripting, MySQL Workbench, Linux 5.0/5.1, LVM, DNS.