
Hadoop Admin Resume


Atlanta, GA

SUMMARY

  • Around 8 years of IT experience, including 3 years administering the Hadoop ecosystem and 4 years of Linux administration, with an excellent understanding of distributed systems and parallel-processing architecture.
  • Hands-on experience in installing, configuring, supporting, and managing Hadoop clusters using the Apache, Cloudera, Hortonworks, and MapR distributions.
  • Strong knowledge of the Hadoop HDFS architecture and the MapReduce framework.
  • Involved in capacity planning for the Hadoop cluster in production.
  • Experience in administering Linux systems to deploy Hadoop clusters and in monitoring the clusters using Nagios and Ganglia.
  • Experience in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Architected and implemented automated server provisioning using Puppet.
  • Experience in performing minor and major upgrades.
  • Experience in performing commissioning and decommissioning of data nodes on Hadoop cluster.
  • Strong knowledge in configuring Name Node High Availability and Name Node Federation.
  • Familiar with writing Oozie workflows and job controllers for job automation covering shell, Hive, and Sqoop actions.
  • Familiar with importing and exporting data using Sqoop between HDFS and RDBMS such as MySQL, Oracle, and Teradata, including fast loaders and connectors (see the Sqoop sketch after this summary).
  • Implemented Knox and Ranger in the Hadoop cluster.
  • Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
  • Worked on disaster management for the Hadoop cluster.
  • Built data transform framework using MapReduce and Pig.
  • Experience in deploying Hadoop clusters in public and private cloud environments such as Amazon AWS and OpenStack.
  • Experience in deploying and managing the multi-node development, testing and production Hadoop cluster with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, HBASE, ZOOKEEPER) using Cloudera Manager and Hortonworks Ambari.
  • Supported MapReduce programs running on the cluster.
  • Managed and reviewed Hadoop log files.
  • Performed stress and performance testing and benchmarking for the cluster.
  • Built an ingestion framework using Flume for streaming logs and aggregating the data into HDFS.
  • Worked with application team via scrum to provide operational support, install Hadoop updates, patches and version upgrades as required.
  • Expertise in importing data into and exporting data out of HDFS.
  • Prototyped the proof-of-concept with Hadoop 2.0 (YARN).
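
A minimal sketch of the Sqoop import/export pattern referenced in the summary above; the database host, credentials, table names, and HDFS paths are illustrative placeholders, not values from any actual engagement.

    # Import a MySQL table into HDFS (hypothetical host/db/table names)
    sqoop import \
      --connect jdbc:mysql://db-host.example.com:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export processed results back to MySQL
    sqoop export \
      --connect jdbc:mysql://db-host.example.com:3306/sales \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/processed/orders_summary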

TECHNICAL SKILLS

Hadoop Ecosystem Components: HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Flume & Zookeeper.

Cluster management tools: OpsCenter, Cloudera Manager, Ambari, Ganglia, Nagios

UNIX TOOLS: Apache, Yum, RPM

Languages and Technologies: Core Java, C, C++, data structures, and algorithms.

Operating Systems: Windows, Linux & UNIX.

Scripting Languages: Shell scripting, Puppet.

Networking: TCP/IP Protocol, Switches & Routers, OSI Architecture, HTTP, NTP & NFS.

Databases: SQL & NoSQL (Cassandra).

PROFESSIONAL EXPERIENCE

Confidential, Atlanta, GA

Hadoop Admin

Responsibilities:

  • Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
  • Commissioned and decommissioned data nodes, killed unresponsive TaskTrackers, and dealt with blacklisted TaskTrackers.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Moved data between HDFS and MySQL databases in both directions using Sqoop.
  • Implemented MapReduce jobs through Hive by querying the available data.
  • Used Ganglia and Nagios to monitor the cluster around the clock.
  • Implemented NFS, NAS and HTTP servers on Linux servers.
  • Created a local YUM repository for installing and updating packages.
  • Copied data from one cluster to another using DistCp and automated the copy procedure with shell scripts (see the DistCp sketch after this list).
  • Designed shell scripts for backing up important metadata.
  • Implemented NameNode High Availability to avoid a single point of failure.
  • Implemented NameNode metadata backup over NFS as part of high availability.
  • Supported Data Analysts in running Map Reduce Programs.
  • Worked on analyzing data with Hive and Pig.
  • Ran crontab jobs to back up data.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Configured Ganglia, installing the gmond and gmetad daemons that collect all metrics from the distributed cluster and present them in real-time dynamic web pages, which helps with debugging and maintenance.
  • Configured Oozie for workflow automation and coordination.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Maintained, audited, and built new clusters for testing purposes using Hortonworks Ambari.
  • Deployed Sqoop server to perform imports from heterogeneous data sources to HDFS.
  • Designed and allocated HDFS quotas for multiple groups.
  • Configured iptables rules to allow application servers to connect to the cluster, set up the NFS exports list, and blocked unwanted ports.
  • Configured Flume for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to HDFS.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Configured and monitored a test cluster on Amazon Web Services for further testing and gradual migration.
  • Responsible for managing data coming from different sources.
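
A minimal sketch of the kind of shell script used to automate the inter-cluster DistCp copies mentioned above; the NameNode addresses, paths, and mail alias are illustrative placeholders.

    #!/bin/bash
    # Nightly copy of a data directory from the primary cluster to the DR cluster
    SRC=hdfs://prod-nn.example.com:8020/data/events
    DST=hdfs://dr-nn.example.com:8020/data/events
    LOG=/var/log/hadoop/distcp_$(date +%F).log

    # -update copies only changed files; -pugp preserves user, group, and permissions
    hadoop distcp -update -pugp "$SRC" "$DST" >> "$LOG" 2>&1

    if [ $? -ne 0 ]; then
        echo "DistCp from $SRC to $DST failed, see $LOG" | mail -s "DistCp failure" hadoop-admins@example.com
    fi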

Environment: Map Reduce, HDFS, Hive, Pig, Flume, Sqoop, UNIX Shell Scripting, Nagios, Kerberos.

Confidential, Pittsburgh, PA

Hadoop Admin

Responsibilities:

  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Performed various configurations including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and SSH keyless login.
  • Implemented authentication and authorization services using the Kerberos authentication protocol.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Tuned the cluster by commissioning and decommissioning DataNodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Performed a major upgrade from CDH4 to CDH 5.2.
  • Deployed high availability on the Hadoop cluster using Quorum Journal Nodes.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Configured Ganglia, installing the gmond and gmetad daemons that collect all metrics from the distributed cluster and present them in real-time dynamic web pages, which helps with debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed a Network File System mount for NameNode metadata backup.
  • Performed a POC on cluster backup using DistCp, Cloudera Manager BDR, and parallel ingestion.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Developed Pig scripts for handling raw data for analysis.
  • Maintained, audited, and built new clusters for testing purposes using Cloudera Manager.
  • Deployed and configured Flume agents to stream log events into HDFS for analysis (see the Flume sketch after this list).
  • Configured Oozie for workflow automation and coordination.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and cluster status.
  • Wrote custom shell scripts for automating repetitive tasks on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Defined time-based Oozie workflows to copy data from different sources to Hive as it becomes available.
  • Performed stress and performance testing and benchmarking for the cluster.
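
A minimal sketch of a Flume agent configuration of the kind described above, tailing an application log into HDFS; the agent name, log path, and HDFS directory are assumptions for illustration only.

    # /etc/flume-ng/conf/logagent.conf -- illustrative agent definition
    logagent.sources  = applog
    logagent.channels = mem
    logagent.sinks    = hdfsout

    # Tail an application log as the source
    logagent.sources.applog.type     = exec
    logagent.sources.applog.command  = tail -F /var/log/app/app.log
    logagent.sources.applog.channels = mem

    # Buffer events in memory
    logagent.channels.mem.type     = memory
    logagent.channels.mem.capacity = 10000

    # Write events to dated HDFS directories
    logagent.sinks.hdfsout.type                   = hdfs
    logagent.sinks.hdfsout.channel                = mem
    logagent.sinks.hdfsout.hdfs.path              = /data/logs/app/%Y-%m-%d
    logagent.sinks.hdfsout.hdfs.fileType          = DataStream
    logagent.sinks.hdfsout.hdfs.useLocalTimeStamp = true

The agent would then be started with something like: flume-ng agent -n logagent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/logagent.conf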

Environment: Linux, Map Reduce, HDFS, Hive, Pig, Sqoop, Flume, Ganglia, Nagios, Kerberos.

Confidential

Linux Administrator

Responsibilities:

  • Installed and configured Linux for the new build environment.
  • Handled day-to-day user access and permissions, and installed and maintained Linux servers.
  • Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method for remote installation of Linux.
  • Monitored system activity, performance, and resource utilization.
  • Developed and optimized the physical design of MySQL database systems.
  • Automated administration tasks through scripting and job scheduling using cron.
  • Deep understanding of monitoring and troubleshooting mission-critical Linux machines.
  • Created virtual servers on Citrix XenServer based hosts and installed operating systems on guest servers.
  • Responsible for maintaining RAID groups and LUN assignments as per the agreed design documents; performed all system administration tasks such as cron jobs, installing packages, and applying patches.
  • Made extensive use of LVM, creating volume groups and logical volumes (see the LVM sketch after this list).
  • Performed RPM and YUM package installations, patching, and other server management.
  • Performed scheduled backup and necessary restoration.
  • Performed configuration and troubleshooting of services like NFS, NIS, NIS+, DHCP, FTP, LDAP, Apache Web servers.
  • Managed critical bundles and patches on the production servers after successfully navigating through the testing phase in the test environments.
  • Managed disk file systems, server performance, user creation, file access permissions, and RAID configurations.
  • Updated the YUM repository and Red Hat Package Manager (RPM) packages.
  • Configured Domain Name System (DNS) for hostname to IP resolution.
  • Prepared operational testing scripts for log checks, backup and recovery, and failover.
  • Troubleshot and fixed issues at the user, system, and network level using various tools and utilities; scheduled backup jobs by implementing cron schedules during non-business hours.
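
A minimal sketch of the LVM volume group and logical volume workflow mentioned above; the device name, volume group name, size, and mount point are placeholders.

    # Initialize the disk for LVM, then create a volume group and a logical volume
    pvcreate /dev/sdb
    vgcreate data_vg /dev/sdb
    lvcreate -L 200G -n data_lv data_vg

    # Create a filesystem, mount it, and make the mount persistent across reboots
    mkfs.ext3 /dev/data_vg/data_lv
    mkdir -p /data
    mount /dev/data_vg/data_lv /data
    echo "/dev/data_vg/data_lv  /data  ext3  defaults  0 0" >> /etc/fstab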

Environment: YUM, RAID, MySQL 5.1.4, PHP, Shell scripting, MySQL Workbench, Linux 5.0/5.1.

Confidential

Linux Administrator

Responsibilities:

  • Handled day-to-day user access and permissions, and installed and maintained Linux servers.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method for remote installation of Linux.
  • Monitored system activity, performance, and resource utilization.
  • Responsible for maintaining RAID groups and LUN assignments as per the agreed design documents; performed all system administration tasks such as cron jobs, installing packages, and applying patches.
  • Made extensive use of LVM, creating volume groups and logical volumes.
  • Performed RPM and YUM package installations, patching, and other server management.
  • Performed scheduled backup and necessary restoration.
  • Configured Domain Name System (DNS) for hostname to IP resolution.
  • Troubleshot and fixed issues at the user, system, and network level using various tools and utilities; scheduled backup jobs by implementing cron schedules during non-business hours (see the cron sketch after this list).
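
A minimal sketch of the kind of cron-scheduled backup mentioned above; the script path, archive targets, and schedule are illustrative assumptions.

    #!/bin/bash
    # /usr/local/bin/nightly_backup.sh (illustrative): archive /etc and /home nightly
    tar -czf /backup/etc_$(date +%F).tar.gz /etc
    tar -czf /backup/home_$(date +%F).tar.gz /home

    # Crontab entry (crontab -e) to run it at 02:30, outside business hours:
    # 30 2 * * * /usr/local/bin/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1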

Environment: YUM, RAID, PHP, Shell scripting, MySQL Workbench, Linux 5.0/5.1, LVM, DNS.
