Hadoop Admin Resume

Salt Lake City, UT

SUMMARY

  • 7 years of IT experience, including 2.5+ years administering the Hadoop ecosystem and 3+ years of Linux administration.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera, Hortonworks and MapR distributions.
  • Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
  • Involved in capacity planning for the Hadoop cluster in production.
  • Experience in administering Linux systems to deploy Hadoop clusters and in monitoring the clusters using Nagios and Ganglia.
  • Developed MapReduce jobs.
  • Experience in using Hive Query Language for data analytics.
  • Experience in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Architected and implemented automated server provisioning using Puppet.
  • Experience in performing minor and major upgrades.
  • Experience in commissioning and decommissioning DataNodes on Hadoop clusters.
  • Strong knowledge of configuring NameNode High Availability and NameNode Federation.
  • Familiar with writing Oozie workflows and job controllers for job automation (shell, Hive and Sqoop).
  • Familiar with importing and exporting data using Sqoop from RDBMSs such as MySQL, Oracle and Teradata, and with using fast loaders and connectors.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service and managing keytabs using keytab tools.
  • Implemented Knox and Ranger in a Hadoop cluster.
  • Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
  • Worked on disaster management for Hadoop clusters.
  • Built a data transformation framework using MapReduce and Pig.
  • Experience in monitoring and managing an 80-node Hadoop cluster.
  • Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS and OpenStack.
  • Experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Cloudera Manager and Hortonworks Ambari.
  • Supported MapReduce programs running on the cluster.
  • Developed a MapReduce program for parsing information and loading it into HDFS.
  • Managed and reviewed Hadoop log files.
  • Performed stress testing, performance testing and benchmarking of the cluster.
  • Built an ingestion framework using Flume for streaming logs and aggregating the data into HDFS.
  • Hands-on experience with "productionalizing" Hadoop applications, including administration, configuration management, debugging and performance tuning.
  • Worked with the application team via Scrum to provide operational support and install Hadoop updates, patches and version upgrades as required.
  • Knowledge of MapR.
  • Prototyped the proof-of-concept with Hadoop 2.0 (YARN).

TECHNICAL SKILLS

Hadoop Ecosystem Components: HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Flume, ZooKeeper, Splunk

Languages and Technologies: Core Java, C, C++, data structures and algorithms.

Operating Systems: Windows, Linux & Unix.

Scripting Languages: Shell scripting, Puppet.

Networking: TCP/IP Protocol, Switches & Routers, OSI Architecture, HTTP, NTP & NFS.

Databases: SQL and NoSQL (Cassandra).

PROFESSIONAL EXPERIENCE

Confidential, Salt Lake City, UT

Hadoop Admin

Environment: MapReduce, HDFS, Hive, Pig, Flume, Splunk, Sqoop, UNIX Shell Scripting, Nagios, Kerberos, ZooKeeper, Java

Responsibilities:

  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and SSH keyless login.
  • Involved in creating workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Implemented an authentication and authorization service using the Kerberos authentication protocol.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Transferred data to HDFS using Splunk.
  • Tuned the cluster by commissioning and decommissioning DataNodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Performed a major upgrade from CDH4 to CDH 5.2.
  • Deployed high availability on the Hadoop cluster using quorum journal nodes.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Configured Ganglia, which included installing the gmond and gmetad daemons that collect metrics from across the distributed cluster and present them in real-time dynamic web pages to help with debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Deployed Network File System (NFS) for NameNode metadata backup.
  • Assisted the development team.
  • Performed a POC on cluster backup using DistCp, Cloudera Manager BDR and parallel ingestion.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Developed Pig scripts for handling raw data for analysis.
  • Provided cluster coordination services through ZooKeeper.
  • Maintained, audited and built new clusters for testing purposes using Cloudera Manager.
  • Deployed and configured Flume agents to stream log events into HDFS for analysis.
  • Configured Oozie for workflow automation and coordination.
  • Developed MapReduce programs for data analysis and data cleaning.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
  • Wrote custom shell scripts to automate repetitive tasks on the cluster.
  • Involved in loading data from the UNIX file system to HDFS.
  • Defined a time-based Oozie workflow to copy data from different sources to Hive upon availability.

Confidential - Philadelphia, PA

Linux Administrator / Hadoop Admin

Environment: Linux, MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Ganglia, Nagios, Kerberos, Java

Responsibilities:

  • Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
  • Commissioned and decommissioned DataNodes, killed unresponsive TaskTrackers and dealt with blacklisted TaskTrackers.
  • Moved data from HDFS to a MySQL database and vice versa using Sqoop.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Used Ganglia and Nagios to monitor the cluster around the clock.
  • Implemented NFS, NAS and HTTP servers on Linux servers.
  • Created a local YUM repository for installing and updating packages.
  • Copied data from one cluster to another using DistCp, and automated the procedure using shell scripts.
  • Used Flume to push large amounts of data from different sources to HDFS.
  • Designed a shell script for backing up important metadata.
  • Implemented NameNode HA to avoid a single point of failure.
  • Designed the cluster so that only one Secondary NameNode daemon could run at any given time.
  • Implemented NameNode backup using NFS for high availability.
  • Supported data analysts in running MapReduce programs.
  • Worked on analyzing data with Hive and Pig.
  • Experienced in using Splunk.
  • Scheduled data backups using crontab.
  • Involved in creating UDFs in Java.
  • Configured Ganglia, which included installing the gmond and gmetad daemons that collect metrics from across the distributed cluster and present them in real-time dynamic web pages to help with debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Deployed a Sqoop server to perform imports from heterogeneous data sources to HDFS.
  • Designed and allocated HDFS quotas for multiple groups.
  • Configured iptables rules to allow application servers to connect to the cluster, set up the NFS exports list and blocked unwanted ports.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Responsible for managing data coming from different sources.

Confidential

Linux Administrator

Responsibilities:

  • Administered RHEL 4.x and 5.x, including installation, testing, tuning, upgrading, patching and troubleshooting of both physical and virtual server issues.
  • Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5, and migrated servers between ESX hosts and Xen servers.
  • Installed Red Hat Linux using Kickstart and applied security policies to harden the servers in line with company policies.
  • Installed AIX/Linux patches and updates and verified that they were applied to the servers.
  • Installed and administered Red Hat using Xen- and KVM-based hypervisors.
  • Performed RPM and YUM package installations, patching and other server management.
  • Managed routine system backups, scheduled jobs (including enabling and disabling cron jobs), and enabled system logging and network logging of servers for maintenance, performance tuning and testing.
  • Performed data-center operations including rack mounting and cabling.
  • Installed, configured and maintained WebLogic 10.x and Oracle 10g on Solaris and Red Hat Linux.
  • Set up user and group login IDs, printing parameters, network configuration and passwords; resolved permissions issues; and managed user and group quotas.
  • Configured multipath, added SAN storage and created physical volumes, volume groups and logical volumes.
  • Installed and configured Apache and supported it on Linux production servers.
  • Troubleshot Linux network and security-related issues and captured packets using tools such as iptables, firewalls, TCP Wrappers and Nmap.
  • Resolved production issues, documented root cause analyses and updated tickets using BMC Remedy.