Hadoop Admin Resume Salt Lake City, UT - Hire IT People

SUMMARY

7 years of IT experience with 2.5+ years of experience in administering Hadoop Ecosystem and 3 plus years of experience in Linux administering.
Hands on experience in installation, configuration, supporting and managing Hadoop Clusters using Apache, CLOUDERA, HORTONWORKS, MapR distributions.
Strong knowledge on Hadoop HDFS architecture and Map - Reduce framework.
Involved in capacity planning for the Hadoop cluster in production.
Experience in administering the Linux systems to deploy Hadoop cluster and monitoring the cluster using Nagios and Ganglia.
Developed Map Reduce jobs.
Experience in using Hive Query Language for data Analytics.
Experience in performing backup and disaster recovery of Name Node metadata and important sensitive data residing on cluster.
Architected and implemented automated server provisioning using puppet.
Experience in performing minor and major upgrades.
Experience in performing commissioning and decommissioning of data nodes on Hadoop cluster.
Strong knowledge in configuring Name Node High Availability and Name Node Federation.
Familiar with writing Oozie workflows and Job Controllers for job automation - shell, hive, scoop automation.
Familiar with importing and exporting data using Sqoop from RDBMS MySQL, Oracle, Teradata and also using fast loaders and connectors Experience.
Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure- KDC server setup, crating realm /domain, managing principles, generation key tab file each service and managing keytab using keytab tools.
Implemented KNOX, RANGER in Hadoop cluster.
Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
Worked on disaster management with Hadoop cluster.
Built data transform framework using Map Reduce and Pig.
Experience in monitoring and managing 80 node Hadoop cluster.
Experience in deploying Hadoop cluster on Public and Private Cloud Environment like Amazon AWS, OpenStack.
Experience in deploying and managing the multi-node development, testing and production Hadoop cluster with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, HBASE, ZOOKEEPER) using Cloudera Manager and Hortonworks Ambari.
Supported Map Reduce programs running on the cluster.
Developed Map Reduce program for parsing and loading into HDFS information.
Manage and review Hadoop log files.
Done stress and performance testing, benchmark for the cluster.
Built ingestion framework using flume for streaming logs and aggregating the data into HDFS.
Hands-on experience with “Productionalizing” Hadoop applications such as administration, configuration management, debugging and performance tuning.
Worked with application team via scrum to provide operational support, install Hadoop updates, patches and version upgrades as required.
Have knowledge on MapR
Prototyped the proof-of-concept with Hadoop 2.0 (YARN).

TECHNICAL SKILLS

Hadoop Ecosystem Components: HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Flume, Zookeeper, Splunk

Languages and Technologies: Core Java, C, C++, and Data Structures, algorithms.

Operating Systems: Windows, Linux & Unix.

Scripting Languages: Shell scripting, puppet.

Networking: TCP/IP Protocol, Switches & Routers, OSI Architecture, HTTP, NTP & NFS.

Databases: SQL &, NoSQL - Cassandra.

PROFESSIONAL EXPERIENCE

Confidential, Salt Lake City, UT

Hadoop Admin

Environment: Map Reduce, HDFS, Hive, Pig, Flume, Splunk, Sqoop, UNIX Shell Scripting, Nagios, Kerbero, Zookeeper, Java

Responsibilities:

Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
Performed various configurations which Includes, networking and iptable, resolving hostnames, user accounts and file permissions, http, ftp, SSH key less login.
Involved in creating workflow to run multiple hive and Pig Jobs, which run independently with time and data availability.
Developed multiple MapReduce jobs in java for data cleaning and preprocessing.
Implemented authentication and authorization service using Kerberos authentication protocol.
Performed benchmarking on the Hadoop cluster using different bench marking mechanisms.
Transfer of data to hdfs using splunk.
Tuned the cluster by Commissioning and decommissioning the DataNodes.
Implemented Fair scheduler on the job tracker to allocate the fair amount of resources to small jobs.
Upgraded the Hadoop cluster from cdh3 to cdh4.
Major Upgrade from cdh4 to chd 5.2.
Deployed high availability on the Hadoop cluster quorum journal nodes.
Implemented automatic failover zookeeper and zookeeper failover controller.
Configured Ganglia which include installing gmond and gmetad daemons which collects all the metrics running on the distributed cluster and presents them in real-time dynamic web pages which would further help in debugging and maintenance.
Implemented Kerberos for authenticating all the services in Hadoop Cluster.
Deployed Network file system for NameNode Meta data backup.
Involved in helping developers team.
Performed a POC on cluster back using distcp, Cloudera manager BDR and parallel ingestion.
Configured and deployed hive metastore using MySQL and thrift server.
Development of Pig scripts for handling the raw data for analysis.
Cluster co-ordination services through ZooKeeper
Maintained, audited and built new clusters for testing purposes using the Cloudera manager.
Deployed and configured flume agents to stream log events into HDFS for analysis.
Configured Oozie for workflow automation and coordination.
Developed Map Reduce programs for data analysis and data cleaning
Custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
Custom shell scripts for automating redundant tasks on the cluster. involved in loading data from UNIX file system to HDFS.
Defined Oozie workflow based on time to copy the data upon availability from different Sources to Hive.

Confidential - Philadelphia, PA

Linux Administrator / Hadoop Admin

Environment: Linux, Map Reduce, HDFS, Hive, Pig, Sqoop, Flume, Ganglia, Nagios, Kerberos, Java

Responsibilities:

Performed both Major and Minor upgrades to the existing cluster and also rolling back to the previous version.
Implemented Commissioning and Decommissioning of data nodes, killing the unresponsive task tracker and dealing with blacklisted task trackers.
Dumped the data from HDFS to MYSQL database and vice-versa using SQOOP.
Implemented Map Reduce jobs in HIVE by querying the available data.
Used Ganglia and Nagios to monitor the cluster around the clock.
Implemented NFS, NAS and HTTP servers on Linux servers.
Created a local YUM repository for installing and updating packages.
Dumped the data from one cluster to other cluster by using DISTCP, and automated the dumping procedure using shell scripts.
Used Flume to push large amount of data from different source to HDFS.
Designed the shell script for backing up of important metadata.
HA implementation of Name Node to avoid single point of failure.
Designed the cluster so that only one Secondary name node daemon could be run at any given time.
Implemented Name node backup using NFS. This was done for High availability.
Supported Data Analysts in running Map Reduce Programs.
Worked on analyzing data with Hive and Pig.
Experienced in using Splunk
Running cron-tab to back up data.
Involved in creating UDF in java
Configured Ganglia which include installing gmond and gmetad daemons which collects all the metrics running on the distributed cluster and presents them in real-time dynamic web pages which would further help in debugging and maintenance.
Implemented Kerberos for authenticating all the services in Hadoop Cluster.
Deployed Sqoop server to perform imports from heterogeneous data sources to HDFS.
Designed and allocated HDFS quotas for multiple groups.
Configured IPTABLES rules to allow the connection of application servers to the cluster and also setup NFS exports list and blocked unwanted ports.
Configured Flume for efficiently collecting, aggregating and moving large amounts of log Data from Many different sources to the HDFS.
Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
Responsible to manage data coming from different sources.

Confidential

Linux Administrator

Responsibilities:

Administration of RHEL4.x, 5.x which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
Creating, cloning Linux Virtual Machines, templates using VMware Virtual Client 3.5 and migrating servers between ESX hosts, Xen servers.
Installing RedHat Linux using kicks tart and applying security polices for hardening the server based on the company policies.
Installed and verified that all AIX/Linux patches or updates are applied to the servers.
Installing, administering RedHat using Xen, KVM based hypervisors.
RPM and YUM package installations, patch and other server management.
Managing systems routine backup, scheduling jobs like disabling and enabling cron jobs, enabling system logging, network logging of servers for maintenance, performance tuning, testing.
Worked and performed data-center operations including rack mounting, cabling.
Installed, configured, and maintained Weblogic10.x and Oracle10g on Solaris & RedHat Linux.
Set up user and group login ID's, printing parameters, network configuration, password, resolving permissions issues, user and group quota.
Configuring multipath, adding SAN and creating physical volumes, volume groups, logical volumes.
Installing and configuring Apache and supporting them on Linux production servers.
Troubleshooting Linux network, security related issues, capturing packets using tools such as IPtables, firewall, TCP wrappers, NMAP.
Worked on resolving production issues and documenting Root Cause Analysis and updating the tickets using BMC Remedy.

We provide IT Staff Augmentation Services!

Hadoop Admin Resume

Salt Lake City, UT

We'd love your feedback!

Resume Categories

Client Services

Job Seekers

Visa Sponsorship