Hadoop Administrator Resume Atlanta, GA - Hire IT People

SUMMARY

A qualified Technocrat and a seasoned professional offering 7+ years of IT experience in Administration and Product Development.
2.5 years of experience in big data technologies: Hadoop HDFS, Hive, Oozie, Flume, HCatalog, Sqoop, ZooKeeper, and the NoSQL stores Cassandra and HBase.
Experience with the complete software development life cycle, including design, development, testing and implementation of moderately to highly complex systems.
Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera, Hortonworks, MapR and Pivotal distributions.
Backup configuration and recovery from a NameNode failure.
Commissioning and decommissioning nodes on a running Hadoop cluster.
Installation of various Hadoop ecosystem components and Hadoop daemons.
Installation and configuration of Sqoop and Flume.
Experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper) using Cloudera Manager.
Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service and managing keytabs using keytab tools.
Worked on setting up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
Experience in importing data from databases such as MySQL into HDFS and Hive, and exporting it back, using Sqoop.
Worked on streaming data into HDFS from web servers using Flume.
Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Ganglia.
Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
Familiar with writing Oozie workflows and job controllers for job automation: shell, Hive and Sqoop automation.
Scheduling all Hadoop/Hive/Sqoop/HBase jobs using Oozie.
Rack-aware configuration for quick availability and processing of data.
Extensive experience in Linux administration activities.

TECHNICAL SKILLS

Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Cloudera Manager.

Security: Kerberos

Programming Languages: Java, C#, C, SQL, JavaScript, HTML

Scripting Languages: Shell Scripting, Puppet

IDE Tools: Eclipse, NetBeans, Visual Studio, Microsoft SQL Server, MS Office

Monitoring Tools: Nagios, Ganglia, Cloudera Manager.

Operating Systems: Linux RHEL/Ubuntu/CentOS, Windows (XP/7/8)

Virtualization technologies: VMware vSphere, Citrix XenServer

PROFESSIONAL EXPERIENCE

Confidential - Atlanta, GA

Hadoop Administrator

Responsibilities:

Installation of various Hadoop ecosystem components and Hadoop daemons.
Installation and configuration of Sqoop, Flume and HBase.
Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes. Communicated and escalated issues appropriately.
As an administrator, followed standard backup policies to ensure high availability of the cluster.
Commissioning and decommissioning nodes on a running Hadoop cluster.
Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
Experience in minor upgrades of Hadoop from CDH4U2 to CDH4U7.
Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
Effectively used Sqoop to transfer data between databases and HDFS.
Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
Fine tuning Hive jobs for better performance.
Involved in extracting the data from various sources into Hadoop HDFS for processing.
Worked on streaming the data into HDFS from web servers using flume.
Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
Used Hive data warehouse tool to analyze the unified historic data in HDFS to identify issues and behavioral patterns.
Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
Responsible for managing data coming from different sources.
Created Hive tables as per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency.
Implemented UDFs, UDAFs and UDTFs in Java for Hive to handle processing that cannot be performed using Hive built-in functions.
Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data, and implemented custom Hive UDFs.
Implemented Pig UDFs for filtering, loading and storing data.

Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA MANAGER.

Confidential, Dallas, TX

System Admin/ Hadoop Admin

Responsibilities:

Responsible for building a cluster for storing 1.5 PB of transactional data.
Performed operating system level configuration, including DNS resolution, user accounts and file permissions, networking, and passwordless SSH login.
Created LVM partitions on Linux Servers and mounted file systems on partitions.
Deployed a Hadoop cluster CDH3u4 and its ecosystem components.
Used Nagios to monitor the daemons and the cluster status, using custom monitoring scripts.
Imported and exported data between RDBMS (Oracle, MySQL) and HDFS using Sqoop.
Managed the day-to-day operations of the cluster for backup and support.
Performed operating system installation and Hadoop version updates using deployment tools such as Chef and Puppet.
Implemented Kerberos on cluster for authenticating all the services.
Deployed NFS for NameNode metadata backup.
Benchmarked the cluster using tools such as TeraSort and TestDFSIO.
Worked on performing a minor upgrade from CDH3u4 to CDH3u6.
Worked on performing a major upgrade from CDH3 to CDH4.
Implemented the Fair Scheduler to share the resources of the cluster among MapReduce jobs.
Configured Ganglia, including the gmond and gmetad daemons, which collect metrics from across the distributed cluster and present them in real-time dynamic web pages to aid debugging and maintenance.
Implemented Rack Topology on the Hadoop cluster.
Regular Commissioning and Decommissioning of nodes depending upon the data.
Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
Configured Flume agents to stream log events into HDFS for analysis.
Configured Oozie for workflow automation and coordination.
Wrote custom shell scripts to automate repetitive tasks on the cluster.
Handled day-to-day user access and permissions, and installed and maintained Linux servers.
Installed CentOS remotely on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method.
Monitored system activity, performance and resource utilization.
Responsible for maintenance of RAID groups and LUN assignments as per agreed design documents. Performed all system administration tasks such as cron jobs and installing packages and patches.
Extensive use of LVM, creating Volume Groups, Logical volumes.
Performed RPM and YUM package installations, patch and other server management.
Configured Domain Name System (DNS) for hostname to IP resolution
Troubleshot and fixed issues at user, system and network level using various tools and utilities. Scheduled backup jobs with cron to run during non-business hours.

Environment: LINUX, HDFS, MAPREDUCE, KDC, NAGIOS, GANGLIA, OOZIE, SQOOP, CLOUDERA MANAGER

Confidential

Linux System Engineer

Responsibilities:

Installation and configuration of Linux for new build environment.
Created virtual servers on a Citrix XenServer based host and installed operating systems on guest servers.
Configuring NFS, DNS.
Updating the YUM repository and Red Hat Package Manager (RPM).
Created RPM packages using rpmbuild, verified the newly built packages and distributed them.
Configured distributed file systems, administered NFS servers and NFS clients, and edited automount maps as per system and user requirements.
Installation, configuration and maintenance of FTP servers, NFS, RPM and Samba.
Configured Samba to provide access to Linux shared resources from Windows.
Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
Deep understanding of monitoring and troubleshooting mission critical Linux machines.
Experience with Linux internals, virtual machines, and open source tools/platforms.
Improved system performance by working with the development team to analyze, identify and resolve issues quickly.
Ensured data recoverability by implementing system and application level backups.
Performed various configurations, including networking and iptables, hostname resolution, and passwordless SSH login.
Managed disk file systems, server performance, user creation, file access permissions and RAID configurations.
Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
Automated administration tasks through scripting and job scheduling using cron.

Environment: LINUX, CITRIX XEN SERVER 5.0
