Hadoop Administrator Resume

Atlanta, GA

SUMMARY

  • A qualified Technocrat and a seasoned professional offering 7+ years of IT experience in Administration and Product Development.
  • 2.5 years of experience in big data technologies: Hadoop HDFS, Hive, Oozie, Flume, HCatalog, Sqoop, ZooKeeper, and the NoSQL stores Cassandra and HBase.
  • Experience with the complete software development life cycle, including design, development, testing and implementation of moderately to highly complex systems.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera, Hortonworks, MapR and Pivotal distributions.
  • Backup configuration and recovery from a NameNode failure.
  • Commissioning and decommissioning nodes on a running Hadoop cluster.
  • Installation of various Hadoop ecosystem components and Hadoop daemons.
  • Installation and configuration of Sqoop and Flume.
  • Experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper) using Cloudera Manager.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating keytab files for each service, and managing keytabs using keytab tools.
  • Set up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and Quorum Journal Nodes.
  • Experience in importing and exporting data between databases such as MySQL and HDFS/Hive using Sqoop.
  • Worked on streaming data into HDFS from web servers using Flume.
  • Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Ganglia.
  • Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Familiar with writing Oozie workflows and job controllers for job automation: shell, Hive and Sqoop automation.
  • Scheduling all Hadoop/Hive/Sqoop/HBase jobs using Oozie.
  • Rack-aware configuration for quick availability and processing of data.
  • Substantial experience in Linux administration activities.

TECHNICAL SKILLS

Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Cloudera Manager.

Security: Kerberos

Programming Languages: Java, C#, C, SQL, JavaScript, HTML

Scripting Languages: Shell Scripting, Puppet

IDE Tools: Eclipse, NetBeans, Visual Studio, Microsoft SQL Server, MS Office

Monitoring Tools: Nagios, Ganglia, Cloudera Manager.

Operating Systems: Linux RHEL/Ubuntu/CentOS, Windows (XP/7/8)

Virtualization technologies: VMware vSphere, Citrix XenServer

PROFESSIONAL EXPERIENCE

Confidential - Atlanta, GA

Hadoop Administrator

Responsibilities:

  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Installed and configured Sqoop, Flume and HBase.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes. Communicated and escalated issues appropriately.
  • As an administrator, followed standard backup policies to ensure high availability of the cluster.
  • Commissioned and decommissioned nodes on a running Hadoop cluster.
  • Performed benchmarking, backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Performed minor upgrades of Hadoop from CDH4u2 to CDH4u7.
  • Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java map-reduce, Hive and Sqoop as well as system specific jobs.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Fine tuning Hive jobs for better performance.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Worked on streaming data into HDFS from web servers using Flume.
  • Implemented custom interceptors for Flume to filter data and defined channel selectors to multiplex the data into different sinks.
  • Used Hive data warehouse tool to analyze the unified historic data in HDFS to identify issues and behavioral patterns.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Responsible for managing data coming from different sources.
  • Created internal and external Hive tables as required, defined with appropriate static and dynamic partitions for efficiency.
  • Implemented UDFs, UDAFs and UDTFs in Java for Hive to handle processing that cannot be done with Hive's built-in functions.
  • Used the RegEx, JSON and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data, and implemented custom Hive UDFs.
  • Implemented Pig UDFs for filtering, loading and storing data.

Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA MANAGER.

Confidential, Dallas, TX

System Admin/ Hadoop Admin

Responsibilities:

  • Responsible for building a cluster for storing 1.5 PB of transactional data.
  • Performed operating-system-level configuration, including DNS resolution, user accounts and file permissions, networking, and passwordless SSH login.
  • Created LVM partitions on Linux Servers and mounted file systems on partitions.
  • Deployed a Hadoop cluster CDH3u4 and its ecosystem components.
  • Used Nagios to monitor the daemons and the cluster status, using custom monitoring scripts.
  • Imported and exported data between RDBMS (Oracle, MySQL) and HDFS using Sqoop.
  • Managed the day-to-day operations of the cluster for backup and support.
  • Performed operating system installation and Hadoop version updates using deployment tools such as Chef and Puppet.
  • Implemented Kerberos on cluster for authenticating all the services.
  • Deployed NFS for NameNode metadata backup.
  • Benchmarked the cluster using tools such as TeraSort and TestDFSIO.
  • Performed a minor upgrade from CDH3u4 to CDH3u6.
  • Performed a major upgrade from CDH3 to CDH4.
  • Implemented the Fair Scheduler to share cluster resources among MapReduce jobs.
  • Configured Ganglia, including the gmond and gmetad daemons, which collect metrics from across the distributed cluster and present them in real-time dynamic web pages to help with debugging and maintenance.
  • Implemented Rack Topology on the Hadoop cluster.
  • Regularly commissioned and decommissioned nodes depending on the data.
  • Configured and monitored a test cluster on Amazon Web Services for further testing and gradual migration.
  • Configured Flume agents to stream log events into HDFS for analysis.
  • Configured Oozie for workflow automation and coordination.
  • Wrote custom shell scripts to automate repetitive tasks on the cluster.
  • Handled day-to-day user access and permissions, and installed and maintained Linux servers.
  • Installed CentOS remotely on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method.
  • Monitored system activity, performance and resource utilization.
  • Responsible for maintaining RAID groups and LUN assignments as per agreed design documents. Performed all system administration tasks such as managing cron jobs and installing packages and patches.
  • Extensive use of LVM, creating Volume Groups, Logical volumes.
  • Performed RPM and YUM package installations, patch and other server management.
  • Configured Domain Name System (DNS) for hostname to IP resolution
  • Troubleshot and fixed issues at the user, system and network levels using various tools and utilities. Scheduled backup jobs with cron to run during non-business hours.

Environment: LINUX, HDFS, MAPREDUCE, KDC, NAGIOS, GANGLIA, OOZIE, SQOOP, CLOUDERA MANAGER

Confidential

Linux System Engineer

Responsibilities:

  • Installation and configuration of Linux for new build environment.
  • Created virtual servers on Citrix XenServer-based hosts and installed operating systems on the guest servers.
  • Configuring NFS, DNS.
  • Updated YUM repositories and Red Hat Package Manager (RPM) packages.
  • Created RPM packages using rpmbuild, verified the newly built packages and distributed them.
  • Configured distributed file systems, administered NFS servers and NFS clients, and edited automount maps as per system/user requirements.
  • Installed, configured and maintained FTP servers, NFS, RPM and Samba.
  • Configured Samba to provide access to Linux shared resources from Windows.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Experience with Linux internals, virtual machines, and open source tools/platforms.
  • Improved system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Ensured data recoverability by implementing system and application level backups.
  • Performed various configurations, including networking and iptables, hostname resolution, and passwordless SSH login.
  • Managed disk file systems, server performance, user creation, file access permissions and RAID configurations.
  • Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
  • Automated administration tasks through scripting and job scheduling with cron.

Environment: LINUX, CITRIX XEN SERVER 5.0
