Sr Hadoop Administrator Resume

Alpharetta, GA

SUMMARY:

  • 8+ years of Information Technology experience, with extensive experience in Hadoop administration activities such as installation and configuration of clusters using Apache, Cloudera, Hortonworks, AWS, ECS, and Isilon. Able to understand business and technical requirements quickly; excellent communication skills and work ethic; able to work independently; experience working with clients of all sizes.
  • 5+ years of experience in Hadoop administration.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, YARN, MapReduce, Spark, HBase, Oozie, Hive, Sqoop, Pig, Flume, SmartSense, Storm, Kafka, Ranger, Falcon, and Knox.
  • Worked in an Agile development methodology.
  • Experience in deploying Hadoop clusters on public and private cloud environments (Amazon AWS, ECS, and Isilon) with the Cloudera and Hortonworks distributions.
  • Set up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in setting up high-availability Hadoop clusters.
  • Ability to prepare documents including technical designs, testing strategies, and supporting documents.
  • Hands-on experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Cloudera distributions (CDH3, CDH4, and YARN-based CDH 5.x).
  • Hands-on experience in installing, configuring, supporting, and managing Hadoop clusters using Apache and Hortonworks distributions (HDP 2.2, HDP 2.3).
  • Hadoop cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
  • Good experience designing, configuring, and managing backup and disaster recovery for Hadoop data.
  • Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, and managing principals.
  • Experience commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal performance of the cluster.
  • As an administrator, involved in cluster maintenance, troubleshooting, and monitoring, and followed proper backup and recovery strategies.
  • Experience in HDFS data storage and support for running MapReduce jobs.
  • Installed and configured Hadoop ecosystem components such as Sqoop, Pig, and Hive.
  • Knowledge of HBase and ZooKeeper.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems (a sketch follows this list).
  • Knowledge of the architecture and implementation of Confluent Kafka.
  • Hands-on experience with the Nagios and Ganglia monitoring tools.
  • Scheduled Hadoop/Hive/Sqoop/HBase jobs using Oozie.
  • Configured rack awareness for quick availability and processing of data.
  • Hands-on experience in Linux administration activities.
  • Experience in configuration management tools such as Chef.
  • Good understanding of deploying Hadoop clusters using automated Puppet scripts.
  • Experience in hardware recommendations, performance tuning, and benchmarking.
  • Experience in IP management (IP addressing, subnetting, Ethernet bonding, static IP).
  • Flexible with Unix/Linux and Windows environments, working with operating systems such as CentOS 5/6, Ubuntu 10/11, and Sun.
  • Experience in Linux storage management: configuring RAID levels and logical volumes.
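
A minimal sketch of the Sqoop transfers described above, assuming a MySQL source; the host, database, table names, and HDFS paths are hypothetical placeholders.

    #!/bin/bash
    # Import a table from MySQL into HDFS with 4 parallel mappers,
    # then export processed results back (all names illustrative).
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /user/etl/orders \
      --num-mappers 4

    sqoop export \
      --connect jdbc:mysql://dbhost.example.com/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /user/etl/order_summary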

TECHNICAL SKILLS:

Hadoop Framework: HDFS, MapReduce, Pig, Hive, HBase, Sqoop, ZooKeeper, Ranger, Storm, Kafka, Oozie, Flume, Hue, Knox, Spark

Databases: Oracle 9i/10g, DB2, SQL Server, MySQL

Cloud Environment: AWS, Azure, Isilon

Operating Systems: Linux RHEL/Ubuntu/CentOS, Windows (XP/7/8)

Scripting Languages: Shell scripting

Network Security: Kerberos

Monitoring Tools: Cloudera Manager, Ambari, Nagios, Ganglia, New Relic

Configuration Management: Chef

Containers: Docker and Mesosphere

PROFESSIONAL EXPERIENCE:

Confidential - Alpharetta GA

Sr Hadoop Administrator

Responsibilities:

  • Managed a 150+ node CDH 5.8.2 cluster with 2 petabytes of data using Cloudera Manager 5.8.3 on Linux CentOS 6.5.
  • Installed and configured Cloudera Manager for easy management of the existing Hadoop cluster.
  • Conducted root cause analysis (RCA) to find data issues and resolve production problems.
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Implemented advanced procedures such as text analytics and processing using in-memory computing capabilities such as Spark.
  • Worked on custom Pig loaders and storage classes to work with a variety of data formats such as JSON and XML.
  • Automated the jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows (see the Oozie sketch after this list); also wrote Pig scripts to run ETL jobs on the data in HDFS.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Built a data pipeline using an XML parser and delivered the parsed XML data to consumers.
  • Worked on migrating from Hive actions to Spark SQL using DataFrames.
  • Worked on Hive optimization techniques to improve the performance of long-running jobs.
  • Identified several bugs in CDH 5.8.2 and was the first to report those issues.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on the data.
  • Used Pig as an ETL tool to do transformations, event joins, filtering, and some pre-aggregations.
  • Experience developing custom UDFs in Java to extend Hive and Pig Latin functionality.
  • Experience with JIRA and ServiceNow to track issues on the big data platform.
  • Experienced in managing and reviewing Hadoop log files.
  • Worked with Sqoop to import and export data between HDFS/Hive and databases such as MySQL and Oracle.
  • Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Set up HBase high availability and manually verified it with failover tests.
  • Created queues and allocated cluster resources to prioritize jobs.
  • Experience upgrading the cluster to newer versions: CDH 5.8.2 and CM 5.8.3.
  • Maintained MySQL databases: creation, setting up users, and backing up the cluster metadata databases with cron jobs.
  • Provided technical assistance for configuration, administration and monitoring of Hadoop clusters.
  • Coordinated with technical teams for installation of Hadoop and third-party related applications on systems.
  • Supported technical team members for automation, installation and configuration tasks.
  • Suggested improvement processes for all process automation scripts and tasks.
  • Assisted in the design, development, and architecture of Hadoop and HBase systems.
  • Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
  • Managed and reviewed Hadoop and HBase log files.
  • Built and configured log data loading into HDFS using Flume.
  • Performed Importing and exporting data into HDFS and Hive using Sqoop.
  • Managed cluster coordination services through ZooKeeper.
  • Provisioned, installed, configured, monitored, and maintained HDFS, YARN, HBase, Flume, Sqoop, Oozie, Pig, Hive, Falcon, SmartSense, Storm, Kafka, and Spark.
  • Recovering from node failures and troubleshooting common Hadoop cluster issues.
  • Scripting Hadoop package installation and configuration to support fully automated deployments.
  • Installed Kafka cluster with separate nodes for brokers.
  • Performed Kafka operations on regular basis.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • System/cluster configuration and health check-up.
  • Continuous monitoring and managing the Hadoop cluster through Cloudera manager.
  • Created user accounts and granted users access to the Hadoop cluster.
  • Used Git to version control custom shell scripts.
  • Resolved tickets submitted by users: troubleshooting, documenting, and resolving the errors.
  • Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without any effect on running jobs and data.
  • Experience in using Chef and Docker.
  • Responsible for cluster maintenance, monitoring, troubleshooting, tuning, and commissioning and decommissioning of nodes (a decommissioning sketch also follows this list).
  • Responsible for cluster availability; experienced in on-call support.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Documented system processes and procedures for future reference.
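
A minimal sketch of driving the Oozie workflows mentioned above from the shell; the Oozie URL, job ID, and job.properties file are hypothetical placeholders.

    # Point the Oozie CLI at the server, then submit and monitor a workflow
    export OOZIE_URL=http://oozie.example.com:11000/oozie

    # Run the workflow described by job.properties (the workflow
    # application itself lives in HDFS)
    oozie job -config job.properties -run

    # Check the status of a submitted job (ID illustrative)
    oozie job -info 0000001-170101000000001-oozie-oozi-W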
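
A sketch of the node decommissioning flow, assuming a vanilla-Hadoop exclude file (on CDH this is typically driven through Cloudera Manager); the hostname and exclude-file path are hypothetical placeholders.

    # Add the host to the exclude file referenced by dfs.hosts.exclude
    echo "worker42.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read the include/exclude lists
    hdfs dfsadmin -refreshNodes

    # Watch the node until it reports "Decommissioned"
    hdfs dfsadmin -report | grep -A 1 worker42

    # Rebalance block placement afterwards (10% threshold)
    hdfs balancer -threshold 10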

Environment: CDH 5.8.2, Hadoop 2.5.0, MapReduce 2.0 (YARN), HDFS, Hive 0.13, Hue 3.7.0, Pig 0.14.0, HBase, Spark, Scala, Jenkins, Sonar, C, RDBMS, Oracle 11g/10g, Oozie, Java (JDK 1.6), UNIX, Git, ZooKeeper, Gradle, Python, Tableau.

Confidential - Charlotte, NC

Hadoop Administrator

Responsibilities:

  • Involved in deploying a Hadoop cluster using Hortonworks HDP 2.2 through Ambari, integrated with SiteScope for monitoring and alerting.
  • Launched and set up Hadoop clusters on physical servers, including configuring the different Hadoop components.
  • Created a local YUM repository for installing and updating packages (a repository sketch follows this list).
  • Responsible for building a system that ingests terabytes of data per day into Hadoop from a variety of data sources, providing high storage efficiency and an optimized layout for analytics.
  • Developed data pipelines that ingest data from multiple data sources and process it.
  • Used Sqoop to connect to Oracle, MySQL, SQL Server, and Teradata and move the pivoted data into Hive or HBase tables.
  • Implemented Kerberos authentication infrastructure: KDC server setup, creating the realm/domain, managing principals, generating a keytab file for each service, and managing keytabs with keytab tools (a keytab sketch also follows this list).
  • Worked on SAS migration to Hadoop on Fraud Analytics and provided predictive analysis.
  • Developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
  • Configured Kerberos for authentication, Knox for perimeter security and Ranger for granular access in the cluster.
  • Configured and installed several Hadoop clusters in both physical machines as well as the AWS cloud for POCs.
  • Configured and deployed the Hive metastore using MySQL and a Thrift server.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Extensively used Sqoop to move the data from relational databases to HDFS.
  • Used Flume to move the data from web logs onto HDFS.
  • Used Pig to apply transformations, validations, cleaning, and deduplication to data from raw data sources.
  • Integrated schedulers Tidal and Control-M with the Hadoop clusters to schedule the jobs and dependencies on the cluster.
  • Worked closely with the continuous integration team to set up tools such as GitHub, Jenkins, and Nexus for scheduling automatic deployments of new or existing code.
  • Actively monitored a 320-node Hadoop cluster running the Hortonworks HDP 2.4 distribution.
  • Performed various configurations, including networking and iptables, resolving hostnames, user accounts and file permissions, HTTP, FTP, and SSH keyless login.
  • Worked on performing a minor upgrade from HDP 2.2.2 to HDP 2.2.4.
  • Upgraded the Hadoop cluster from HDP 2.2 to HDP 2.4 and from HDP 2.4 to HDP 2.5.
  • Integrated the BI tool Tableau to run visualizations over the data.
  • Solved hardware-related issues; performed ticket assessment on a daily basis.
  • Automated administration tasks using scripts and scheduled jobs using cron.
  • Provided 24x7 on-call support as part of a scheduled rotation with other team members.
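
A minimal sketch of the local YUM repository mentioned above; the directories, hostname, and repo name are hypothetical placeholders.

    # On the repo server: collect RPMs and generate repo metadata
    mkdir -p /var/www/html/localrepo
    cp /tmp/rpms/*.rpm /var/www/html/localrepo/
    createrepo /var/www/html/localrepo

    # On each client: define the repository
    cat > /etc/yum.repos.d/local.repo <<'EOF'
    [localrepo]
    name=Local package repository
    baseurl=http://repohost.example.com/localrepo
    enabled=1
    gpgcheck=0
    EOF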
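
A sketch of creating a service principal and keytab for the Kerberos setup described above; the realm, hostname, and keytab path are hypothetical placeholders.

    # On the KDC: create a service principal with a random key
    kadmin.local -q "addprinc -randkey hdfs/worker01.example.com@EXAMPLE.COM"

    # Export the key to a keytab file for the service to use
    kadmin.local -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/worker01.example.com@EXAMPLE.COM"

    # Verify the keytab and test authentication with it
    klist -kt /etc/security/keytabs/hdfs.service.keytab
    kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/worker01.example.com@EXAMPLE.COM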

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Ambari, Storm, AWS S3, EC2, Identity Access Management, ZooKeeper, NiFi.

Confidential - Pittsburgh PA

Hadoop Administrator

Responsibilities:

  • Solid understanding of Hadoop HDFS, MapReduce, and other ecosystem projects.
  • Installed and configured the Hadoop cluster.
  • Worked with the Cloudera support team to fine-tune the cluster.
  • Experienced in managing and reviewing Hadoop log files.
  • Created user accounts and granted users access to the Hadoop cluster.
  • Worked with application teams to install operating system and Hadoop updates, patches, version upgrades as required.
  • Advised file system team on optimizing IO for Hadoop / analytics workloads.
  • Imported data from MySQL and Oracle into HDFS using Sqoop.
  • Imported unstructured data into HDFS using Flume.
  • Wrote MapReduce Java programs to analyze log data for large-scale data sets.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Assisted with data capacity planning and node forecasting.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Managed jobs using the Fair Scheduler.
  • Provided cluster coordination services through ZooKeeper.
  • Involved in loading data from the UNIX file system into HDFS (a cron-driven load sketch follows this list).
  • Managed disk file systems, server performance, user creation, granting of file access permissions, and RAID configurations.
  • Automated administration tasks using scripting and scheduled jobs using cron.
  • Managed the day-to-day operations of the cluster for backup and support.
  • Created and managed logical volumes; used Java JDBC to load data into MySQL.
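
A minimal sketch of a cron-driven load from the local UNIX file system into HDFS; the paths and schedule are hypothetical placeholders.

    #!/bin/bash
    # Load yesterday's application logs into a date-partitioned HDFS directory
    DAY=$(date -d yesterday +%Y-%m-%d)
    SRC=/var/log/app/$DAY
    DEST=/data/raw/logs/dt=$DAY

    hdfs dfs -mkdir -p "$DEST"
    hdfs dfs -put "$SRC"/*.log "$DEST"

    # Example crontab entry to run the load at 01:30 every day:
    # 30 1 * * * /opt/scripts/load_logs.sh >> /var/log/load_logs.log 2>&1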

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Spark, Cloudera Manager.

Confidential - Phoenix, AZ

Linux Administrator

Responsibilities:

  • Installation and configuration of Linux for new build environment.
  • Created virtual servers on Citrix XenServer-based hosts and installed operating systems on guest servers.
  • Installed via Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers; performed remote installation of Linux using PXE boot.
  • Software installation, disk partitioning, file system creation, user ID creation, and configuration of Linux.
  • Configured and managed YUM repositories.
  • Disk space management, disk quota management, maintenance of passwd and shadow files, NIS master and client configuration, NFS file system configuration.
  • Worked on Logical Volume Manager (LVM).
  • Installed and configured a 5-node Hadoop cluster.
  • Configured LVM, increased logical volume sizes, and updated the file systems (an LVM sketch follows this list).
  • Restricted file and directory access permissions securely with setuid and setgid, set as per project requirements and data security.
  • iptables configuration and maintenance.
  • Performed various configurations, including networking and iptables, resolving hostnames, and SSH keyless login.
  • User and group management:
  • User creation, modification, and deletion as per requirements.
  • Group ID creation and deletion, and addition of a group for a user.
  • Utilization of dump and restore for file system backup and restoration.
  • Log management using cron jobs.
  • Automated administration tasks through scripting and scheduled jobs using cron.
  • Performance tuning of MySQL engines such as MyISAM and InnoDB.
  • Managed MySQL using MySQL Workbench, Toad for MySQL, and MySQL Administrator.
  • Set up MySQL Cluster on 2-node servers.
  • Performance tuning for high transaction and data volumes in a mission-critical environment.
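
A minimal LVM growth sketch for the logical-volume work described above; the volume group, logical volume, and mount point names are hypothetical placeholders.

    # Extend the logical volume by 10 GB
    lvextend -L +10G /dev/vg_data/lv_mysql

    # Grow the ext4 file system to fill the enlarged volume (online)
    resize2fs /dev/vg_data/lv_mysql

    # Confirm the new sizes
    lvs vg_data
    df -h /var/lib/mysql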

Environment: MySQL, PHP, Shell Script, Apache, Linux.

Confidential

Technical Analyst

Responsibilities:

  • Experience in provisioning/installing Linux operating systems (CentOS, Red Hat, and Ubuntu) on bare metal and in the cloud.
  • Experienced in Linux network administration tasks such as IP management (IP addressing, Ethernet bonding, static IP).
  • Experience in Linux user management tasks such as configuring users, groups, permissions, and access control.
  • Experience in Linux storage management: LVM, partitioning, RAID 0/5/6/10, and NFS servers (a RAID sketch follows this list).
  • Experience in writing shell scripts and cron jobs for automation.
  • Identified and troubleshot defects in the storage component of Windows; administered and troubleshot shared-drive access issues for users.
  • Optimized and troubleshot, on a regular basis, server management tasks such as disk space monitoring (root partition only), disk defragmentation, event log monitoring, file and share permissions, and print management.
  • Created inventory using the ARIS tool.
  • Provided Active Directory support, including transferring FSMO roles to ADCs during server patching and troubleshooting replication issues.
  • Maintained and updated Group Policy; established trusts between external domains and fixed trust-related issues; managed and maintained groups and user accounts.
  • Restored deleted objects in Active Directory using LDP.exe.
  • Managed and maintained DNS and DHCP servers, including scope creation, modification, reservations, backup and restore of DHCP scopes, creating external and internal DNS records, and creating zones.
  • Extended certificates and issued new certificates to all servers from the root CA.
  • Administered Citrix Presentation Server 4.0 and 4.5 on Windows 2000 and 2003 platforms; built Citrix servers and added them into the farms.
  • Performed tasks for publishing customer applications on Citrix farm servers.
  • Managed production servers in accordance with customer standards.
  • Troubleshot and resolved issues arising in Citrix infrastructure servers and from end users. Hands-on experience in different stages of remote support (Knowledge Transfer,
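
A minimal software-RAID sketch with mdadm for the RAID levels listed above; the device names and mount point are hypothetical placeholders.

    # Create a RAID 5 array from three disks
    mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd

    # Watch the initial build, then persist the array configuration
    cat /proc/mdstat
    mdadm --detail --scan >> /etc/mdadm.conf

    # Make a file system and mount it
    mkfs.ext4 /dev/md0
    mount /dev/md0 /data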

Environment: AD, Citrix, RAID 0/5/6/10.
