
Hadoop Administrator Resume


McLean, VA

SUMMARY

  • 8+ years of IT industry experience in administering Linux, managing databases, developing MapReduce applications, and designing, building, and administering large-scale production Hadoop clusters.
  • 3 years of experience in big data technologies: Hadoop HDFS, MapReduce, Tez, Pig, YARN, Hive, Spark, Oozie, Flume, Kafka, Sqoop, ZooKeeper, and the NoSQL databases Cassandra and HBase.
  • Experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Spark, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Hortonworks Ambari.
  • Strong knowledge of the Hadoop HDFS architecture and the MapReduce framework.
  • Experience in administering Linux systems to deploy Hadoop clusters and monitoring clusters using Ambari Metrics.
  • Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
  • Experience in performing minor and major upgrades and in commissioning and decommissioning DataNodes on a Hadoop cluster.
  • Strong knowledge of configuring NameNode High Availability and NameNode Federation.
  • Familiar with writing Oozie workflows and job controllers for automating shell, Hive, and Sqoop jobs.
  • Familiar with importing and exporting data with Sqoop between HDFS and relational databases (MySQL, Oracle, Teradata), including fast loaders and connectors; a representative import/export sketch follows this summary.
  • Experience in using Flume to stream data into HDFS from various sources.
  • Hands-on experience in provisioning and managing multi-tenant Hadoop clusters on public cloud infrastructure: Amazon Web Services (AWS) EC2.
  • Experience in installing and administering a PXE server with Kickstart, setting up FTP, DHCP, and DNS servers, and Logical Volume Management.
  • Experience in configuring and managing NAS (file-level access via NFS) and SAN (block-level access via iSCSI) storage.
  • Experience in storage management including JBOD, RAID levels 1, 5, 6, and 10, logical volumes, volume groups, and partitioning.
  • Exposure to Maven/Ant and Git, along with shell scripting, for build and deployment processes.
  • Experience in maintaining and distributing configuration files using Puppet.
  • Experience in understanding Hadoop security requirements and integrating with a Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service, and managing keytabs with keytab tools.
  • Experience in handling multiple relational databases: MySQL, SQL Server.
  • Familiar with Agile Methodology (SCRUM) and Software Testing.
  • Effective problem-solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines. Ability to learn and use new technologies quickly.
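
For illustration, a minimal sketch of the kind of Sqoop import and export referenced above. The hostname, database, tables, credentials, and HDFS paths are hypothetical placeholders, not details from any actual engagement.

  # Import a table from MySQL into HDFS (placeholder connection details).
  sqoop import \
    --connect jdbc:mysql://db01.example.com:3306/sales \
    --username etl_user -P \
    --table orders \
    --target-dir /data/raw/orders \
    --num-mappers 4 \
    --fields-terminated-by '\t'

  # Export processed results from HDFS back to the RDBMS.
  sqoop export \
    --connect jdbc:mysql://db01.example.com:3306/sales \
    --username etl_user -P \
    --table orders_summary \
    --export-dir /data/processed/orders_summary \
    --num-mappers 4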

TECHNICAL SKILLS

Hadoop Ecosystem: HDFS, Sqoop, Flume, MapReduce, Hive, Pig, Oozie, ZooKeeper

NoSQL Database: HBase, Cassandra

Security: Kerberos

Database: MySQL, SQL Server

Cluster management Tools: Cloudera Manager, Ambari

OS: Linux (CentOS, RHEL), Windows, macOS

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Administrator

Responsibilities:

  • Built and maintained a >14 PB production environment.
  • Installed and maintained 300+ node Hadoop clusters using Hortonworks Hadoop (HDP 2.2, 2.3, 2.5, and 2.6).
  • Monitored workload, job performance and capacity planning using Ambari.
  • Performed both major and minor upgrades to the existing Ambari Hadoop cluster.
  • Integrated Hadoop with Active Directory and enabled Kerberos and Knox for authentication.
  • Applied patches and bug fixes on Hadoop Clusters.
  • Performance tuned and optimized Hadoop clusters to achieve high performance.
  • Implemented the Capacity Scheduler on YARN to share cluster resources among users' MapReduce and Tez jobs.
  • Monitored Hadoop clusters using the Ambari UI.
  • Designed and implemented a disaster recovery plan for the Hadoop cluster.
  • Extensive hands-on experience with Hadoop file system commands for file handling operations.
  • Integrated HiveServer2 with Tableau and Informatica.
  • Provided user and application support on the Hadoop infrastructure.
  • Involved in business requirements gathering and analysis of business use cases.
  • Reviewed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, and Sqoop.
  • Added new hosts and balanced data onto the new DataNodes by running the HDFS Balancer; a representative invocation is sketched after this list.
  • Ran shell scripts through cron jobs to monitor the cluster and alert administrators about failing jobs.
  • Added non-Ambari-managed ETL hosts to the Hadoop environment and deployed their configurations using Puppet.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Deployed and maintained Ambari Views.
  • Fine-tuned the HBase service and HBase jobs.
  • Fine-tuned Hive jobs for better performance.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Fine-tuned Spark jobs for better resource utilization.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Created Hive tables as per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency.
  • Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java map-reduce, Hive and Sqoop as well as system specific jobs.
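
As referenced above, a representative sequence for rebalancing after adding DataNodes; the threshold and bandwidth values are illustrative assumptions rather than the exact settings used.

  # Confirm the newly added DataNodes are live and reporting.
  hdfs dfsadmin -report

  # Optionally raise the per-DataNode balancer bandwidth (bytes/sec) before rebalancing.
  hdfs dfsadmin -setBalancerBandwidth 104857600

  # Rebalance until each DataNode's utilization is within 10% of the cluster average.
  hdfs balancer -threshold 10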

Environment: HDFS, Sqoop, Flume, MapReduce, Hive, Pig, Oozie, ZooKeeper, HBase, Ambari

Confidential, McLean, VA

Hadoop Administrator

Responsibilities:

  • Installed and maintained 100+ node Hadoop clusters using HDP
  • Performed both major and minor upgrades to the existing Ambari Hadoop cluster.
  • Applied patches and bug fixes on Hadoop Clusters.
  • Performance tuned and optimized Hadoop clusters to achieve high performance.
  • Monitored Hadoop clusters using the Ambari UI.
  • Designed and implemented a disaster recovery plan for the Hadoop cluster.
  • Extensive hands-on experience with Hadoop file system commands for file handling operations.
  • Involved in business requirements gathering and analysis of business use cases.
  • Reviewed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, and Sqoop.
  • Added new hosts and balanced data onto the new DataNodes by running the HDFS Balancer.
  • Ran shell scripts through cron jobs to monitor the cluster and alert administrators about failing jobs.
  • Worked with developers to fine-tune Hive jobs for better performance.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Created Hive tables as per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency; a representative DDL sketch follows this list.
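
As noted above, an illustrative sketch of an external, partitioned Hive table with a dynamic-partition load. The table, columns, staging table, and HDFS location are hypothetical placeholders.

  hive -e "
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
      user_id    STRING,
      url        STRING,
      event_time TIMESTAMP
    )
    PARTITIONED BY (event_date STRING)
    STORED AS ORC
    LOCATION '/data/web_logs';

    -- Dynamic-partition insert from a raw staging table.
    INSERT OVERWRITE TABLE web_logs PARTITION (event_date)
    SELECT user_id, url, event_time, to_date(event_time) AS event_date
    FROM web_logs_staging;
  "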

Environment: Hive, HBase, MapReduce, Oozie, Sqoop, MySQL, PL/SQL, Linux, HDP.

Confidential, San Jose, CA

Hadoop Administrator

Responsibilities:

  • Involved in the end-to-end process of Hadoop cluster setup: installation, configuration, and monitoring of the Hadoop cluster.
  • Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
  • Monitored systems and services; handled architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Configured property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on job requirements.
  • Importing and exporting data into HDFS using Sqoop.
  • Experienced in defining job flows with Oozie.
  • Loaded log data directly into HDFS using Flume; a representative agent configuration is sketched after this list.
  • Experienced in managing and reviewing Hadoop log files.
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Installed and configured Sqoop, Flume, and HBase.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • As an administrator, followed standard backup policies to ensure high availability of the cluster.
  • Analyzed system failures, identified root causes, and recommended courses of action; documented system processes and procedures for future reference.
  • Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload, job performance, and capacity planning using Cloudera Manager.
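
As referenced above, a minimal sketch of a Flume agent that tails a web-server log into HDFS, followed by the command to launch it. The agent name, configuration path, log path, and HDFS path are hypothetical placeholders.

  # Contents of a hypothetical /etc/flume/conf/weblog-agent.conf:
  a1.sources  = r1
  a1.channels = c1
  a1.sinks    = k1

  # Tail the web-server access log as the source.
  a1.sources.r1.type     = exec
  a1.sources.r1.command  = tail -F /var/log/httpd/access_log
  a1.sources.r1.channels = c1

  # Buffer events in memory between source and sink.
  a1.channels.c1.type     = memory
  a1.channels.c1.capacity = 10000

  # Write events to date-partitioned HDFS directories.
  a1.sinks.k1.type                   = hdfs
  a1.sinks.k1.channel                = c1
  a1.sinks.k1.hdfs.path              = /data/weblogs/%Y-%m-%d
  a1.sinks.k1.hdfs.fileType          = DataStream
  a1.sinks.k1.hdfs.useLocalTimeStamp = true

  # Launch the agent with that configuration.
  flume-ng agent --name a1 --conf /etc/flume/conf \
    --conf-file /etc/flume/conf/weblog-agent.conf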

Environment: HDFS, Pig, Hive, HBase, Sqoop, Spark, Oozie, Flume, Kafka, AWS, Linux shell scripting.

Confidential, Salt Lake City

Linux Admin/ Hadoop Admin

Responsibilities:

  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and passwordless (key-based) SSH login.
  • Implemented an authentication service using the Kerberos authentication protocol.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Configured master node disks with RAID 1+0.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Tuned the cluster by Commissioning and decommissioning the Data Nodes.
  • Upgraded the Hadoop cluster.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Deployed high availability on the Hadoop cluster using quorum journal nodes.
  • Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.
  • Configured Ganglia, including installing the gmond and gmetad daemons, which collect metrics across the distributed cluster and present them in real-time dynamic web pages that aid debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed a network file system (NFS) mount for NameNode metadata backup.
  • Performed cluster backups using DistCp, Cloudera Manager BDR, and parallel ingestion.
  • Designed and allocated HDFS quotas for multiple groups; representative quota commands are sketched after this list.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Used Hive schemas to create relations in Pig via HCatalog.
  • Developed Pig scripts for handling raw data for analysis.
  • Deployed Sqoop server to perform imports from heterogeneous data sources to HDFS.
  • Deployed and configured flume agents to stream log events into HDFS for analysis.
  • Deployed YARN, which allows multiple applications to run on the cluster.
  • Configured Oozie for workflow automation and coordination.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and cluster status.
  • Wrote custom shell scripts for automating repetitive tasks on the cluster.
  • Worked with BI teams in generating the reports and designing ETL workflows on Pentaho.
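
As noted above, illustrative HDFS quota commands for a group directory; the path and limits are hypothetical placeholders.

  # Cap the number of names (files + directories) under the group's directory.
  hdfs dfsadmin -setQuota 1000000 /user/analytics

  # Cap the raw space consumed (counts all replicas).
  hdfs dfsadmin -setSpaceQuota 50t /user/analytics

  # Verify the quotas and current usage.
  hdfs dfs -count -q -h /user/analytics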

Environment: Linux, HDFS, Sqoop, Flume, MapReduce, Hive, Pig, Oozie, ZooKeeper

Confidential

Linux System Administrator

Responsibilities:

  • Handled day-to-day user access and permissions, and installed and maintained Linux servers.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method, performing remote Linux installations over PXE.
  • Created groups, added user IDs to groups as primary or secondary groups, removed user IDs from groups, and added users to the sudoers file.
  • Monitored system activity, performance, and resource utilization.
  • Used RPM to install, update, verify, query, and erase packages on Linux servers.
  • Extensive use of LVM, creating volume groups and logical volumes; a representative command sequence is sketched after this list.
  • Mounted file systems using AutoFS and configured the fstab file.
  • Performed RPM and YUM package installations, patch and other server management.
  • Performed scheduled backup and necessary restoration.
  • Configured NFS
  • Developed Shell Scripts for automation of daily tasks
  • Setting up cron schedules for backups and monitoring processes
  • Configured Domain Name System (DNS) for hostname to IP resolution
  • Troubleshot and fixed issues at the user, system, and network levels using various tools and utilities; scheduled backup jobs via cron during non-business hours.
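
As referenced above, a representative LVM workflow; the device names, sizes, and mount point are hypothetical placeholders.

  pvcreate /dev/sdb1                  # initialize the physical volume
  vgcreate vg_data /dev/sdb1          # create a volume group on it
  lvcreate -L 50G -n lv_app vg_data   # carve out a 50 GB logical volume
  mkfs.ext4 /dev/vg_data/lv_app       # create a filesystem on the logical volume
  mkdir -p /app
  mount /dev/vg_data/lv_app /app      # mount it
  # Persist the mount across reboots.
  echo '/dev/vg_data/lv_app /app ext4 defaults 0 0' >> /etc/fstab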

Environment: Linux (CentOS/RHEL)
