Hadoop Administrator Resume
Ada, MI
SUMMARY
- 7+ years of professional IT experience in Hadoop administration using Apache and Cloudera (CDH) distributions, with extensive experience in Linux administration.
- 3+ years of experience in Hadoop administration: planning, installing, and managing clusters.
- Deploying the Cloudera Hadoop cluster using Cloudera Manager.
- Experience in performing minor and major upgrades.
- Good experience with multi-cluster environments and with setting up Hadoop distributions using underlying technologies such as the MapReduce framework, Pig, Oozie, HBase, Hive, Sqoop, Spark, and related Java APIs for capturing, storing, managing, integrating, and analyzing data.
- Experienced with the functionality of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.
- Good knowledge of Hadoop cluster architecture and cluster monitoring, including upgrades and troubleshooting of cluster issues.
- In-depth understanding of Hadoop architecture and components such as HDFS, MapReduce, YARN, and ZooKeeper, with a very good understanding of High Availability (HA) and edit logs.
- Intermediate knowledge of the Puppet automation tool.
- Experienced in defining job flows with Oozie to schedule the jobs.
- Knowledge of NoSQL databases such as HBase, and experience in understanding and managing Hadoop log files.
- Strong knowledge of configuring and maintaining YARN schedulers (Fair and Capacity).
- Experience in configuring Hadoop Security using Kerberos Authentication with Cloudera Manager.
- Knowledge of installing necessary security tools and procedures as a Linux administrator.
- Performing Linux systems administration on production and development servers.
- Experience in UNIX shell/AWK scripting for automation, alerts, and backups.
- Experienced in Linux administration tasks such as IP management.
- Good communication and interpersonal skills to develop and maintain relationships with business partners and team members and to coordinate schedules to meet project needs.
TECHNICAL SKILLS
Operating Systems: Red Hat Linux 6.0, Windows.
Big Data: Hive, Pig, HBase, Sqoop, Mahout, Hadoop components (JobTracker, TaskTracker, ZooKeeper).
Hadoop Distributions: Hortonworks HDP, Cloudera (Cloudera Manager).
Databases: MySQL & NoSQL - Cassandra, Teradata SQL Assistant, SQL Server, MongoDB, Oracle.
Scripting Languages: UNIX shell scripting.
Database Languages: T-SQL, PL/SQL.
PROFESSIONAL EXPERIENCE
Confidential, Ada, MI
Hadoop Administrator
Responsibilities:
- Deployed 100+ node clusters; currently managing 6 clusters across development, test, and production environments.
- Installed and configured commodity hardware to set up Hadoop clusters on CentOS, including a single-node pseudo-distributed Linux cluster on CentOS 6.3/6.4.
- Hadoop cluster deployment and monitoring with Cloudera Manager (versions 4.x & 5.x) and Ganglia.
- Migrated the Hadoop cluster from MapReduce v1 to YARN (MapReduce v2).
- Configured, managed, and tuned HDFS with MapReduce (v1)/YARN (v2) architecture with High Availability.
- Configured, administered, and monitored the HBase, Hive, Oozie, Hue, Tez, and Spark components of the Hadoop ecosystem, along with HDFS and MapReduce (MRv1/YARN).
- Configured HDFS NameNode high availability for Hadoop clusters and NameNode recovery for failures.
- Configured, maintained, and troubleshot clusters of different sizes with Kerberos security in the data center for NextGen Solutions.
- Implemented rack awareness on clusters.
- Monitored servers and troubleshot problems within the required resolution period.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest pharmacy data into HDFS for analysis.
- Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
- Created Hive tables, loaded data into them, and wrote Hive UDFs.
- Created Hive External tables and loaded the data in to tables and query data using HQL.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
- Coordinated root-cause analysis efforts to minimize future system issues.
- Experience in performing backup and recovery of NameNode metadata and data.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Helped team in scheduling jobs and high availability of clusters.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Monitored CPU usage, I/O statistics, and memory usage to identify OS-level performance bottlenecks, enabling database-level tracing for performance tests.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Set up and managed NameNode high availability and NameNode federation using Apache Hadoop 2.0 to avoid single points of failure in large clusters.
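The NameNode HA setup above is typically expressed in hdfs-site.xml. A minimal sketch, with all names illustrative (nameservice `mycluster`, hosts nn1/nn2, and the JournalNode quorum are assumptions; fencing, the failover proxy provider, and the ZooKeeper quorum settings are omitted for brevity):

```xml
<!-- hdfs-site.xml fragment: two NameNodes sharing edits via a JournalNode quorum -->
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
```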
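The rack awareness mentioned above is usually enabled by pointing `net.topology.script.file.name` in core-site.xml at a mapping script that Hadoop invokes with node IPs/hostnames as arguments, expecting one rack path per argument on stdout. A minimal sketch; the subnet-to-rack mapping is purely illustrative:

```shell
#!/bin/sh
# Illustrative rack-topology script. Hadoop passes one or more node
# addresses as arguments and reads one rack path per argument from stdout.
resolve_rack() {
  case "$1" in
    10.0.1.*) echo "/dc1/rack1" ;;
    10.0.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;  # fallback keeps block placement working
  esac
}
for node in "$@"; do
  resolve_rack "$node"
done
```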
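A shell wrapper of the sort described in the ingestion bullets might look like the following sketch. The JDBC URL, table, and HDFS layout are hypothetical, and the function only builds the sqoop command line (a real job would execute the result):

```shell
#!/bin/sh
# Hypothetical daily-ingest helper: builds a sqoop import command for one
# day's partition of one table. Connection string and paths are illustrative.
build_sqoop_import() {
  # $1 = table name, $2 = partition date (YYYY-MM-DD)
  printf '%s\n' "sqoop import --connect jdbc:mysql://dbhost/sales \
--table $1 --where \"load_date='$2'\" \
--target-dir /data/raw/$1/dt=$2 -m 4"
}
# e.g. build the command for a given day's partition:
build_sqoop_import orders "$(date +%Y-%m-%d)"
```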
Environment: Cloudera CDH 4/5, Dell R720/R710 rack servers, MapReduce, YARN, Spark, HDFS, Pig, Hive, HBase, Sqoop, Flume, ZooKeeper and Oozie.
Confidential, Philadelphia, PA
Hadoop Administrator
Responsibilities:
- Working with big data developers and teams to troubleshoot MapReduce job failures and issues with Hive, Pig, and HBase.
- Installing Hadoop ecosystem components (Pig, Hive).
- Working closely with developers to provide support.
- Managed the backup and disaster recovery for Hadoop data.
- Worked in performing minor and major upgrades, commissioning and decommissioning of nodes on a Hadoop cluster.
- Deployed Hadoop Cluster in Fully Distributed and Pseudo-distributed modes.
- Troubleshot issues and provided prompt support to users.
- Commissioned or decommissioned DataNodes from the cluster in case of problems.
- Planned disk space and capacity for the cluster.
- Assisted during cluster maintenance by stopping and starting all services.
- Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
- Created shell scripts to speed up installation and upgrades of the Hadoop environment.
- Worked on analyzing data with Hive and Pig.
- Designed and allocated HDFS quotas for multiple groups.
- Helped in setting up Rack topology in the cluster.
- Part of managing the Hadoop clusters: setup, install, monitor, maintain.
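The HDFS quota allocation mentioned above can be sketched as follows. Paths and limits are illustrative, and a DRY_RUN switch prints the `hdfs dfsadmin` commands instead of running them, so the sketch can be exercised without a cluster:

```shell
#!/bin/sh
# Illustrative quota-allocation helper. With DRY_RUN=1 the hdfs commands are
# echoed rather than executed; a real run would need a live cluster.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }
set_group_quota() {
  # $1 = HDFS directory, $2 = max namespace objects, $3 = max raw space (e.g. 10t)
  run hdfs dfsadmin -setQuota "$2" "$1"       # cap file/directory count
  run hdfs dfsadmin -setSpaceQuota "$3" "$1"  # cap raw disk usage
}
DRY_RUN=1
set_group_quota /user/analytics 100000 10t
```

Quota usage would then be verified with `hadoop fs -count -q <dir>`.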
Environment: Cloudera CDH 4, MapReduce, HDFS, Pig, Hive, Sqoop, Oozie, Ganglia, Nagios.
Confidential, Jacksonville, FL
Linux Administrator
Responsibilities:
- Responsible for providing the desktop system administration and support to the network.
- Ensured the compatibility of system hardware and software and resolved system issues.
- Monitored file systems and CPU load for better performance.
- Analyzed system performance and ensured performance objectives and availability requirements were met.
- Performance and process management using basic Linux commands.
- Installation and configuration of Linux for new build environment.
- Troubleshooting and solving problems related to users, applications, hardware etc.
- User management: creating and managing user accounts, groups, and access levels.
- Set up user and group login IDs, printing parameters, network configuration, and passwords; resolved permission issues; and managed user and group quotas.
- Tech and non-tech refresh of Linux servers, which includes new hardware, OS, upgrade, application installation, testing.
- Managing systems routine backup, scheduling jobs like disabling and enabling cron jobs, enabling system logging, network logging of servers for maintenance, performance tuning, testing.
- Administration, package installation, configuration of Oracle Enterprise Linux 5.x.
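The routine backup scheduling and cron management described above is typically driven by crontab entries. An illustrative fragment (script paths, schedules, and log files are hypothetical):

```shell
# Nightly system backup at 02:30, output appended to a log for review
30 2 * * * /usr/local/sbin/nightly-backup.sh >> /var/log/backup.log 2>&1
# Disabling a job during maintenance is done by commenting it out:
# 0 4 * * 0 /usr/local/sbin/weekly-report.sh
```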
Confidential
Linux Administrator
Responsibilities:
- Creating and cloning Linux virtual machines. Installing Red Hat Linux using Kickstart and applying security policies to harden the server per company policy.
- Administration of RHEL, which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
- RPM and YUM package installations, patch and other server management.
- Creating physical volumes, volume groups, logical volumes.
- Installing and configuring Apache and supporting them on Linux production servers.
- Working 24/7 on call for application and system support.
- Creating, mounting, and unmounting file systems; disk partitioning and troubleshooting.
- Analyzing system performance to ensure performance objectives and availability requirements are met.
- Gathering requirements from customers and business partners and design, implement and provide solutions in building the environment.
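The LVM provisioning steps above (physical volumes, volume groups, logical volumes) correspond to a command sequence like the following fragment. Device names and sizes are illustrative, and the commands require root and real block devices, so this is not meant to be run as-is:

```shell
pvcreate /dev/sdb1                # initialize the partition as a physical volume
vgcreate datavg /dev/sdb1         # create a volume group on top of it
lvcreate -L 50G -n datalv datavg  # carve out a 50 GB logical volume
mkfs.ext4 /dev/datavg/datalv      # put a filesystem on the logical volume
mount /dev/datavg/datalv /data    # mount it (add to /etc/fstab to persist)
```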