Hadoop Administrator Resume
Atlanta, Georgia
PROFESSIONAL SUMMARY:
- Nine years of IT experience, including 4 years with the Hadoop ecosystem (installation and configuration of Hadoop ecosystem components in existing clusters) and 6 years of IBM Tivoli Storage Manager administration on Linux/AIX/Windows platforms.
- Experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Pig, Spark, Sqoop, Oozie, Flume, Ranger, Knox, HBase, ZooKeeper) using Apache Ambari.
- Experience with Hortonworks and Cloudera Manager; strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Experience in improving Hadoop cluster performance by tuning the OS kernel, storage, networking, HDFS, and MapReduce configuration parameters.
- Experience in administering the Linux systems to deploy Hadoop cluster and monitoring the cluster using Ambari.
- Experience in performing both minor and major version upgrades of Hadoop clusters.
- Experience with PODIUM for creating connections from upstream sources to the Hadoop cluster.
- Experience in using Zookeeper for coordinating the distributed applications.
- Experience in managing Hadoop infrastructure tasks such as commissioning and decommissioning nodes, log rotation, and rack topology implementation.
- Experience in managing cluster resources by implementing the Fair and Capacity Schedulers.
- Experience in scheduling jobs using OOZIE workflow.
- Scheduling jobs using crontab (see the sketch after this list).
- Experience in benchmarking, and in performing backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
- Strong knowledge in configuring Name Node High Availability.
- Experience in configuring Hadoop Security (Ranger and Knox gateway).
- Experience in handling multiple relational databases: MySQL, SQL Server.
- Assisted Developers with problem resolution.
- Ability to play a key role in the team and communicate effectively across teams.
- Global Service Delivery experience by bringing together resources to accomplish organizational goals using ITIL framework.
- Effective problem-solving skills and outstanding interpersonal skills. Ability to work independently as well as within a team environment. Driven to meet deadlines. Ability to learn and use new technologies quickly.
- Worked on setting up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Authorized to work in the US for any employer.
- Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Ambari.
- Experienced in Linux Administration and TSM Administration
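As an illustration of the crontab-based scheduling mentioned above, a minimal sketch; the script path, schedule, and log location are hypothetical:

    # Run a nightly cleanup script at 01:30 every day; install with
    # `crontab -e` and verify with `crontab -l`.
    30 1 * * * /opt/scripts/hdfs_cleanup.sh >> /var/log/hdfs_cleanup.log 2>&1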
TECHNICAL SKILLS:
Hadoop Ecosystems: HDFS, Hive, Sqoop, Spark, Splunk, ZooKeeper, HBase, Oozie, Kerberos, Ranger, Druid.
Operating System: Windows, Linux, AIX, Ubuntu, AWS
RDBMS: MySQL, Oracle, DB2, MS SQL Server, Teradata.
Languages: C, C++, Java, shell scripting, Python, SQL
Other Tools: PODIUM, SVN, 1Automation, ServiceNow, VPN, WinSCP, PuTTY, Edit++, and Notepad++.
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, Georgia
Hadoop Administrator
Responsibilities:
- Troubleshooting failed Hadoop jobs.
- Worked on performance tuning of HBase and Hive.
- Resolving users' Zeppelin access issues.
- Assisting developers with DDL deployments.
- Monitoring HBase and HIVE jobs.
- Worked on the JDK upgrade from 1.7 to 1.8.
- Worked on the Ambari upgrade from 2.5.0.3 to 2.6.2.2.
- Worked on the HDP upgrade from 2.6.0 to 2.6.5.
- Worked on Phoenix/HBase performance issues.
- Configured Druid in the staging environment.
- Created secondary indexes as part of HBase performance tuning.
- Performed HBase region merges.
- Copied Hadoop data from GCP to HDFS.
- Providing permissions to developers for services like Hive and HBase.
- Performed various Ambari and HDP upgrades.
- Worked on different Hadoop ecosystem tools like HDFS, YARN, Hive, Spark, and HBase.
- Improved HBase and YARN performance by fine-tuning parameters.
- Strong knowledge of and experience with MapReduce, Spark Streaming, and Spark SQL for data processing.
- Optimized MapReduce code by writing Pig Latin scripts.
- Imported data from external tables into Hive using scripts.
- Created Hive tables and used static and dynamic partitioning for data slicing (see the sketch after this list).
- Working experience with monitoring the cluster, identifying risks, and establishing good practices for a shared environment.
- Good understanding of cluster configurations and resource management using YARN.
- Worked on tuning the performance of MapReduce jobs.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Experience in managing and reviewing Hadoop log files.
- Job management using the Fair Scheduler.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
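A minimal sketch of the static/dynamic Hive partitioning mentioned above, run through the Hive CLI; the table and column names are illustrative only:

    # Create a partitioned table and load it with dynamic partitioning
    # enabled (hypothetical sales_part and sales_staging tables).
    hive -e "
      CREATE TABLE IF NOT EXISTS sales_part (id INT, amount DOUBLE)
      PARTITIONED BY (sale_date STRING);
      SET hive.exec.dynamic.partition=true;
      SET hive.exec.dynamic.partition.mode=nonstrict;
      INSERT OVERWRITE TABLE sales_part PARTITION (sale_date)
      SELECT id, amount, sale_date FROM sales_staging;
    "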
Confidential, Charlotte, NC
Hadoop Administrator
Responsibilities:
- Participating in code deployments and coordinating with developers.
- Creating connections using PODIUM between sources (SFTP, JDBC) and Hadoop.
- Worked on Hadoop cluster maintenance activities like Hive restarts.
- Working on creating Hive tables and views on the Hadoop cluster.
- Working on 1Automation to schedule and monitor Hadoop jobs.
- Creating SVN tags to promote code from Hadoop development to QA and production.
- Troubleshooting issues with jobs and coordinating with the dev team to fix them.
- Participating in calls with the onshore team to update the status of jobs and the cluster.
- Worked on setting up security like Kerberos, Ranger and Knox.
- Arranged the required Nodes for building the cluster.
- Fine-tuned the CentOS operating system to accommodate installation of the Hadoop distribution.
- Planning of nodes for setting up various daemons.
- Configured ZooKeeper to implement node coordination and clustering support.
- Rebalancing the Hadoop cluster (see the sketch after this list).
- Monitoring the Hadoop cluster from Ambari.
- Working on code deployment from Hadoop production to the Teradata UAT environment.
- Commissioning new Hadoop nodes.
- Monitoring Hadoop Logs for Job Failures.
- Working on Hive performance tuning.
- Involved in testing HDFS, Hive, Pig and Map Reduce access for the new users.
- Monitoring Hadoop Ambari cluster and jobs.
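A minimal sketch of the HDFS rebalancing mentioned above; the 10% threshold is a judgment call, not a value from this engagement:

    # Spread blocks so no DataNode deviates more than 10% from the
    # cluster-average utilization; run as the hdfs superuser.
    sudo -u hdfs hdfs balancer -threshold 10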
Confidential, Chicago, IL
Hadoop Administrator
Responsibilities:
- Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Installed and configured multi-node, fully distributed Hadoop clusters with a large number of nodes.
- Addressing and Troubleshooting issues on a daily basis.
- File system management and monitoring.
- Provided Hadoop, OS, Hardware optimizations.
- Worked with popular Hadoop distributions like Hortonworks.
- Worked independently with Hortonworks support for any issue/concerns with Hadoop cluster.
- Implementing Hadoop Security on Hortonworks Cluster.
- Installed and configured Hadoop ecosystem components like Map Reduce, Hive, Pig, Sqoop, HBase, Zookeeper and Oozie.
- Creating snapshots and restoring snapshots.
- Good experience troubleshooting production-level issues in the cluster and its functionality.
- Backed up data on a regular basis to a remote cluster using DistCp (see the sketch after this list).
- Regular commissioning and decommissioning of nodes depending upon the amount of data.
- Maintaining the cluster so it remains healthy and in optimal working condition.
- Handled upgrades and patch updates.
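A minimal sketch of the DistCp backup mentioned above; the NameNode addresses and paths are hypothetical:

    # Incrementally copy the warehouse directory to a remote DR cluster;
    # -update copies only changed files, -p preserves file attributes.
    hadoop distcp -update -p \
        hdfs://prod-nn:8020/data/warehouse \
        hdfs://dr-nn:8020/backup/warehouse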
Confidential, Charlotte, NC
Hadoop Administrator
Responsibilities:
- Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Installed and configured multi-node, fully distributed Hadoop clusters with a large number of nodes.
- Addressing and Troubleshooting issues on a daily basis.
- File system management and monitoring.
- Installed and configured Hadoop ecosystem components like Map Reduce, Hive, Pig, Sqoop, HBase, Zookeeper and Oozie.
- Involved in testing HDFS, Hive, Pig and Map Reduce access for the new users.
- Cluster maintenance as well as creation and removal of nodes using Apache Ambari.
- Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
- Implemented the Capacity Scheduler to allocate a fair share of resources to small jobs.
- Performed operating system installation and Hadoop version updates using automation tools.
- Configured Oozie for workflow automation and coordination.
- Implemented rack-aware topology on the Hadoop cluster.
- Importing and exporting structured data from different relational databases into HDFS and Hive using Sqoop (see the sketch after this list).
- Configured ZooKeeper to implement node coordination and clustering support.
- Rebalancing the Hadoop cluster.
- Allocating name and space quotas to users in case of space problems.
- Installed and configured Hadoop security tools (Knox, Ranger) and enabled Kerberos.
- Managing cluster performance issues.
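A minimal sketch of the Sqoop import mentioned above; the connection string, credentials, and table names are hypothetical:

    # Import an RDBMS table into Hive with 4 parallel mappers;
    # -P prompts for the password instead of putting it on the command line.
    sqoop import \
        --connect jdbc:mysql://dbhost:3306/sales \
        --username etl_user -P \
        --table orders \
        --hive-import --hive-table default.orders \
        --num-mappers 4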
Environment: Hortonworks (HDP 2.5), Ambari 2.4, HDFS, Java, Shell Scripting, Splunk, Python, Hive, Spark, Sqoop, Linux, SQL, Cloudera, Zookeeper, AWS, HBase, Oozie, Kerberos, Ranger
Confidential
Hadoop Administrator
Responsibilities:
- Handle the installation and configuration of a Hadoop cluster.
- Build and maintain scalable data infrastructure using the Hadoop ecosystem and other open-source components like Hive and HBase.
- Monitor the data streaming between web sources and HDFS.
- Close monitoring and analysis of MapReduce job executions on the cluster at the task level.
- Provided input to development regarding efficient utilization of resources such as memory and CPU, based on the running statistics of map and reduce tasks.
- Changed cluster configuration properties based on the volume of data being processed and the performance of the cluster.
- Setting up Identity, Authentication and Authorization.
- Maintaining the cluster so it remains healthy and in optimal working condition.
- Handled upgrades and patch updates.
- Set up automated processes to analyze system and Hadoop log files for predefined errors and send alerts to appropriate groups (see the sketch after this list).
- Balancing HDFS manually to decrease network utilization and increase job performance.
- Commissioned and decommissioned DataNodes from the cluster in case of problems.
- Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Held regular discussions with other technical teams regarding upgrades, process changes, special processing, and feedback.
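A minimal sketch of the automated log analysis mentioned above, suitable for a cron schedule; the error patterns, log paths, and alert address are hypothetical:

    #!/bin/bash
    # Scan Hadoop logs for predefined errors and mail the on-call group
    # only when something matches.
    PATTERNS='OutOfMemoryError|Connection refused|Corrupt block'
    if grep -hE "$PATTERNS" /var/log/hadoop/hdfs/*.log > /tmp/hadoop_errs.txt; then
        mail -s "Hadoop log alert: $(hostname)" oncall@example.com < /tmp/hadoop_errs.txt
    fi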
Environment: Hortonworks (HDP 2.2), HDFS, Hive, Spark, Sqoop, SQL, Splunk, Linux, ZooKeeper, HBase, Oozie.
Confidential
Tivoli Administrator
Responsibilities:
- Support customer accounts on Backup & Storage technologies.
- Planning TSM backups with required retention periods; defining policy domains and management classes accordingly; binding client data to the required management class so it is stored in predefined storage pools (disk and sequential); and copying data from primary storage pools to copy storage pools for offsite storage.
- Configuring TSM operations like expiration, migration, reclamation, collocation, and media management.
- Defining and configuring client and administrative schedules and checking their status.
- Checking error reports and performing health checks on the servers; troubleshooting any recovery log, database, or storage pool issues based on criticality (see the sketch after this list).
- Ensure that all backup server and tape library hardware and software are maintained at current levels, including system firmware, and that all critical hardware and corresponding software is placed on service/maintenance contracts.
- Documentation of infrastructure, software, systems configuration, process and policies.
- Detect/diagnose and resolve hardware issues (server, tape library, etc.) and interface with vendors and manufacturers for hardware and software as necessary.
- Creating daily and monthly backup status reports for all customer accounts.
- Working on EMC Avamar: installation, backup, restore, and configuring policies and schedules.
- Working on Data Domain 990: creating and managing NFS and CIFS shares for backup and troubleshooting issues.
- Working on DD OS code upgrades as suggested by the vendor.
- Basic knowledge of EMC SAN: creating new file systems, exporting NFS and CIFS shares, and providing required access.
- Basic knowledge of Symantec NetBackup appliances: configuring backups, creating new policies, and working on master and media server issues.
- Monitoring the Hadoop Clusters.
- Worked on basic Hadoop failures.
- Supported Architecture team in building of Hadoop Clusters.
- Worked on BMC Remedy for Hadoop Failures.
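A minimal sketch of the daily TSM health checks mentioned above, using the administrative command-line client; the admin ID and password are placeholders:

    # Query database, storage pool, and schedule status through dsmadmc;
    # -dataonly=yes suppresses headers so the output is easy to parse.
    dsmadmc -id=admin -password=secret -dataonly=yes "query db"      > /tmp/tsm_db.txt
    dsmadmc -id=admin -password=secret -dataonly=yes "query stgpool" > /tmp/tsm_stg.txt
    dsmadmc -id=admin -password=secret -dataonly=yes \
        "query event * * begindate=today-1" > /tmp/tsm_events.txt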
Confidential
Tivoli Administrator
Responsibilities:
- Planning TSM backups with required retention periods; defining policy domains and management classes accordingly; binding client data to the required management class so it is stored in predefined storage pools (disk and sequential); and copying data from primary storage pools to copy storage pools for offsite storage.
- Configuring TSM operations like expiration, migration, reclamation, collocation, and media management.
- Defining and configuring client and administrative schedules and checking their status.
- Checking error reports and performing health checks on the servers; troubleshooting any recovery log, database, or storage pool issues based on criticality.
- Ensure that all backup server and tape library hardware and software are maintained at current levels, including system firmware, and that all critical hardware and corresponding software is placed on service/maintenance contracts.
- Documentation of infrastructure, software, systems configuration, process and policies.
- Detect/diagnose and resolve hardware issues (server, tape library, etc.) and interface with vendors and manufacturers for hardware and software as necessary.
- Working on EMC Avamar: installation, backup, restore, and configuring policies and schedules.
- Working on Data Domain 990: creating and managing NFS and CIFS shares for backup and troubleshooting issues.
- Working on DD OS code upgrades as suggested by the vendor.
- Basic knowledge of EMC SAN: creating new file systems, exporting NFS and CIFS shares, and providing required access.
- Basic knowledge of Symantec NetBackup appliances: configuring backups, creating new policies, and working on master and media server issues.
- Provided Linux technical support and prepared technical documentation for decommissioning.
- Regularly backing up critical data, restoring backed-up data, and verifying servers.
- Responsible for day-to-day administration of Linux systems and middleware. Administered RHEL 5, 6, and CentOS: installing, testing, tuning, upgrading, troubleshooting server issues, and loading patches.
- Performed data management activities inside the data warehouse.
- Responsible for configuring HP Insight Control tools featuring HP Systems Insight Manager and HP Rapid Deployment tools.
- Configuring libraries and library paths, drives and drive paths, and device classes.
- Auditing the library whenever there is a mismatch between the library inventory and the backup inventory.
- Health checking of file systems (see the sketch after this list).
- Maintaining the minimum required amount of free file system space.
- Coordinating with Backup & Storage team for any Backup Failures.
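A minimal sketch of the file system health checks mentioned above; the 85% threshold and alert address are hypothetical:

    #!/bin/bash
    # Report any file system above 85% usage to the storage team.
    df -hP | awk 'NR>1 && int($5) > 85 {print $6, $5}' > /tmp/fs_alert.txt
    if [ -s /tmp/fs_alert.txt ]; then
        mail -s "FS space alert: $(hostname)" storage-team@example.com < /tmp/fs_alert.txt
    fi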
Confidential
Tivoli Administrator
Responsibilities:
- Performing health status checks on TSM server and rectifying errors on the same.
- Extending TSM database and storage pools whenever required.
- Creating/modifying policy domains, storage pools and management class on request.
- Creating new nodes and associating them to backup schedule.
- Backup status checking and troubleshooting client backup failures.
- Installed and Configured Red Hat Linux servers.
- Worked on more than 600 Linux servers to install and configure applications.
- Decommissioned around 300 Red Hat Linux servers.
- Actively engaged in power maintenance and network maintenance calls, responsible for fixing issues on Red Hat Linux and Solaris servers.
- Performed V2V migrations of Red Hat Linux servers from old IP addresses to new IP addresses using VMware.
- Installation, configuration, and operating system upgrades on Red Hat Linux 5 and 6.
- Installed and configured web services and investigated configuration changes in the production environment.
- Launching Amazon EC2 cloud instances using Amazon Machine Images (Linux/Ubuntu) and configuring launched instances for specific applications (see the sketch after this list).
- Experience in creating environments on virtual machines to be handed over to development and QA teams.
- Restoring user data on request using general restore and point-in-time restore.
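A minimal sketch of the EC2 provisioning mentioned above, using the AWS CLI; the AMI ID, key pair, and security group are placeholders:

    # Launch one instance from a Linux AMI and tag it for the QA handover.
    aws ec2 run-instances \
        --image-id ami-0abcdef1234567890 \
        --instance-type t2.micro \
        --count 1 \
        --key-name my-keypair \
        --security-group-ids sg-0123456789abcdef0 \
        --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=qa-env}]'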