Sr. Hadoop Administrator Resume
San Diego, CA
SUMMARY
- A qualified Technocrat and a seasoned professional offering 8+ years of IT experience in Administration - Linux and Hadoop.
- Strong knowledge on Hadoop HDFS architecture, YARN and Map-Reduce framework.
- Experience in deploying and managing the multi-node development, testing and production Hadoop cluster with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, ZOOKEEPER) using Cloudera Manager and Hortonworks Ambari.
- Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure- KDC server setup, creating realm /domain, managing principals, generating key tab file for each and every service and managing key tab using key tab tools.
- Responsible for building a cluster with Pivotal Hadoop Distribution PHD 3.0 with Ambari.
- Experience in creating VM’s using BDE (Big Data Extension).
- Integrated ISILON with Hadoop cluster to store HDFS data.
- Installed and configured HAWQ in the Hadoop cluster.
- Experience in managing HAWQ segments and recovering them if they are failed.
- Imported data from GREENPLUM into HDFS using SPRING-XD.
- Implemented TEZ execution engine for hive jobs.
- Integrated ALPINE server to the Hadoop cluster for the data modelling.
- Performed minor and major upgrades, commissioning and decommissioning of data nodes on Hadoop cluster.
- Familiar with writing Oozie workflows and Job Controllers for job automation - shell, hive, scoop automation.
- Familiar with Java virtual machine (JVM) and multi-threaded processing.
- Analyzing the clients existing Hadoop infrastructure and understand the performance bottlenecks and provide the performance tuning accordingly.
- A visionary leader with good communication, team building and management, interpersonal & analytical skills.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, Mapreduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, Zoo Keeper, Spring-XD, Hawq, Hue, Web-Hcat, Alpine, Cloudera Manager, Ambari, Pivotal Command Center, BDE (Big Data Extension).
Security: Kerberos
Storage: Isilon
Scripting Languages: Shell Scripting
Monitoring Tools: Nagios, Ganglia, Cloudera Manager, Ambari, Pivotal command center.
Operating Systems: Linux RHEL/Ubuntu/CentOS, Windows (XP/7/8).
PROFESSIONAL EXPERIENCE
Confidential - San Diego, CA
Sr. Hadoop Administrator
Responsibilities:
- Manage multiple clusters us. Manage and Operate Multiple Peta Byte Clusters.
- Automate installation of Hadoop using Chef, puppet.
- Performance tuning of HDFS, YARN.
- Manage and monitor cluster using nagios, check mk. Manage PCC command center.
- Writing Utilities for the big data platform, code review, define standards, best practices, Operations, regular maintenance, software upgrades.
- Provide highest level of Technical Support to User base.
- Accountable for taking direct support cases from Data Services users & customers. Primary responsibility of providing support for Client businesses and customers.
- Responsible for managing support cases on a daily basis including triage, isolating and diagnosing the problem, ensuring issues are reproducible, and subsequent resolution of the issue.
- Coordination between different teams involving release management, configuration management and Change Management.
- Responsible for building the knowledge base to prevent recurrence of escalation for previously resolved case.
- Continuous improvement of Incident Handle Times, First Contact Resolution, Escalation Rates, Self-Service/Community experience.
- Help customer setup Ecosystem as per their use cases and Debugged Ecosystems issues (HDFS, MapReduce, Yarn, Talend, Kerberos, Sqoop, TABLEAU, Hive, Hbase, Oozie, flume, Hue, Pig Drill).
- High-availability of all the applications on the production cluster and 24X7 technical support.
- Downtime management.
- Participate and provide feedback for capacity planning.
- Ensuring Support SLAs are met.
- Defining processes around Change Management, Release Management, and Application Transition across the Hadoop Platform.
- Process definitions and ensuring best practices / guidelines are met.
Environment: HADOOP HDFS, MRV2, YARN, PIG, HIVE, TEZ, SQOOP, R-SERVER, ISILON, RABBIT-MQ. BDE (Big Data Extension).
Confidential - San Diego, CA
Hadoop Administrator
Responsibilities:
- Responsible for building a cluster for storing > 2pb of Transactional data.
- Worked on developing scripts for performing benchmarking with Terasort/Terragen.
- Importing Data into HDFS from relational database.
- Developed Hive queries to process the data for analysis by imposing read only structure on the stream data.
- Used Ganglia to monitor the cluster around the clock.
- Implemented Fair schedulers to share the resources of the cluster for the map.
- Imported data from MySQL server to HDFS using Sqoop.
- Manage the day to day operations of the cluster for backup and support.
- Creating and managing Logical volumes. Using Java JDBC to load data into MySQL.
- Installing and updating packages using YUM.
- Installation and configuration of Linux for new build environment.
- Created volume groups logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
- Deep understanding of monitoring and troubleshooting mission critical Linux machines.
- Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
- Ensured data recover by implementing system and application level backups.
- Performed various configurations which include networking and Iptables, resolving host names and SSH keyless login.
- Managing Disk File Systems, Server Performance, Users Creation and Granting file access Permissions and RAID configurations.
- Automate administration tasks through the use of scripting and Job Scheduling using CRON.
Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA MANAGER
Confidential, Bloomington, IL
Hadoop Administrator
Responsibilities:
- Responsible for building a cluster for storing > 1.5PB of data.
- Deployed a Hadoop cluster using cdh3 integrated with Nagios and Ganglia.
- Implemented Fair scheduler on the job tracker to allocate the fair amount of resources to small jobs.
- Performed operating system installation, Hadoop version updates using automation tools.
- Upgraded the Hadoop cluster from cdh3 to cdh4.
- Deployed high availability on the Hadoop cluster quorum journal nodes.
- Implemented automatic failover zookeeper and zookeeper failover controller.
- Configured Ganglia which include installing gmond and gmetad daemons which collects all the metrics running on the distributed cluster and presents them in real-time dynamic web pages which would further help in debugging and maintenance.
- Configured Oozie for workflow automation and coordination.
- Implemented rack aware topology on the Hadoop cluster.
- Good experience in troubleshoot production level issues in the cluster and its functionality.
- Backed up data on regular basis to a remote cluster using distcp.
- Regular Ad-Hoc execution of Hive and Pig queries depending upon the use cases.
- Commissioning and Decommissioning of nodes depending upon the amount of data.
- Monitored and configured a test cluster on amazon web services for further testing process and gradual migration.
Environment: HADOOP HDFS, MAPREDUCE, HIVE, PIG, FLUME, OOZIE, SQOOP, ECLIPSE, CLOUDERA MANAGER
Confidential - Pleasanton, CA
Sr. LINUX/MYSQL Administrator
Responsibilities:
- Installation and configuration of Linux for new build environment.
- Created Virtual server on Citrix Xen Server based host and installed operating system on Guest Servers.
- Installed Pre-Execution environment boot and Kick start method on multiple servers, remote installation of Linux using PXE boot.
- Monitoring the System activity, Performance, Resource utilization.
- Configured NFS, DNS.
- Updating YUM Repository and Red hat Package Manager (RPM).
- Installation and configuration of MySQL databases.
- Set up MYSQL database Replication master to master and master to slave.
- Proficient in SQL application of MYSQL Tuning and Performance and maintaining large volume of data includes MYSQL 5.0,5.1,6.0
- Excellent command in creating Backups & Recovery process, understanding of Innodb, myiasm, memory engines of Mysql database.
- Data warehousing and extraction of data using MySQL dump, flat files export and import method.
- MySQL performance tuning on database, sever and sql query level. Slow query log analyze for query taking long time.
- Increased database performance by utilizing MySQL config changes, multiple instances and by upgrading hardware.
- Assisted with sizing, query optimization, buffer tuning, backup and recovery, installations, upgrades and security including other administration functions as part of profiling plan.
- Ensured production data being replicated into data warehouse without any data anomalies from the processing databases.
- Worked with the engineering team to implement new design systems of databases used by the company.
- Effectively configured MySQL Replication as part of HA solution.
- Designed databases for referential integrity and involved in logical design plan.
Environment: MYSQL 5.6/5.5/5.1, NAGIOS, MYSQL FABRIC, MYSQL WORKBENCH.
Confidential
LINUX/MYSQL Administrator
Responsibilities:
- Installation and configuration of Linux for new build environment.
- Created Virtual server on Citrix Xen Server based host and installed operating system on Guest Servers.
- Installed Pre-Execution environment boot and Kick start method on multiple servers, remote installation of Linux using PXE boot.
- Deep understanding of monitoring and troubleshooting mission critical Linux machines.
- Experience with Linux internals, virtual machines, and open source tools/platforms.
- Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
- Ensured data recoverability by implementing system and application level backups.
- Performed various configurations which include networking and IPTables, resolving hostnames, SSh key less login.
- Performed scheduled backup and necessary restoration
- Installation of MYSQL (5.34/5.5/6) databases on Red hat Linux.
- Created Mysql Database Backups and tested restore process on Test Environment
- Created Database users and setup permissions on Dev and Test Servers
- Setting up Replication master-slave, master-master and cascade replication for backup and reporting.
- Created Mysql databases on production and development servers.
- Verify free space host availability on (backup/archive) directories.
- Monitored the database size and increased the size when required, analyzed the database tables and indexes and then rebuilt the indexes if were fragmentations in indexes.
- Check completion results for any scheduled jobs, corns or data processing including refreshes.
- Review daily invalid object reports for all databases.
- Created users, allocation of appropriate table space quotas with necessary privileges and roles for MYSQLdatabases.
Environment: MYSQL 5.1.6, PHP MYSQL ADMIN, PHP 4.X, 5.X, TOAD, SHELL SCRIPT, LINUX.
