Hadoop Admin Resume
Chicago, IL
PROFESSIONAL SUMMARY
- 7+ years of professional IT experience which includes over 4 years of hands on experience in Hadoop Administration using Cloudera (CDH) and Hortonworks (HDP) Distributions on large distributed clusters.
- Experience with Cloudera distributed Hadoop, Hortonworks and knowledge on MapR.
- Experience in deploying, maintaining, monitoring, troubleshooting and upgrading Hadoop Clusters.
- Strong knowledge of Hadoop components such as MapReduce, HDFS, YARN, ZooKeeper, Oozie, Hive, HBase, Sqoop, Pig, and Flume.
- Good understanding of Hadoop HDFS architecture, cluster planning, and the MapReduce framework, both MRv1 and MRv2 (YARN).
- Experience in importing and exporting data between different databases and HDFS using Sqoop and performing data transformations using Hive and Pig.
- Excellent programming skills in writing SQL and Hive queries.
- Experience in configuring and enabling cluster coordination services with ZooKeeper.
- Experience in using Flume to collect and load large volumes of log data into HDFS and Oozie to configure job workflows.
- Experience in setting up and working with HBase.
- Strong knowledge of setting up NameNode High Availability and recovering NameNode metadata residing in the cluster.
- Experience in commissioning and decommissioning nodes, performing major and minor cluster upgrades to the latest releases, and installing Hadoop patches.
- Experience in configuring and enabling Kerberos security and LDAP integration.
- Excellent knowledge in developing shell scripts to check file system health and automate system management tasks (a minimal sketch follows this summary).
- Experienced in Performance Monitoring and Tuning of Hadoop cluster.
- Experienced in analyzing big data using Hadoop environment.
- Knowledge on Installation and configuration of Spark.
- Experience in setting up Disaster Recovery.
- Experience updating Red Hat systems using RPM and YUM.
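A minimal sketch of the kind of file-system health-check shell script referenced above; the threshold, temp file, and alert address are hypothetical placeholders, not details from any environment listed below.

```bash
#!/bin/bash
# Hypothetical HDFS health-check script; threshold and mail recipient are placeholders.
ALERT_MAIL="hadoop-admins@example.com"
DFS_USED_THRESHOLD=80   # percent of DFS capacity that triggers an alert

# Summarize overall cluster state (capacity, live/dead DataNodes).
hdfs dfsadmin -report > /tmp/dfs_report.txt

# Check whether HDFS reports itself healthy (no corrupt or missing blocks).
HEALTHY=$(hdfs fsck / 2>/dev/null | grep -c "is HEALTHY")

# Pull the cluster-wide "DFS Used%" figure from the report.
DFS_USED=$(grep -m1 "DFS Used%" /tmp/dfs_report.txt | awk '{print int($3)}')

if [ "$HEALTHY" -eq 0 ] || [ "$DFS_USED" -ge "$DFS_USED_THRESHOLD" ]; then
    mail -s "HDFS health alert on $(hostname)" "$ALERT_MAIL" < /tmp/dfs_report.txt
fi
```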
TECHNICAL SKILLS:
- Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie
- Databases: SQL Server 2000/2005/2008 R2, MS Access, Oracle 10g/9i, MySQL
- Scripting Languages: Shell scripting, JavaScript, UNIX shell scripting, Python, SQL, Pig Latin
- Operating Systems: UNIX, Linux, AIX, Windows XP, Windows Server 2000/2003/2008
- Hadoop Distributions: Hortonworks, Cloudera, Hadoop 2.4/2.5
- Management Tools: Cloudera Manager, Ambari
PROFESSIONAL EXPERIENCE:
HADOOP ADMIN
Confidential, Chicago, IL
Responsibilities:
- Installed, configured, and maintained Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Extensively worked with Cloudera Distribution of Hadoop (CDH 5.x and CDH 4.x).
- Extensively involved in Cluster Capacity planning, Hardware planning, Installation, Performance Tuning of the Hadoop Cluster.
- Worked on installing cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Collected the logs data from web servers and integrated into HDFS using Flume.
- Implemented both major and minor version upgrades on the existing cluster, with the ability to roll back to the previous version.
- Created Hive tables to store the processed results in a tabular format.
- Used Oozie workflows to automate jobs on Amazon EMR.
- Utilized cluster co-ordination services through ZooKeeper.
- Configured Sqoop and exported/imported data into HDFS.
- Implemented Flume, Spark, and Spark Streaming frameworks for real-time data processing.
- Implemented proofs of concept on the Hadoop and Spark stack and different big data analytic tools, using Spark SQL as an alternative to Impala.
- Used Sqoop to import and export data between RDBMS and HDFS.
- Integrated Kerberos into Hadoop to strengthen cluster security against unauthorized access.
- Spun up EMR clusters with the required EC2 instance types based on job type and data size.
- Updated CloudFormation templates with IAM roles for S3 bucket access, security groups, subnet IDs, EC2 instance types, ports, and AWS tags. Worked with Bitbucket, Git, and Bamboo to deploy EMR clusters.
- Involved in updating scripts and step actions to install ranger plugins.
- Debugged Spark job failures and provided workarounds.
- Used Splunk to analyze job logs and Ganglia to monitor servers.
- Involved in enabling SSL for Hue on the on-prem CDH cluster.
- Wrote shell scripts and successfully migrated data from on-prem HDFS to AWS EMR (S3); a minimal sketch follows the environment line below.
Environment: CDH 5.7.6, Cloudera Manager, Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Red Hat/CentOS 6.5
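A minimal sketch of the kind of shell step behind the on-prem-to-S3 migration bullet above; the nameservice, source path, and bucket name are hypothetical, and the s3a:// destination assumes the cluster already has the Hadoop AWS connector and credentials configured.

```bash
#!/bin/bash
# Hypothetical HDFS -> S3 migration step using DistCp; paths and bucket are placeholders.
SRC_DIR="/data/warehouse/orders"
DEST="s3a://example-emr-landing/warehouse/orders"

# Copy with update semantics so re-runs only transfer new or changed files,
# spreading the work across 20 map tasks.
hadoop distcp -update -m 20 "hdfs://nameservice1${SRC_DIR}" "${DEST}"

# Coarse sanity check: compare file counts on both sides.
SRC_COUNT=$(hdfs dfs -count "${SRC_DIR}" | awk '{print $2}')
DEST_COUNT=$(hdfs dfs -count "${DEST}" | awk '{print $2}')
echo "source files: ${SRC_COUNT}, destination files: ${DEST_COUNT}"
```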
HADOOP ADMIN
Confidential, Irvine, CA
Responsibilities:
- Actively involved in installing, configuring and monitoring Cloudera Hadoop cluster.
- Participated in configuring Hadoop cluster on AWS.
- Developed cost-effective and fault-tolerant systems using AWS EC2 instances, Auto Scaling, and Elastic Load Balancing.
- Responsible for commissioning and decommissioning, troubleshooting and load balancing.
- Implemented High Availability for Name Node and configured zookeeper for coordination services.
- Imported and exported data in and out of HDFS and Hive using Sqoop, and loaded data from the UNIX file system to HDFS (a Sqoop sketch follows this role's environment line).
- Integrated SAS and Tableau with Hadoop for easier access to data in HDFS and Hive.
- Installed and configured Hive and was involved in writing Hive UDFs.
- Set up a 60-node Cloudera 5.4 disaster recovery cluster.
- Successfully upgraded the Cloudera Hadoop cluster from CDH 5.4 to CDH 5.6.
- Supported managing and reviewing Hadoop log files and data backups.
- Implemented Kerberos and LDAP for authentication and security of the Hadoop cluster.
- Enabled resource management using the Fair Scheduler.
- Developed shell scripts for monitoring file system health, running the balancer, and automating other tasks.
- Involved in writing Python scripts to move data into S3 buckets.
- Provided highly available and durable data using AWS S3 data store.
Environment: Cloudera 5.6, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, Windows 2000/2003, UNIX, Linux, Shell Scripting
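A minimal sketch of the Sqoop import and export commands behind the bullet above; the JDBC URL, credentials file, table names, and directories are hypothetical.

```bash
#!/bin/bash
# Hypothetical Sqoop jobs; connection string, credentials, and table names are placeholders.

# Import an RDBMS table directly into a Hive table, splitting the work across 4 mappers.
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user \
  --password-file /user/etl_user/.db_password \
  --table orders \
  --hive-import \
  --hive-table sales.orders \
  --num-mappers 4

# Export processed results from HDFS back to the RDBMS.
sqoop export \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user \
  --password-file /user/etl_user/.db_password \
  --table order_summary \
  --export-dir /data/processed/order_summary
```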
HADOOP ADMIN
Confidential, Naperville, IL
Responsibilities:
- Worked on analyzing the Hortonworks Hadoop cluster and different big data analytic tools, including Pig, HBase, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Experienced in setting up Hortonworks clusters and installing all the ecosystem components through Ambari and manually from the command line.
- Performed a Major upgrade in production environment from HDP 1.3 to HDP 2.2 and followed standard Back up policies to make sure the high availability of cluster.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload, job performance, and capacity planning using Ambari.
- Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, cluster planning, and managing and reviewing data backups and log files.
- Developed scripts for tracking the changes in file permissions of the files and directories through audit logs in HDFS.
- Integrated the LDAP server and Active Directory with Ambari through the command-line interface.
- Loaded data from UNIX file system to HDFS.
- Provided cluster coordination services through ZooKeeper.
- Experience in managing and reviewing Hadoop log files.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
- Balanced HDFS manually to decrease network utilization and increase job performance.
- Set up automated processes to archive and clean unwanted data on the cluster, particularly on HDFS and the local file system (a minimal sketch follows this role's environment line).
Environment: Hortonworks 2.2.1, HDFS, Hive, Pig, Sqoop, HBase, ZooKeeper, Oozie, Ubuntu, Red Hat Linux.
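A minimal sketch of the archive/clean-up automation mentioned above; the directories and retention window are hypothetical.

```bash
#!/bin/bash
# Hypothetical HDFS and local cleanup job; paths and retention window are placeholders.
HDFS_STAGING="/tmp/staging"
HDFS_ARCHIVE="/archive/staging"
LOCAL_LOG_DIR="/var/log/hadoop"
RETENTION_DAYS=30
CUTOFF=$(date -d "-${RETENTION_DAYS} days" +%Y-%m-%d)

# Move HDFS entries whose modification date is older than the cutoff into an archive area.
hdfs dfs -mkdir -p "${HDFS_ARCHIVE}"
hdfs dfs -ls "${HDFS_STAGING}" | awk 'NR > 1 {print $8, $6}' | while read path modified; do
    if [[ "$modified" < "$CUTOFF" ]]; then
        hdfs dfs -mv "$path" "${HDFS_ARCHIVE}/"
    fi
done

# Delete rotated local Hadoop log files older than the retention window.
find "${LOCAL_LOG_DIR}" -type f -name "*.log.*" -mtime +"${RETENTION_DAYS}" -delete
```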
HADOOP ADMIN
Confidential, Irvine, CA
Responsibilities:
- Installed, configured, and maintained single-node and multi-node Hadoop clusters.
- Set up the cluster environment for highly available systems.
- Hadoop cluster configuration & deployment to integrate with systems hardware in the data center.
- Troubleshooting, diagnosing, tuning, and solving Hadoop issues.
- Monitor a Hadoop cluster and execute routine administration procedures.
- Managed Hadoop services such as NameNode, DataNode, JobTracker, and TaskTracker.
- Installed Apache Hadoop 2.5.2 and Apache Hadoop 2.3.0 on Linux development servers.
- Installed Pig and Hive on the multi-node cluster.
- Integrated Pig, Hive, and Sqoop with Hadoop.
- Performed monthly Linux server maintenance, including controlled shutdowns of essential Hadoop NameNode and DataNode services.
- Made Hadoop cluster secured by implementing Kerberos.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Balanced the Hadoop cluster using the balancer utility to spread data evenly across the cluster.
- Implemented data ingestion and processing with Pig and Hive in the production environment.
- Commissioned and decommissioned Hadoop nodes (a minimal sketch follows this role's environment line).
- Involved in Cluster Capacity planning along with expansion of the existing environment.
- Performed regular scripted health checks of the system using Hadoop metrics.
- Provided 24x7 support for the Hadoop environment.
Environment: Hadoop 1.x and 2.x, MapReduce, HDFS, Hive, SQL, Cloudera Manager, Pig, Sqoop, Oozie, CDH3 and CDH4.
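A minimal sketch of the DataNode decommission steps behind the bullet above, assuming dfs.hosts.exclude in hdfs-site.xml already points at the exclude file; the file location and host name are hypothetical, and on the Hadoop 1.x/CDH3 clusters listed in this environment the equivalent refresh command is hadoop dfsadmin -refreshNodes.

```bash
#!/bin/bash
# Hypothetical decommission procedure; exclude-file path and host name are placeholders.
EXCLUDE_FILE="/etc/hadoop/conf/dfs.exclude"
NODE="datanode07.example.com"

# Add the node to the exclude list referenced by dfs.hosts.exclude.
echo "${NODE}" >> "${EXCLUDE_FILE}"

# Ask the NameNode to re-read the include/exclude lists and begin decommissioning.
hdfs dfsadmin -refreshNodes

# Watch the node's status; shut it down only after it reports Decommissioned.
hdfs dfsadmin -report | grep -A 3 "${NODE}"
```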
SYSTEMS ADMINISTRATOR
Confidential
Responsibilities:
- Installed, configured, maintained, administered, and supported Red Hat Linux servers.
- Experience in backup and restore operations using built-in Linux utilities.
- Provided guidance for equipment checks and supported processing of security requests.
- Maintained the integrity and security of the Linux Servers.
- Installed Solaris and Red Hat Enterprise Linux using JumpStart/Kickstart servers.
- Created backup capabilities for the recovery of data.
- Actively involved in disaster recovery set up.
- Configured and maintained network services for the enterprise.
- Managed UNIX account maintenance including additions, changes, and removals.
- Used RPM in several Linux distributions such as Red Hat Enterprise Linux, SUSE Linux Enterprise, and Fedora.
- Extensive experience in building servers using Jumpstart and Kick-start Process.
- Worked on configuring the Red Hat Satellite server and managed, configured, and maintained customer entitlements, including upgrading and patching of Linux servers (a minimal sketch follows this list).
- Configured user and security administration, backup, recovery, and routine maintenance activities.
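A minimal sketch of the patching and backup routine described in the bullets above; the backup location and the decision to reboot are hypothetical and would follow the site's maintenance window.

```bash
#!/bin/bash
# Hypothetical Red Hat patching and backup routine; backup path is a placeholder.
BACKUP_DIR="/backup/$(hostname)/$(date +%Y%m%d)"
mkdir -p "${BACKUP_DIR}"

# Back up key configuration with built-in tar before patching.
tar -czf "${BACKUP_DIR}/etc-backup.tar.gz" /etc

# Record currently installed package versions for rollback reference.
rpm -qa | sort > "${BACKUP_DIR}/rpm-manifest.txt"

# Review available updates, then apply them.
yum check-update
yum -y update

# Note the running kernel; schedule a reboot in the maintenance window if it changed.
uname -r
```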