Hadoop Administrator Resume
San Francisco, CA
SUMMARY
- Over 7 years of experience in the IT industry, including 3+ years of proven experience in Hadoop administration using Cloudera and Hortonworks distributions and 3+ years of experience in system administration and IBM WebSphere administration.
- Experience in installation, configuration, supporting and monitoring Hadoop cluster using Cloudera and HortonWorks distributions.
- Extensive knowledge of Hadoop HDFS architecture and the MRv1 and MRv2 (YARN) frameworks.
- Over three years of experience in the design, development, maintenance, and support of Big Data analytics using Hadoop ecosystem components such as HDFS, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, MapReduce, and Oozie.
- Managed cluster health, resolved performance-related issues, and coordinated with various parties for infrastructure support.
- Expertise in Commissioning and Decommissioning the nodes in Hadoop Cluster.
- Extensive experience in configuring NameNode High Availability and NameNode Federation.
- Experience with performance tuning of HDFS, MapReduce, and Hive jobs.
- Experience in managing and reviewing Hadoop Log files.
- Set up HDFS quotas to enforce fair sharing of storage resources.
- Experience in rebalancing an HDFS cluster.
- Expertise in benchmarking and in performing backup and disaster recovery of NameNode metadata and of important and sensitive data residing on the cluster.
- Strong knowledge of configuring and maintaining YARN schedulers (Fair and Capacity).
- Experience in setup, configuration, and management of security for Hadoop clusters using Kerberos.
- Successfully loaded files to Hive and HDFS from Oracle, SQL Server, MySQL, and Teradata using Sqoop.
- Loaded streaming log data from various web servers into HDFS using Flume.
- Experience in creating databases, tables, and views in Hive and Impala.
- Created Hive internal and external tables defined with appropriate static and dynamic partitions, intended for efficiency.
- Experience in using the web User Interface HUE.
- Experience in creating and managing HBase clusters dynamically using Slider, and in starting and stopping HBase clusters running on Slider.
- Good experience with job workflow schedulers such as Oozie.
- Extensive knowledge of installation and configuration of Spark standalone mode for testing and development environments.
- Strong knowledge of Spark concepts such as RDD operations, caching, and persistence.
- Experience in Upgrading Apache Ambari, CDH and HDP Cluster.
- Extensive knowledge of job scheduling with Oozie and of the centralized coordination service ZooKeeper.
- Monitoring and support through Nagios and Ganglia.
- Experience in user and group management in Hortonworks Ambari.
- 2 years of experience in WebSphere Application Server, versions 6.x and 7, on Linux and Windows, for development, QA, and production environments.
- WebSphere server administration experience, including site monitoring, installation, server configuration, clustering, application deployment, troubleshooting, and maintenance of WebSphere server 6.0 and 6.1.
- Applied fixes (refresh packs, fix packs, interim fixes) for WebSphere Application Server, SDK, plug-in, and HTTP Server using Update Installer.
- Performing application deployments using EAR / WAR files, as requested by the application teams on the respective environments.
- Federated multiple nodes to the Deployment Manager.
- Created the Virtual hosts, JDBC Providers, Data Sources, Shared libraries referenced by the application during runtime.
- Used diagnostic tools such as IBM Thread and Monitor Dump Analyzer, IBM heap, and IBM Support Assistant Workbench 4.1 to analyze IBM javacores, heap dumps, and verbose GC logs and find the root cause of issues.
- Expertise in Collaborating across Multiple technology groups and getting things done.
- Worked in a 24x7 on-call Production Support Environment.
TECHNICAL SKILLS
Hadoop Core: HDFS, MRv1, MRv2 (YARN).
Hadoop Ecosystem: Hive, Sqoop, ZooKeeper, HBase, Oozie, Spark, Pig, Impala, Slider and Ganglia.
Hadoop Distribution: Cloudera, Hortonworks.
Hadoop Security: Kerberos (MIT).
Operating Systems: Windows, Linux (CentOS 5/6, RHEL 6)
Databases: Oracle 10g/11g, SQL Server, MySQL
Application Server: IBM WebSphere Application Server 6.1/7.
Languages: Java, SQL, Shell Scripting and Python.
Web Server: Apache Web Server, IBM HTTP Server.
PROFESSIONAL EXPERIENCE
Hadoop Administrator
Confidential, San Francisco, CA
Responsibilities:
- Handled the installation and configuration of a Hadoop cluster.
- Monitored the health of all processes related to NameNode HA, HDFS, YARN, Hive, HBase, Sqoop, Hue, and Slider using Hortonworks Ambari.
- Monitored disk, memory, heap, and CPU utilization on all master and slave machines using Ambari and took the necessary measures to keep the cluster up and running on a 24/7 basis.
- Configured Capacity Scheduler to provide service-level agreements for multiple users of a cluster.
- Created Hive tables and loaded data into them from the local file system and HDFS.
- Created Hive internal and external tables defined with appropriate static and dynamic partitions, intended for efficiency.
- Worked on cluster installation and on commissioning and decommissioning of data nodes.
- Involved in implementing High Availability and automatic failover infrastructure, using ZooKeeper services, to eliminate the NameNode as a single point of failure.
- Involved in installing Kerberos and configuring server and client systems to enable Hadoop security.
- Created and deployed Kerberos keytab files, and created principals and the realm.
- Imported logs from web servers with Flume to ingest the data into HDFS.
- Created HBase tables and involved in loading data into those tables.
- Created and managed HBase clusters dynamically using Slider.
- Upgraded Apache Ambari from version 1.7 to 2.0.
- Involved in upgrading the Hadoop cluster from HDP 1.3 to HDP 2.0.
- Involved in collecting metrics for Hadoop clusters using Ganglia and Ambari.
- Worked with developers to deploy new jobs and to resolve jobs throwing exceptions and data-related issues.
Environment: Java (JDK 1.7), MapReduce, HDFS, Pig, Hive, HCatalog, HBase, Flume, Slider, Sqoop, Oozie, ZooKeeper, Ganglia and Hortonworks Ambari.
Hadoop Administrator
Confidential, Austin, TX
Responsibilities:
- Worked on setting up Hadoop clusters for the dev, test, and prod environments.
- Worked on pulling data from Oracle databases into the Hadoop cluster using Sqoop import.
- Worked with Flume to import log data from the reaper logs and syslogs into the Hadoop cluster.
- Configured Fair Scheduler to provide fair resources to all the applications across the cluster.
- Worked with application teams to install Hadoop updates, patches, version upgrades as required.
- Involved in Cluster coordination services through Zookeeper.
- Managed and reviewed data backups and log files.
- Configured services with the web interface Hue.
- Implemented Hadoop NameNode HA to make the Hadoop services highly available.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Effectively used Sqoop to transfer data from databases (MySQL, Oracle) to HDFS and Hive.
- Created Hive Managed and External tables defined with static and dynamic partitions.
- Used Ganglia to monitor the cluster around the clock.
- Supported data analysts in running MapReduce programs.
- Contributed to the creation and maintenance of system documentation.
Environment: Java (JDK 1.7), MapReduce, HDFS, Pig, Hive, HBase, Flume, Sqoop, Oozie, ZooKeeper, Nagios, Ganglia, and Cloudera Manager.
Middleware Administrator
Confidential, Irving, TX
Responsibilities:
- Installation, configuration, deployment, Administration and troubleshooting on IBM WebSphere Application Server.
- Installation, configuration, Administration on IBM HTTP Server.
- Responsible for J2EE application deployments, plug-in configuration, data source creation, virtual host creation, session management, clusters, Deployment Manager configuration, and Network Deployment configuration in WebSphere 6.0/6.1 on UNIX and Windows.
- Federated nodes to the Deployment Manager.
- Applied fixes (refresh packs, fix packs, interim fixes) for WebSphere Application Server, SDK, plug-in, and HTTP Server using Update Installer.
- Set up vertical and horizontal clusters for Workload Management (WLM), High Availability (HA), and scalability.
- Configured JDBC drivers, data sources, and global security.
- Experience with IBM Support Assistant Workbench 4.1 to automatically collect troubleshooting data from WebSphere Application Server.
- Troubleshot and resolved problems within the entire MQ environment, raising PMR requests with IBM Tech Support when needed, and stabilized the product's performance.
- Created and implemented changes outside business hours.
System Administrator
Confidential
Responsibilities:
- Setting up user accounts and maintaining them with appropriate permissions.
- System Maintenance tasks such as performing backups and system updates.
- Verifying the working of peripheral devices.
- Monitoring system performance using shell-based commands.
- Implementation of policies for the use of the computer system and network.
- Involved in writing and reviewing various documents covering day-to-day tasks.
- Implementing security of servers.
- Developing UNIX shell scripts per the demands of the environment.
