
Hadoop Administrator Resume


Phoenix

PROFESSIONAL SUMMARY:

  • 9 years of professional experience, including 3 years in Hadoop administration.
  • Hadoop administration responsibilities included software installation, configuration, and updates; backup and recovery; commissioning and decommissioning data nodes; cluster setup; daily cluster performance monitoring; and keeping clusters healthy across different Hadoop distributions (Hortonworks and Cloudera).
  • Experience in the installation, management, and monitoring of Hadoop clusters using Cloudera Manager and Ambari.
  • Strong experience in configuring Hadoop ecosystem tools, including Pig, Hive, HBase, Sqoop, Flume, Kafka, Spark, Oozie, and ZooKeeper.
  • Installed and configured HDFS (Hadoop Distributed File System) and MapReduce, and developed multiple MapReduce jobs for data cleaning.
  • Strong understanding of Hadoop architecture and the MapReduce framework.
  • Experience in deploying Hadoop 2.x (YARN).
  • Optimized the configurations of MapReduce, Pig, and Hive jobs for better performance.
  • Worked on building a real-time pipeline for streaming data using Kafka Streaming.
  • Expertise in using Apache Spark as a fast engine for large-scale data processing.
  • Experience in transferring data between HDFS and relational databases with Sqoop (see the Sqoop sketch after this list).
  • Experience in configuring Hadoop-based monitoring tools: Nagios and Ganglia.
  • Experience in cluster maintenance, bug fixing, troubleshooting, and monitoring, following proper backup and recovery strategies.
  • Experience in commissioning, decommissioning, balancing, and managing nodes, and tuning servers for optimal cluster performance.
  • Good experience in Hadoop cluster capacity planning, performance tuning, monitoring, and troubleshooting.
  • Experience in setting up automatic and manual failover control using ZooKeeper and quorum journal nodes.
  • Implemented and managed secure authentication for Hadoop clusters using Kerberos.
  • Good experience in setting up Linux environments: passwordless SSH, creating file systems, disabling firewalls, tuning swappiness, configuring SELinux, and installing Java (see the setup sketch after this list).
  • Experience in integrating with Active Directory and LDAP for authentication.
  • Managed various environments including CentOS, Red Hat Linux, and Windows Server 2008/2012.
  • Experience in monitoring and managing the performance of ESX servers and virtual machines.
  • Good experience on Cisco UCS (Unified Computing System) Manager.
  • Good knowledge on Amazon Web Services (AWS).
  • Hands-on experience with cluster and patch upgrades, performed without data loss and with proper backup plans.
  • Superior skills in communication, strong initiative for learning new skills and conquering challenges.
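
The Sqoop transfers mentioned above typically reduce to a pair of commands like the following minimal sketch, which assumes a hypothetical MySQL host, database, table names, and credentials:

    # Import a MySQL table into HDFS (connection details are hypothetical).
    sqoop import \
      --connect jdbc:mysql://db.example.com/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /user/etl/orders \
      --num-mappers 4

    # Export processed results back to the relational database.
    sqoop export \
      --connect jdbc:mysql://db.example.com/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /user/etl/order_summary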
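
The Linux environment setup listed above usually follows steps along these lines; this sketch assumes a hypothetical node name and RHEL/CentOS tooling:

    # Passwordless SSH: generate a key pair and push the public key to a node.
    ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa
    ssh-copy-id hadoop@datanode01.example.com

    # Common Hadoop node tuning: lower swappiness, relax SELinux, stop the firewall.
    echo "vm.swappiness=1" >> /etc/sysctl.conf && sysctl -p
    setenforce 0                        # runtime only; persist via /etc/selinux/config
    systemctl disable --now firewalld   # RHEL 7; on RHEL 6 use 'service iptables stop'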

TECHNICAL SKILLS:

HADOOP ECOSYSTEM: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, Spark, ZooKeeper, and Kafka.

SCRIPTING: Shell scripting, Python, Java, Scala

CLUSTER MANAGEMENT TOOLS: Ambari, Cloudera Manager.

MONITORING TOOLS: Nagios, Ganglia, Icinga2, Vops, FrameFlow, and Automic.

DATABASE: Oracle, MySQL, MS SQL Server 2008/2012

SERVERS: Apache Tomcat Server, Apache HTTP Web Server

OPERATING SYSTEMS: Linux 5.x/6.x/7, Windows Server 2008/2012/2016, VMware ESX 5.x/6.x

HARDWARE: Cisco UCS C210/C220/C240, SAN, NAS storage, Dell EMC.

PROFESSIONAL EXPERIENCE:

Confidential, Phoenix

Hadoop Administrator

Responsibilities:

  • Involved in installing, configuring, and using Hadoop ecosystems with different Hadoop distributions (Hortonworks, Cloudera).
  • Responsible for administering and maintaining a Hadoop cluster on HDP 2.x.
  • Responsible for cluster maintenance, commissioning and decommissioning data nodes, cluster monitoring, troubleshooting, and managing and reviewing Hadoop log files.
  • Configured and tuned the environment and batch jobs to ensure optimum performance and 99.99% availability.
  • Monitored systems and services, and handled architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Configured various property files such as core-site.xml, hdfs-site.xml, and mapred-site.xml based on job requirements.
  • Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Worked on Ambari alert configuration for various components and managed the alerts.
  • Installed and configured Kerberos for the authentication of users and Hadoop daemons (see the Kerberos sketch after this list).
  • Coordinated with Development, Network, Infrastructure, and other organizations as necessary to get work done.
  • Worked with big data developers and designers in troubleshooting MapReduce job failures and issues with Hive and Pig.
  • Implemented High Availability and automatic failover infrastructure to remove the NameNode single point of failure, utilizing ZooKeeper services.
  • Worked on performance tuning for the Kafka cluster (failure and success metrics).
  • Built a real-time pipeline for streaming data using Kafka Streaming (see the Kafka sketch after this list).
  • Performed data extractions from SQL Server, HAWQ, Hive, and X15 to support audit and expedited requests.
  • Used Spark Streaming with Kafka and HDFS to build a continuous ETL pipeline, used for real-time analytics on the data.
  • Used Apache Spark as a fast engine for large-scale data processing.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload, job performance, and capacity planning.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs.
  • Involved in the development and implementation of a Red Hat Hadoop environment.
  • Installed the OS on physical machines using UCS Manager for Cisco servers.
  • Created service profiles and service profile templates, bound profiles to templates, and associated service profiles in UCS Manager.
  • Opened Cisco TAC cases to troubleshoot hardware issues.
  • Worked with the applications team to install the operating systems, Hadoop updates, patches and version upgrades as required.
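
Kerberizing Hadoop daemons, as in the bullet above, generally means creating service principals and keytabs. A minimal sketch, with a hypothetical realm and host name:

    # Create a service principal for the NameNode and export its keytab.
    kadmin -p admin/admin -q "addprinc -randkey nn/namenode01.example.com@EXAMPLE.COM"
    kadmin -p admin/admin -q "xst -k /etc/security/keytabs/nn.service.keytab nn/namenode01.example.com@EXAMPLE.COM"

    # Verify the keytab by obtaining and listing a ticket.
    kinit -kt /etc/security/keytabs/nn.service.keytab nn/namenode01.example.com@EXAMPLE.COM
    klist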
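
For the Kafka pipeline, the console tools are the usual smoke test. This sketch targets Kafka 0.9 (per the environment line below) and uses hypothetical ZooKeeper, broker, and topic names:

    # Create a replicated topic for the stream.
    kafka-topics.sh --create --zookeeper zk1.example.com:2181 \
      --replication-factor 3 --partitions 6 --topic clickstream

    # Produce a few test records, then read them back from the beginning.
    kafka-console-producer.sh --broker-list broker1.example.com:9092 --topic clickstream
    kafka-console-consumer.sh --zookeeper zk1.example.com:2181 --topic clickstream --from-beginning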

Environment: Red Hat Linux 6.x, ESX 5.x/6.x, Windows Server 2008/2012, Hortonworks (HDP) 2.4, Ambari 2.2.1, Apache Hive 0.12.0, Apache ZooKeeper 3.4.5, HDFS, MapReduce, YARN, MySQL, Python, HAWQ, Kafka 0.9, Spark 1.6, Cisco UCS, FrameFlow, Automic.

Confidential, Newark, CA

Hadoop Administrator

Responsibilities:

  • Responsible for the installation, configuration, maintenance, and troubleshooting of the Hadoop cluster. Duties included monitoring cluster performance using various tools to ensure the availability, integrity, and confidentiality of applications and equipment.
  • Experienced in installing and configuring RHEL servers in production, test, and development environments and using them to build application and database servers.
  • Deployed the Hadoop cluster in cloud environment with scalable nodes as per the business requirement.
  • Installed, configured, and optimized Hadoop infrastructure using the Cloudera Hadoop distribution CDH 5.
  • Monitored workload, job performance and capacity planning using the Cloudera Manager Interface.
  • Improved the Hadoop cluster performance by considering the OS kernel, Disk I/O, Networking, memory, reducer buffer, mapper task, JVM task and HDFS by setting appropriate configuration parameters.
  • Experience in commissioning and decommissioning nodes of the Hadoop cluster (see the decommissioning sketch after this list).
  • Imported the data from relational databases into HDFS using Sqoop.
  • Performed administration, troubleshooting and maintenance of ETL and ELT processes.
  • Managed and reviewed Hadoop log files and supported MapReduce programs running on the cluster.
  • Involved in creating Hive tables, loading data, and writing Hive queries.
  • Involved in upgrading the Hadoop cluster through both minor and major version upgrades.
  • Scheduled jobs using Oozie workflows (see the Oozie sketch after this list).
  • Installed and configured the ZooKeeper service to coordinate configuration information across all nodes in the cluster and manage it efficiently.
  • Developed Pig and Hive scripts for data processing on HDFS.
  • Experience in managing cluster resources by implementing the Fair and Capacity Schedulers.
  • Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
  • Worked with data center planning groups, assisting with network capacity and high-availability requirements.
  • Performed software installation, upgrades/patches, performance tuning and troubleshooting of all the servers in the clusters.
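
Decommissioning a data node, as mentioned above, normally means updating the exclude file and refreshing the NameNode. A sketch, assuming dfs.hosts.exclude points at /etc/hadoop/conf/dfs.exclude and using a hypothetical host name:

    # Add the host to the exclude file and have the NameNode re-read it.
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Wait until the node reports "Decommissioned" before removing it.
    hdfs dfsadmin -report | grep -A 5 datanode07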
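
Oozie scheduling from the command line looks roughly like the following sketch, with a hypothetical Oozie URL and a placeholder job ID:

    # Submit and start a workflow described by job.properties.
    oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run

    # Check status using the job ID returned by the previous command (placeholder shown).
    oozie job -oozie http://oozie.example.com:11000/oozie -info 0000001-161001000000001-oozie-oozi-W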

Environment: Red Hat Linux 6.x, CDH 5.0.6 (based on Apache Hadoop 2.3.0), Apache Hive 0.12.0, Apache Pig 0.12.0, Apache ZooKeeper 3.4.5, HDFS, MapReduce, YARN, MySQL, Cloudera Manager.

Confidential, Minneapolis

Hadoop Administrator

Responsibilities:

  • Responsible for architecting Hadoop cluster.
  • Involved in source system analysis, data analysis, and data modeling for ETL (Extract, Transform, and Load) and HiveQL.
  • Strong experience in the installation and configuration of Hadoop ecosystem components such as YARN, HBase, Flume, Hive, Pig, and Sqoop.
  • Expertise in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
  • Managed and reviewed Hadoop log files.
  • Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
  • Worked extensively with Sqoop for importing data.
  • Designed a data warehouse using Hive.
  • Created partitioned tables in Hive (see the Hive sketch after this list).
  • Mentored analysts and the test team on writing Hive queries.
  • Extensively used Pig for data cleansing.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
  • Developed Oozie workflows for daily incremental loads that pull data from Teradata and import it into Hive tables.
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers and mobile and network devices, and pushed it to HDFS (see the Flume sketch after this list).
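
Partitioned Hive tables of the kind described above can be created as in this sketch, which assumes a hypothetical web-log schema and paths and runs through the hive CLI:

    # Create a date-partitioned table and load one day's data into it.
    hive -e "
      CREATE TABLE IF NOT EXISTS web_logs (
        ip STRING, url STRING, status INT
      )
      PARTITIONED BY (dt STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

      LOAD DATA INPATH '/user/etl/logs/2015-06-01'
      INTO TABLE web_logs PARTITION (dt='2015-06-01');
    "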
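
A Flume agent that ships web logs to HDFS is configured along these lines; the agent, log file, and path names in this minimal sketch are hypothetical:

    # One-source/one-channel/one-sink agent config, then start the agent.
    cat > /etc/flume/conf/weblog.conf <<'EOF'
    agent.sources  = tail
    agent.channels = mem
    agent.sinks    = hdfs-out

    agent.sources.tail.type     = exec
    agent.sources.tail.command  = tail -F /var/log/httpd/access_log
    agent.sources.tail.channels = mem

    agent.channels.mem.type = memory

    agent.sinks.hdfs-out.type                   = hdfs
    agent.sinks.hdfs-out.hdfs.path              = /user/flume/weblogs/%Y-%m-%d
    agent.sinks.hdfs-out.hdfs.useLocalTimeStamp = true
    agent.sinks.hdfs-out.channel                = mem
    EOF

    flume-ng agent -n agent -c /etc/flume/conf -f /etc/flume/conf/weblog.conf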

Environment: Red Hat Linux 6.x, Hive, Pig, Oozie, Sqoop, ZooKeeper, HDFS, MapReduce, YARN, HBase, MySQL.

Confidential

Hadoop & Linux Administrator

Responsibilities:

  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, data backups, and managing and reviewing log files.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Changed configurations based on user requirements to improve job performance.
  • Experienced in Ambari alert configuration for various components and in managing the alerts.
  • Installing and maintaining the Linux servers.
  • Created volume groups, logical volumes, and partitions on the Linux servers, and mounted file systems (see the LVM sketch after this list).
  • Creation, installation, and administration of Red Hat virtual machines in a VMware environment.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers.
  • Ran cron jobs to back up data (see the crontab sketch after this list).
  • Adding, removing, or updating user account information, resetting passwords, etc.
  • Performance tuning of virtual memory, CPU, and system usage on Linux and Solaris servers.
  • Supported an infrastructure environment comprising RHEL, Solaris, and AIX.
  • Involved in the development and implementation of a CentOS Hadoop environment.
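
The volume group and logical volume work above follows the standard LVM sequence. A sketch, assuming a hypothetical spare disk /dev/sdb and mount point:

    # Initialize the disk, create a volume group and logical volume, then mount.
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb
    lvcreate -n lv_app -L 200G vg_data
    mkfs.ext4 /dev/vg_data/lv_app
    mkdir -p /data/app
    mount /dev/vg_data/lv_app /data/app
    echo "/dev/vg_data/lv_app /data/app ext4 defaults,noatime 0 0" >> /etc/fstab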
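
The cron-based backups reduce to crontab entries like this sketch with hypothetical paths (note that % must be escaped inside a crontab):

    # min hour dom mon dow  command: nightly tarball of /etc at 01:30.
    30 1 * * * tar czf /backup/etc-$(date +\%F).tar.gz /etc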

Environment: Red Hat Linux, CDH, HDP, Apache Hive, Apache Pig, Apache ZooKeeper, HDFS, MapReduce, YARN, MySQL, Cloudera Manager.

Confidential

Linux Administrator

Responsibilities:

  • Building and supporting environments consisting of Testing, Contingency, Production and Disaster Recovery servers.
  • Worked on ESX/ESXi installations and upgrades.
  • Configured vNetwork Distributed Switches and migrated networks from vNetwork Standard Switches to vNetwork Distributed Switches.
  • Managing Access Control through vCenter server, Optimizing Resource Utilization, Load Balancing (VMware DRS) and ensuring High Availability (VMware HA).
  • Provisioned Windows and Linux VMs as per requirements.
  • Created, extended, reduced, and administered Logical Volume Manager (LVM) volumes in the RHEL environment.
  • Creation, installation, and administration of Red Hat virtual machines in a VMware environment.
  • Configured network bonding, including active/standby and active/active modes (see the bonding sketch after this list).
  • Troubleshot network, memory, CPU, swap, and file system issues (TCP/IP, NFS, DNS, SMTP) on Linux and Solaris servers.
  • Performance tuning of virtual memory, CPU, and system usage on Linux and Solaris servers.
  • Performance monitoring and tuning using top, prstat, sar, vmstat, ps, and iostat.
  • Package management using RPM, YUM, and up2date in RHEL.
  • Performed disaster recovery on RHEL servers consisting of LVM-based file systems and Red Hat Clustering.
  • Installation, configuration, and administration of JBoss, Apache, Tomcat, and WebSphere.
  • Developed, maintained, and updated various scripts for services (start, stop, restart, recycle, cron jobs) using shell and Bash scripting (see the wrapper sketch after this list).
  • Scheduled jobs using crontab on Linux servers.
  • Updated and ran various source code for migrations, and followed up on release management.
  • User, group, and package administration, plus various repetitive activities across the Linux environment.
  • Worked on Storage Area Networks (SAN) and Network Attached Storage (NAS).
  • Active Directory management, Group Policy administration, and maintenance of DHCP, DNS, and other application servers.
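
Active/standby bonding on RHEL 6 is set up with ifcfg files along these lines; interface names and addressing in this sketch are hypothetical:

    # Bond master using active-backup mode with link monitoring.
    cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
    DEVICE=bond0
    IPADDR=10.0.0.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none
    BONDING_OPTS="mode=active-backup miimon=100"
    EOF

    # Enslave a physical interface (repeat for the standby, e.g. eth1).
    cat > /etc/sysconfig/network-scripts/ifcfg-eth0 <<'EOF'
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none
    EOF

    service network restart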
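
The start/stop/restart service scripts mentioned above are typically small Bash wrappers, as in this sketch with a hypothetical application path and PID file:

    #!/bin/bash
    # Minimal service wrapper: start, stop, or restart a background process.
    PIDFILE=/var/run/myapp.pid
    case "$1" in
      start)   nohup /opt/myapp/bin/run.sh >/var/log/myapp.log 2>&1 &
               echo $! > "$PIDFILE" ;;
      stop)    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE" ;;
      restart) "$0" stop; sleep 2; "$0" start ;;
      *)       echo "Usage: $0 {start|stop|restart}"; exit 1 ;;
    esac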

Environment: Red Hat Enterprise Linux 4.x/5.x/6.1, AIX 6.x, Solaris 8/9/10, Tivoli Storage Manager, VMware ESX 5, Tivoli NetBackup, and WebSphere.

Confidential

System Administrator

Responsibilities:

  • Provided Windows and VMware Level 2 and Level 3 support.
  • Administered Active Directory in a Windows 2003/2008 environment, managing issues such as logon failures, account lockouts, Group Policies, network connectivity, DNS and WINS name resolution, authentication problems, and file and printer permissions.
  • Implemented automated scripts using Bash and Perl as required at various levels.
  • Worked exclusively in a VMware virtual environment.
  • Experienced in using VMware vMotion and Storage vMotion.
  • Applied Windows updates using WSUS servers.
  • Created and maintained user accounts and groups in AD.
  • Involved in the installation and configuration of various third-party software on servers.
  • Installed, configured, and provided support for Tivoli Monitoring software across OS platforms such as RHEL, AIX, and Solaris.
  • Installed packages using YUM and the Red Hat Package Manager (RPM) on various servers (see the package sketch after this list).
  • Worked with Red Hat Satellite Server, which is used to push changes across multiple servers simultaneously.
  • Performed daily system administration tasks such as managing system resources, end-user support operations, and security.
  • Troubleshot network administration, IIS configuration, DNS setup and modifications, firewall rule sets, local and distributed directors, connectivity, and supporting applications.
  • Responsible for independent support of Tier 2 issues: reboots, starting and stopping services, resetting Terminal Services and pcAnywhere connections, and administrative server maintenance. Followed up daily with clients to ensure resolution of all issues.
  • Built Windows 2012/2008/2003 servers using SOE on standalone and blade servers based on application requirements.
  • Planned and scheduled faulty hardware replacement with hardware vendors.
  • Used various networking tools such as ssh, telnet, rlogin, ftp, and ping to troubleshoot daily issues. Designed, implemented, and maintained DNS, NFS, and FTP services. Monitored server and application performance via various stat commands (top, mpstat, prstat, nfsstat, prtconf, prtdiag, iostat, printmgr, hpimliview, dmidecode, smc, etc.) and tuned I/O and memory on Sun Solaris and RHEL servers (see the triage sketch after this list).
  • Provided day-to-day resolution of Linux-based issues through the SMS ticketing system in compliance with SLA cycles.
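
Package installs and verification with YUM and RPM look like this simple sketch, using httpd as the example package:

    # Install from the configured repositories, then inspect and verify.
    yum install -y httpd
    rpm -qi httpd        # show package metadata
    rpm -V httpd         # verify installed files against the RPM database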
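
The stat commands listed above support a quick performance triage loop like the following sketch of typical invocations on RHEL:

    # Three 5-second samples each of memory/CPU, per-device I/O, and network traffic.
    vmstat 5 3
    iostat -x 5 3
    sar -n DEV 5 3

    # Point-in-time view of the busiest processes.
    top -b -n 1 | head -20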

Environment: Red Hat Enterprise Linux 4.x/5.x/6.1, AIX 6.x, Solaris 8/9/10, Tivoli Storage Manager, VMware ESX 5, Tivoli NetBackup, WebSphere, Windows 2003/2008/2012 servers.
