
Hadoop Administrator Resume

Irvine, California

SUMMARY:

  • Around 8 years of professional experience with Hadoop and Linux administration activities such as administering, upgrading, patching, configuring, troubleshooting, maintenance, performance monitoring, and support of systems/clusters
  • Experience in architecting, designing, installation, configuration and management of Apache Hadoop Clusters & Cloudera Hadoop Distribution.
  • Strong technical, administration, & mentoring knowledge in Linux, Big Data/Hadoop.
  • Experience in designing, installing and configuring the complete Hadoop ecosystem (components such as Pig, Hive, Oozie, HBase, Flume, and Zookeeper).
  • Expertise in managing the Hadoop infrastructure with Cloudera Manager.
  • Secured Hadoop clusters with Kerberos, Active Directory/LDAP, and TLS/SSL, and applied dynamic tuning to keep the cluster available and efficient.
  • Involved in customer interactions, business user meetings, vendor calls, and technical team discussions to make the right design and implementation choices and to provide best practices for the organization.
  • Working experience on Hortonworks (HDP) and Cloudera distribution.
  • Experience in managing the cluster resources by implementing fair scheduler and capacity scheduler.
  • Experience in developing and scheduling ETL workflows in Hadoop using Oozie.
  • Experience with tools like Puppet to automate Hadoop installation, configuration, and monitoring.
  • Installation, patching, upgrading, tuning, configuring, and troubleshooting of Linux-based operating systems (RedHat and CentOS), including virtualization, across a large set of servers.
  • Hadoop ecosystem: Cloudera, Hortonworks, Hadoop, MapR, HDFS, HBase, YARN, Zookeeper, Nagios, Hive, Pig, Ambari, Spark, and Impala.
  • Experience installing and configuring web hosting services: HTTP, FTP, NFS, and SSH
  • Worked on Firewall implementation & Load balancer between various Windows servers.
  • Proficient in installing, configuring, and implementing LVM and RAID technologies using tools such as Veritas Volume Manager.
  • Experience with system integration, capacity planning, performance tuning, system monitoring, system security, operating system hardening and load balancing.
  • Hands-on experience with major Hadoop ecosystem components, including HDFS, YARN, Hive, Impala, Flume, Zookeeper, Oozie, and other ecosystem products.
  • Regular disk management like adding /replacing hot swappable drives on existing servers/workstations, partitioning according to requirements, creating new file systems or growing existing one over the hard drives and managing file systems.
  • Installed and configured Hortonworks HDP 2.2 using Ambari and manually through the command line. Performed cluster maintenance, including the addition and removal of nodes, using tools like Ambari and Cloudera Manager Enterprise.
  • Extensively worked on configuring & administering YUM, RPMs, NFS, DNS, DHCP, and mail servers.

TECHNICAL SKILLS:

Big Data Technologies: HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Oozie, Zookeeper, Flume, Mongo-DB, Cassandra

Big Data Distributions: Cloudera, Hortonworks, Apache Hadoop

Scripting Languages: Shell, Bash

Monitoring tools: Grafana, Ganglia, Nagios, Ambari, Jenkins, Navigator, Netcool

Reporting Tools: ServiceNow, Tableau, Jaspersoft

Programming Languages: SQL, PL/SQL, Java

Automation Tools: Chef, Puppet

Operating Systems: Linux, UNIX, macOS, Windows NT/98/2000/XP/Vista/7/8

PROFESSIONAL EXPERIENCE:

Confidential, Irvine, California

Hadoop Administrator

Responsibilities:

  • Configuring, Maintaining, and Monitoring Hadoop Cluster using Cloudera Manager (CDH5) distribution
  • Upgraded Cloudera Manager from version 5.3 to 5.5
  • Involved in the Linux kernel upgrade process across the cluster
  • Balancing HDFS manually to decrease network utilization and increase job performance.
  • Rebalanced the cluster when production jobs were failing due to space issues on the DataNodes.
  • Developed script to check the Space utilization of local file system directories on Gateway servers and Master Nodes
  • Contributed to building hands-on tutorials for the community on setting up Hortonworks Data Platform (powered by Hadoop) and Hortonworks DataFlow (powered by NiFi)
  • Developed script to check the load of applications running from yarn backend on adhoc job basis
  • Debugged failed production jobs and provided resolutions
  • Checked NameNode and ResourceManager logs whenever the services were unavailable and jobs were failing
  • Developed an audit script to verify that Hive table data copied via DistCp between two clusters arrived without mismatches
  • Configuring Sqoop to import and export data from HDFS to RDBMS and vice-versa.
  • Handle the data exchange between HDFS & Web Applications and databases using Flume and Sqoop.
  • Extracting the data from the Hive tables for data analysis
  • Close monitoring and analysis of the Map Reduce job executions on cluster at task level.
  • Inputs to development regarding the efficient utilization of resources like memory and CPU utilization based on the running statistics of Map and Reduce tasks.
  • Changes to the configuration properties of the cluster based on volume of the data being processed and performance of the cluster
  • Supported technical team members in management and review of Hadoop log files and data backups.
  • Participated in development and execution of system and disaster recovery processes.
  • Monitoring Hadoop Cluster through Ambari and Implementing alerts based on Error messages. Providing reports to management on Cluster Usage Metrics.
  • Performance Tuning, Client/Server Connectivity and Database Consistency Checks using different Utilities.
  • Performed periodic maintenance and CDH upgrade, using the parcel method.
  • Performed application validation post every maintenance to ensure that the applications are working as expected.
  • Experience with multiple Hadoop distributions like Apache, Cloudera and Hortonworks.
  • Experience in securing Hadoop clusters using Kerberos and Sentry.
  • Configured various property files like core-site.xml, hdfs-site.xml, mapred-site.xml, and hadoop-env.sh based upon the job requirement.
  • Experience managing users and permissions on the cluster, using different authentication methods.
  • Involved in regular Hadoop Cluster maintenance such as patching security holes & updating system packages.
  • Commission and decommission the Data nodes from cluster in case of problems.
  • Setting up Identity, Authentication, and Authorization
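The space-utilization check described above can be sketched roughly as follows; the threshold, directory names, and `check_usage` helper are illustrative assumptions, not details from the original script:

```shell
#!/usr/bin/env bash
# Illustrative sketch of a space-utilization check on gateway/master nodes;
# the threshold and paths are assumptions, not values from the real script.
THRESHOLD=80

check_usage() {  # args: <directory> <used-percent>
  local dir=$1 usage=$2
  if [ "$usage" -ge "$THRESHOLD" ]; then
    echo "WARN: $dir at ${usage}% (threshold ${THRESHOLD}%)"
  else
    echo "OK: $dir at ${usage}%"
  fi
}

# In a real run the percentage would come from df, e.g.:
#   usage=$(df --output=pcent "$dir" | tail -1 | tr -dc '0-9')
check_usage /data 91
check_usage /var/log/hadoop 42

# When DataNode usage drifts apart, HDFS can then be rebalanced with, e.g.:
#   hdfs balancer -threshold 10
```

A cron entry would typically run such a check periodically and feed the WARN lines into an alerting channel.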

Confidential, California

Hadoop Administrator

Responsibilities:

  • Performed various configurations including networking and iptables, resolving hostnames, user accounts and file permissions, HTTP, FTP, and passwordless SSH login.
  • Responsible for scheduling and upgrading these servers throughout the year to the latest versions of software
  • Deployed Hadoop cluster of Hortonworks Distribution and installed ecosystem components.
  • Implemented authentication and authorization service using Kerberos authentication protocol.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Tuned the cluster by Commissioning and decommissioning the Data Nodes.
  • Monitored multiple Hadoop clusters environments using Nagios and Ganglia.
  • Implemented Fair scheduler on the job tracker to allocate the fair amount of resources to small jobs.
  • Implemented automatic failover zookeeper and zookeeper failover controller.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed Network file system for NameNode Metadata backup.
  • Performed a POC on cluster backup using DistCp, Cloudera Manager BDR, and parallel ingestion.
  • Configured and deployed hive metastore using MySQL and thrift server.
  • Development of Pig scripts for handling the raw data for analysis.
  • Maintained, audited and built new clusters for testing purposes using the Cloudera manager.
  • Deployed and configured flume agents to stream log events into HDFS for analysis.
  • Custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
  • Custom shell scripts for automating redundant tasks on the cluster.
  • Used Pig as ETL tool to do transformations, event joins and some pre-aggregations before storing the data into HDFS.
  • Troubleshoot and resolve issues with ETL/Data Ingest, Hive and Impala queries, Spark jobs and other related items by analyzing job logs and error files for Hadoop services in both Dev and Prod clusters.
  • Capacity planning and Architecture setup for Big Data applications.
  • Experience in doing benchmark testing and functionality testing for all services in Hadoop cluster while doing the cluster upgrades.
  • Building S3 buckets and managed policies for S3 buckets and used S3 bucket and Glacier for storage and backup on AWS
  • Collaborating with Linux and MySQL teams for OS level, security vulnerabilities patch implementations and to fix the Hadoop issues.
  • Hands on experience in installing, configuring MapR, Hortonworks clusters and installed Hadoop ecosystem components like Hadoop Pig, Hive, HBase, Sqoop, Kafka, Oozie, Flume, Zookeeper.
  • Installed Oozie workflow engine to run multiple Hive and Pig Jobs
  • Use of Sqoop to import and export data from HDFS to RDBMS and vice-versa.
  • Used Hive and created Hive external/internal tables and involved in data loading and writing Hive UDFs.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports.
  • Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
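The Sqoop import/export work described above might look like the following wrapper; the JDBC URL, user, table names, and target paths are placeholders, not values from any actual environment:

```shell
#!/usr/bin/env bash
# Hedged sketch of a Sqoop import wrapper; connection details and paths
# are illustrative placeholders.
DRY_RUN=${DRY_RUN:-1}   # default: print commands instead of executing them

run() {
  if [ "$DRY_RUN" -eq 1 ]; then echo "$*"; else "$@"; fi
}

for table in orders customers; do
  run sqoop import \
    --connect jdbc:mysql://db-host/sales \
    --username etl_user -P \
    --table "$table" \
    --target-dir "/data/raw/$table" \
    --num-mappers 4
done
```

The dry-run default makes the wrapper safe to review before `DRY_RUN=0` hands the commands to Sqoop for real; the reverse direction uses `sqoop export` with `--export-dir` in the same pattern.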

Confidential, LA, California

Hadoop Administrator

Responsibilities:

  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop
  • Setting up the machines with Network Control, Static IP, Disabled Firewalls, Swap memory
  • Managed the configuration of the cluster to meet the needs of analysis, whether I/O-bound or CPU-bound.
  • Set up Cloudera infrastructure, from cluster configuration down to individual nodes
  • Performance tune and manage growth of the O/S, disk usage, and network traffic
  • Responsible for building scalable distributed data solutions using Hadoop
  • Involved in loading data from the Linux file system to HDFS
  • End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
  • Communicated and worked with the individual application development groups, DBAs and the Operations
  • Created custom monitoring plugins for Nagios using UNIX shell scripting, and Perl.
  • Troubleshot all tools, maintained multiple servers, and provided backups for all file and script management servers.
  • Installing, Upgrading and Managing Hadoop Cluster on Cloudera distribution
  • Installed and configured Hadoop HDFS, MapReduce, Pig, Hive, and Sqoop.
  • Wrote Pig Scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
  • Exported analyzed data from HDFS to relational databases using Sqoop for generating reports.
  • Responsible to manage data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Imported data using Sqoop to load data from MySQL to HDFS on regular basis.
  • Screen Hadoop cluster job performances and capacity planning
  • Monitor Hadoop cluster connectivity and security
  • Manage and review Hadoop log files
  • Perform architecture design, data modeling, and implementation of Big Data platform and analytic applications for the consumer products
  • Assisted in configuring and deploying all virtual machines and provided backups for all configuration procedures.
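A custom Nagios plugin of the kind mentioned above could be sketched as a port check; the host, port, and service name here are illustrative, and the exit codes follow the standard Nagios convention (0 = OK, 2 = CRITICAL):

```shell
#!/usr/bin/env bash
# Sketch of a Nagios-style service check; host/port/service-name are
# illustrative assumptions, exit codes follow Nagios convention.
OK=0; CRITICAL=2

check_port() {  # args: <host> <port> <service-name>
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "OK - $3 reachable on $1:$2"
    return "$OK"
  else
    echo "CRITICAL - $3 not reachable on $1:$2"
    return "$CRITICAL"
  fi
}

# 8020 is the default NameNode RPC port; '|| true' keeps this demo from
# exiting non-zero when no NameNode is running locally.
check_port localhost 8020 NameNode || true
```

Nagios invokes such a script on a schedule and raises an alert whenever the exit status is non-zero.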

Confidential

Linux/Hadoop Systems Administrator

Responsibilities:

  • Installed, configured, and administered Tripwire on RedHat 5/6/7, Windows Server, and CentOS operating systems.
  • Involved in Installing, Configuring and Upgrading of RedHat Linux AS 4/5/6, Solaris 9/10/11 operating systems.
  • Installation, Maintenance, Administration and troubleshooting of Red Hat Enterprise Linux 5/6 and Solaris 9/10 systems.
  • Remote monitoring and management of server hardware.
  • Involved in Server sizing and identifying, recommending optimal server hardware based on User requirements.
  • Configure and maintain Windows 2003 Server that runs DNS, DHCP and active directory services.
  • Wrote UNIX Shell, Perl, and bash scripts to perform admin tasks.
  • Used RPMs to install, update, verify, query and erase packages from Linux Servers.
  • Performed automated operating system installations using Kickstart for Red Hat Enterprise Linux 5/6 and JumpStart for Solaris 9/10.
  • Monitored systems and services; handled architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Changed configurations as required for better job performance
  • Performed various configurations including networking and iptables, resolving hostnames, and passwordless SSH login.
  • Maintained and configured VMware ESX server and vCenter via web access.
  • Upgraded/Installed BMC patrol agents on all servers including Redhat Linux, Solaris & AIX in our environment.
  • Utilized AWS to set up Virtual Private Clouds (VPCs) for Management, Production and Testing environments with customer operational requirements and parameters (e.g., internet gateway, subnets, elastic IP, and Security Groups).
  • Worked as Hadoop Administrator/Architect on 100-node clusters ranging from POC clusters to production with Hortonworks Distribution 2.2.4.2.
  • Provided architectural solutions for implementing security within the clusters and when accessing them from Hadoop-integrated tools.
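RPM-based package auditing, as described above, might look like the following; the package list is an illustrative assumption:

```shell
#!/usr/bin/env bash
# Sketch of a package-verification script using rpm queries; the package
# list is an illustrative assumption, not a real baseline.
check_pkg() {  # prints installed/MISSING for one package name
  if rpm -q "$1" >/dev/null 2>&1; then
    echo "installed: $1"
  else
    echo "MISSING: $1"
  fi
}

for pkg in openssh-server rsyslog ntp; do
  check_pkg "$pkg"
done
```

On a RedHat-family host the same pattern extends naturally to `rpm -V` for verifying file integrity of installed packages.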

Confidential

Linux Administrator

Responsibilities:

  • Installing, upgrading, and configuring Red Hat Enterprise Linux 6.5 and 6.9 on HP and Dell servers
  • Performed data-center operations including rack mounting and cabling.
  • Knowledge in designing, installing, configuring and maintenance of Enterprise Networks using Cisco routers, Catalyst Switches and Load Balancers, Cisco Firewalls.
  • Managed cron jobs, at jobs, batch processing, and job scheduling.
  • Installation, configuration, and maintenance of Samba, Apache Tomcat, WebSphere, and JBoss servers in AIX and Linux environments.
  • Documented various regular administrative tasks and backup procedures
  • Installed and configured Sudo for users to access the root privileges.
  • Strong knowledge of installing, configuring, managing, and maintaining monitoring tools such as Nagios, HP OpenView, and SolarWinds.
  • Knowledge of configuring and managing Brocade 300 Switches
  • Supported the BBG online cloud storage services on Amazon AWS by creating virtual servers running Linux (Ubuntu and RedHat) and Windows.
  • Hands-on experience in installing, managing, and troubleshooting LAMP stack
  • Strong knowledge in installing and managing Vcenter server and vCSA.
  • Deployed and troubleshot VMs & ESX hosts using Virtual Center Server
  • Created and managed virtual switches, DV-switches, and port groups in the virtual environment
  • Oversee and manage POC for Red Hat Satellite Server infrastructure.
  • Troubleshoot and resolve servers, workstations, and network-related issues in a complex Health Information system setting while meeting demands of a challenging environment.
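The cron scheduling mentioned above can be made idempotent with a small merge helper; the script path and schedule below are illustrative placeholders:

```shell
#!/usr/bin/env bash
# Sketch of idempotent cron scheduling; the entry is an illustrative
# placeholder, not a real job from this environment.
ENTRY="30 2 * * * /usr/local/bin/nightly_backup.sh"

merge_cron() {  # args: <existing-crontab-text> <new-entry>; prints merged list
  printf '%s\n' "$1" | grep -Fxv "$2"   # drop any existing copy of the entry
  printf '%s\n' "$2"                    # then append exactly one copy
}

# Installing for real would pipe the merged list back into crontab:
#   merge_cron "$(crontab -l 2>/dev/null)" "$ENTRY" | crontab -
merge_cron "0 * * * * /usr/local/bin/health_check.sh" "$ENTRY"
```

Filtering with `grep -Fxv` before appending means the script can run repeatedly without duplicating the crontab entry.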
