
Hadoop Administrator Resume

SUMMARY

  • Over 7 years of IT experience, including Big Data ecosystem and Linux/Unix administration.
  • Experience in Installing, Configuring, Supporting and Managing Cloudera and Hortonworks Hadoop platforms.
  • Experience in Installing and Configuring Hadoop ecosystem components such as HDFS, MapReduce, Tez, Hive, YARN, Sqoop, Flume, Oozie, Pig, HBase, Scala, Spark, Mahout, Knox and Kafka.
  • Worked with Ambari and Cloudera Manager for Monitoring and Administering multi-node Hadoop clusters.
  • Expert in Administering and Maintaining Hortonworks Hadoop clusters across Production, UAT, Development and DR (Disaster Recovery) environments.
  • Experience in Administering Linux/Unix systems to deploy Hadoop clusters.
  • Expertise in Monitoring multiple Hadoop Clusters environments using Ambari Metrics and Grafana.
  • Configuring NameNode, ResourceManager and HiveServer2 High Availability and cluster service coordination using ZooKeeper.
  • Performing Backup, Disaster Recovery, Root Cause Analysis for Hadoop and Troubleshooting Hadoop Cluster issues.
  • Importing and Exporting Data between Relational Database Systems (RDBMS) and HDFS using Sqoop (see the Sqoop sketch after this list).
  • Configuring Flume for streaming data into HDFS.
  • Hands-on experience in Configuring Kerberos for Authentication (a ticket-validation sketch follows this list).
  • Expert in Configuring Authorization using Ranger for HDFS & Hive and defining Ranger policies.
  • Good knowledge of Benchmarking Hadoop clusters using standard Hadoop benchmark tools.
  • Experience in maintaining and monitoring Kerberized clusters.
  • Proficient in performing Minor and Major Upgrades, Commissioning and Decommissioning of data nodes on Hadoop cluster.
  • Experience in integrating Hadoop Cluster components with the LDAP, Active Directory and enabling SSL for Hadoop Cluster Components.
  • Experience in writing Hive queries using Hive Query Language (HiveQL).
  • Strong working knowledge in writing queries in SQL.
  • Experience in monitoring the kerberized Hadoop cluster using Grafana, Nagios, Ganglia.
  • Working knowledge with NoSQL databases such as HBase, MongoDB, Cassandra.
  • Working knowledge of installing and configuring the complete Hadoop ecosystem on AWS EC2 instances.
  • Experience in Installation, Configuration, Backup, Recovery, Maintenance, Support of Sun Solaris and Linux.
  • Working knowledge of configuring Apache NiFi on a Kerberized cluster.
  • Experience in building Apache Kafka cluster and integrating with Apache Storm for real time data analysis.
  • Working knowledge in analyzing large data sets using Hive, Pig Latin, HBase and Custom MapReduce programs in Java.
  • Installing, Upgrading and Configuring Red Hat Linux using Kickstart Servers and Interactive Installation.
  • Very good knowledge of Centrify. Knowledge of CI/CD and configuration management tools such as Jenkins, Puppet, Chef and Ansible.
  • Extensive knowledge on Hadoop Application Frameworks like Tez and Spark.
  • Experience with System Integration, Capacity Planning, Performance Tuning, System Monitoring, System Security and Load Balancing.
  • Excellent knowledge in YUM and RPM Package Administration for Installing, Upgrading and Checking dependencies.
  • Working knowledge on AWS cloud and familiar with Azure cloud.
  • Familiar with data analytics tools such as Cognos and Tableau.
  • Very good knowledge of Java programming. Familiar with Python.
  • Knowledge of sudoers configuration and administration in both Solaris and Red Hat Linux environments.
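
A minimal sketch of the Sqoop import/export pattern referenced above; the JDBC URL, database, table names and HDFS paths are placeholders, not values from any real environment.

```bash
# Import a MySQL table into HDFS as text files, using 4 parallel mappers
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user -P \
  --table customers \
  --target-dir /data/raw/customers \
  --num-mappers 4

# Export processed HDFS data back into a relational table
sqoop export \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user -P \
  --table customers_agg \
  --export-dir /data/curated/customers_agg
```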
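
A minimal sketch of validating authentication on a Kerberized cluster as described above; the keytab path, principal, realm and HiveServer2 host are assumed placeholders.

```bash
# Obtain a Kerberos ticket from a service keytab (placeholder principal/realm)
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM
klist   # confirm the ticket was granted

# HDFS operations now authenticate with the Kerberos ticket
hdfs dfs -ls /user

# Connect to HiveServer2 using its Kerberos service principal
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM"
```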

TECHNICAL SKILLS

Hadoop Distribution: Hortonworks, Cloudera, MapR.

Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, RabbitMQ, ZooKeeper, HBase, Spark, Hue, Apache SOLR, Knox, Kafka, Mahout, Apache NiFi, Scala.

Configuration Management: Ambari, Cloudera Manager.

Security: Kerberos, Ranger, Centrify.

Languages: Java, SQL, Shell scripting (bash), HiveQL, HTML.

Databases: MySQL, Microsoft SQL Server, DB2, Oracle.

NoSQL Databases: HBase, MongoDB, Cassandra.

Monitoring Tools: Grafana, Ganglia, Nagios.

Package Management: RPM, YUM.

Networking and Protocols: NFS, HTTP, FTP, NIS, LDAP, TCP/IP, DNS.

Other Relevant Tools: JIRA, Pivotal Tracker.

Operating Systems: RHEL, Ubuntu, CentOS, Solaris, Windows.

PROFESSIONAL EXPERIENCE

Hadoop Administrator

Confidential

Responsibilities:

  • Installing, Configuring, Maintaining and Monitoring Hortonworks Hadoop Cluster.
  • Very good knowledge on building scalable distributed data solutions using Hadoop.
  • Experience in installing all the Hadoop ecosystem components through Ambari and manually through the command line interface.
  • Performed Major and Minor upgrades in the production environment and followed standard backup policies to ensure high availability of the cluster.
  • Importing and Exporting Data between different RDBMS such as MySQL, Oracle, Teradata and DB2 and HDFS using Sqoop.
  • Changing the configuration properties of the cluster based on volume of the data being processed and performance of the cluster.
  • Implemented Centrify AD-integrated Kerberos on the Hortonworks HDP cluster.
  • Used Ganglia and Nagios for monitoring the Hadoop cluster around the clock.
  • Proficient in Monitoring Workload, Job Performance and Capacity Planning through Ambari.
  • Extensively worked on commissioning and decommissioning of cluster nodes, file system integrity checks and maintaining cluster data replication.
  • Experience in collecting metrics and monitoring Hadoop clusters using Grafana and Ambari Metrics.
  • Worked with storage team in adding storage to production Database, backup servers (SRDF/BCV) and DR servers.
  • Commissioning and Decommissioning of DataNodes in case of DataNode failure.
  • Enabled Kerberos for Hadoop cluster Authentication and integrated with Active Directory for managing users and application groups.
  • Capacity planning for Hadoop Clusters on AWS Cloud Infrastructure.
  • Explored Spark, Kafka and Storm along with other open source projects to create a real-time analytics framework.
  • Involved in developing machine learning algorithms using Mahout for data mining on data stored in HDFS.
  • Involved in implementing multiple MapReduce programs in Java for Data Analysis.
  • Installed SOLR on the existing Hadoop cluster.
  • Worked with the Systems Administration team to set up new Hadoop clusters and expand existing Hadoop clusters.
  • Responsible for Cluster Planning, Maintenance, Monitoring, Troubleshooting, Manage and review data backups.
  • Setting up Identity, Authentication through Kerberos and Authorization through Ranger.
  • Worked on integrating the LDAP server and Active Directory with Ambari through the command line interface.
  • Installed, configured and administered a 6-node Elasticsearch Cluster.
  • Adding, Deleting and setting up permissions for new users to Hadoop environment.
  • Migrated data across clusters using DistCp (see the DistCp sketch after this list).
  • Worked with release management technologies such as Jenkins, GitHub, GitLab and Ansible.
  • Setting up and managing High Availability of the NameNode using the Quorum Journal Manager to avoid single points of failure in large clusters.
  • Experience in data ingestion from Oracle, SAP HANA, Teradata and data warehouses into the Hadoop ecosystem.
  • Cluster coordination services through ZooKeeper.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce to load data into HDFS.
  • Coordinated with Hortonworks support team through support portal to sort out the critical issues during upgrades.
  • Involved in data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud.
  • Installed the Oozie workflow engine to schedule multiple MapReduce, Hive and Pig jobs.
  • Analyzed the data by performing Hive queries (HiveQL) to generate reports and running Pig scripts (Pig Latin) to study customer behavior.
  • Experience in managing and reviewing Hadoop log files.
  • Developed scripts for tracking changes in file permissions of files and directories through HDFS audit logs (a sketch follows this list).
  • Configured Apache NiFi on the existing Hadoop cluster.
  • Worked on smoke tests of each service and client upon installation and configuration.
  • Involved in loading data from UNIX file system to HDFS.
  • Involved in developing multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Balancing HDFS manually to decrease network utilization and increase job performance (see the balancer sketch after this list).
  • Set up automated processes to archive and clean unwanted data on the cluster, in particular on HDFS and the local file system.
  • Worked in a DevOps model with Continuous Integration and Continuous Deployment (CI/CD), automating deployments using Jenkins and Ansible.
  • Debugging and troubleshooting the issues in development and Test environments.
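
A minimal sketch of the DistCp migration mentioned above; the NameNode hosts, ports and paths are placeholders.

```bash
# Copy a directory from the source cluster to the target cluster,
# updating changed files and preserving block size and permissions
hadoop distcp \
  -update \
  -pbp \
  hdfs://source-nn.example.com:8020/data/events \
  hdfs://target-nn.example.com:8020/data/events
```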
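
A minimal sketch of manual HDFS balancing as mentioned above; the bandwidth cap and threshold are example values, not tuned settings from a real cluster.

```bash
# Cap the bandwidth each DataNode may use for balancing (bytes per second)
hdfs dfsadmin -setBalancerBandwidth 104857600   # ~100 MB/s

# Run the balancer until DataNode utilization is within 10% of the cluster average
hdfs balancer -threshold 10
```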
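
A minimal sketch of the permission-change tracking mentioned above; the audit log path and report location are assumptions and vary by distribution and configuration.

```bash
#!/bin/bash
# Track permission/ownership changes recorded in the HDFS audit log.
# Both paths below are assumptions; they differ per distribution and config.
AUDIT_LOG=/var/log/hadoop/hdfs/hdfs-audit.log
REPORT=/var/log/hadoop/hdfs/perm-changes-$(date +%F).log

# Keep only audit entries for permission, ownership or ACL changes
grep -E 'cmd=(setPermission|setOwner|setAcl)' "$AUDIT_LOG" >> "$REPORT"
```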

Environment: Hortonworks, HDFS, Hive, NiFi, Pig, Sqoop, Spark, HBase, Ambari, ZooKeeper, Oozie, Grafana, Ganglia, Nagios, Shell Scripting, Linux/Unix, MySQL, Oracle, Mahout, HiveQL, Knox, Ansible, Storm, Centrify, AWS, Elasticsearch, MapReduce, Kafka, Java, Yarn, SOLR, Kerberos, Ranger.

Hadoop Administrator

Confidential - IL

Responsibilities:

  • Installing, Configuring and Maintaining Apache Hadoop and Cloudera Hadoop clusters.
  • Installing Hadoop components such as HDFS, Hive, Pig, HBase, ZooKeeper and Sqoop through Cloudera Manager.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions (see the health-check sketch after this list).
  • Managing and scheduling Jobs on a Hadoop cluster.
  • Worked on providing user support and application support through the Remedy ticket management system on the Hadoop Infrastructure.
  • Involved in gathering business requirements and analysis of business use cases.
  • Installed and configured Hadoop cluster in Development, Testing and Production environments.
  • Performed both major and minor upgrades to the existing CDH cluster.
  • Responsible for monitoring and supporting Development activities.
  • Installation of various Hadoop Ecosystems and Hadoop Daemons.
  • Developed scripts to delete empty Hive tables existing in the Hadoop file system.
  • Understanding the existing Enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop and Pig Latin.
  • Implemented NameNode metadata backup using NFS for High Availability.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
  • Wrote shell scripts for routine day-to-day processes and automated them using crontab.
  • Collected log data from web servers and ingested it into HDFS using Flume (a Flume agent sketch follows this list).
  • Worked on setting up high availability for major production cluster and designed automatic failover.
  • Tuned the Hadoop cluster to achieve higher performance.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Created users and maintained user groups and ensured to provide access to the required Hadoop clients.
  • Enabled Kerberos for Hadoop cluster Authentication and integrated with Active Directory for managing users and application groups.
  • Used Ganglia and Nagios for monitoring the cluster around the clock.
  • Developed multiple jobs to smoke test the services upon upgrades and installation.
  • Developed Hive queries and UDFs to analyze the data in HDFS.
  • Performed Analyzing/Transforming data with Hive and Pig.
  • Configured log4j.properties files to define the storage policy of all the service logs.
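
A minimal sketch of the daemon health-check script mentioned above; the service list, alert address and use of the mail command are assumptions.

```bash
#!/bin/bash
# Check that core Hadoop daemons are running on this node and alert if not.
SERVICES="NameNode DataNode ResourceManager NodeManager"
ALERT_TO="hadoop-admins@example.com"   # placeholder alert address

for svc in $SERVICES; do
  # jps lists running Hadoop JVMs by class name
  if ! jps | grep -qw "$svc"; then
    echo "$(hostname): $svc not running at $(date)" \
      | mail -s "Hadoop daemon alert: $svc down" "$ALERT_TO"
  fi
done
```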
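
A minimal sketch of a Flume agent tailing a web server log into HDFS, as referenced above; the agent name, file paths and NameNode URI are placeholder assumptions.

```bash
# Write a simple Flume agent config (exec source -> memory channel -> HDFS sink)
cat > /etc/flume/conf/weblog-agent.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode.example.com:8020/data/weblogs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
EOF

# Start the agent
flume-ng agent --conf /etc/flume/conf \
  --conf-file /etc/flume/conf/weblog-agent.conf \
  --name a1 -Dflume.root.logger=INFO,console
```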

Environment: Apache Hadoop, Cloudera Distribution, Yarn, Spark, Sqoop, DB2, Hive, Nagios, Knox, AWS, bash, Pig, HBASE, RHEL, Ganglia, Kerberos, Oozie, Flume, MapReduce.

Linux Administrator

Confidential

Responsibilities:

  • Installing, Configuring and Administering Red Hat Linux servers and Solaris servers.
  • Performing Installation, Patching and Upgrading of packages and the OS through RPM and YUM server management.
  • Administration of RHEL which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
  • Installation, configuration, administration of Solaris 10 on SPARC based servers using Jumpstart.
  • Managing systems routine backup, scheduling jobs like disabling and enabling cron jobs, enabling system logging, network logging of servers for maintenance, performance tuning, testing.
  • Installation of Solaris 10 on Sun Fire and Enterprise server using Jumpstart, custom configuration like installing packages, configuring services.
  • Troubleshooting Linux network and security related issues and capturing packets, using tools such as iptables, firewalls, TCP wrappers and Nmap.
  • Writing shell scripts for Automated Backups and Cron Jobs using bash (see the backup sketch after this list).
  • Monitored server and application performance & tuning via various stat commands (vmstat, nfsstat, iostat etc.) and tuned I/O, memory, etc.
  • Experienced in Troubleshooting critical hardware and software issues and other day-to-day user trouble tickets.
  • Configured and installed Linux Servers on both virtual machine and bare metal Installation.
  • Creating physical volumes, volume groups and logical volumes (an LVM sketch follows this list).
  • Experience on various cloud computing products such as Amazon Web Services EC2 and Microsoft Azure.
  • Performed Troubleshooting NFS, NIS, Samba, DNS, DHCP, LDAP, MySQL and network problems.
  • Creating and maintaining user accounts, profiles, security, rights, disk space and process monitoring.
  • Maintained the user's accounts in NIS environment.
  • Provided support for both physical and virtual environments.
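
A minimal sketch of an automated backup script scheduled with cron, as referenced above; the paths, retention period and crontab entry are example assumptions.

```bash
#!/bin/bash
# Archive /etc nightly and prune old archives; all paths are placeholders.
SRC=/etc
DEST=/backup/$(hostname)
STAMP=$(date +%Y%m%d)

mkdir -p "$DEST"
tar -czf "$DEST/etc-$STAMP.tar.gz" "$SRC"

# Drop archives older than 14 days
find "$DEST" -name 'etc-*.tar.gz' -mtime +14 -delete

# Example crontab entry to run it nightly at 01:30:
#   30 1 * * * /usr/local/bin/etc-backup.sh >> /var/log/etc-backup.log 2>&1
```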
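
A minimal sketch of the LVM workflow referenced above; the device name, sizes and mount point are placeholder assumptions.

```bash
pvcreate /dev/sdb                      # initialize the disk as a physical volume
vgcreate vg_data /dev/sdb              # create a volume group on it
lvcreate -n lv_app -L 50G vg_data      # carve out a 50 GB logical volume
mkfs.ext4 /dev/vg_data/lv_app          # create a filesystem
mkdir -p /app && mount /dev/vg_data/lv_app /app

# Grow the volume later and resize the filesystem online
lvextend -L +20G /dev/vg_data/lv_app
resize2fs /dev/vg_data/lv_app
```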

Environment: RHEL Servers, Solaris, NFS, NIS, Samba, bash, DNS, DHCP, LDAP, YUM, RPM, LVM, TCP/IP, RAID.

Linux Administrator

Confidential

Responsibilities:

  • Installation, Maintenance and Administration of Red Hat Enterprise Linux and Solaris.
  • Managing and Supporting User Accounts, such as Troubleshooting users' login and home directory issues, resetting passwords and unlocking user accounts.
  • Performed automated installations of the Operating System using Kickstart for Red Hat Enterprise Linux 5/6 (a Kickstart sketch follows this list).
  • Remote Monitoring and Management of server hardware.
  • Installing, Updating patches to the servers using Red Hat Satellite server.
  • Involved in Server sizing and identifying, recommending optimal server hardware based on User requirements.
  • Created File systems from local storage as well as NFS partitions for application portability.
  • Working on Volume management, Disk Management, software RAID solutions using VERITAS Volume manager & Solaris Volume Manager.
  • File system Tuning and growing using VERITAS File System (VxFS), coordinated with SAN Team for storage allocation and Disk Dynamic Multi path.
  • Installing and administering Red Hat virtual machines on a KVM-based hypervisor.
  • RPM and YUM package installations, patch and other server management.
  • Managing systems routine backup, scheduling jobs like disabling and enabling cron jobs, enabling system logging, network logging of servers for maintenance, performance tuning, testing.
  • Installed, configured and maintained MySQL and Oracle 10g on Red Hat Linux.
  • Configuring multipath, adding SAN storage and creating physical volumes, volume groups and logical volumes.
  • Installing and configuring Apache and supporting it on Linux production servers.
  • Troubleshooting Linux network and security related issues and capturing packets using tools such as iptables, firewalls and TCP wrappers.
  • Worked on resolving production issues and documenting Root Cause Analysis and updating the tickets using Service Now.
  • Monitored server and application performance & tuning via various stat commands (vmstat, nfsstat, iostat etc.) and tuned I/O, memory etc.
  • Coordinated with the Storage, Network and Hardware teams on server provisions (HP Proliant DL385/585 G5, G6 servers, and HP BL 465C G5 blade centers).
  • Experienced in Troubleshooting critical hardware and software issues and other day-to-day user trouble tickets.
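
A minimal sketch of a Kickstart setup for automated RHEL installs, as referenced above; the repository URL, partitioning, timezone and password hash are placeholder assumptions.

```bash
# Write a base Kickstart file on the provisioning server (placeholder values)
cat > /var/www/html/ks/rhel6-base.cfg <<'EOF'
install
url --url=http://repo.example.com/rhel6/os/x86_64
lang en_US.UTF-8
keyboard us
rootpw --iscrypted $6$examplehash
timezone America/Chicago
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot

%packages
@base
@core
%end
EOF

# Point the PXE/boot line at the file, e.g.:
#   ks=http://repo.example.com/ks/rhel6-base.cfg
```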

Environment: Red Hat Linux Servers, YUM, RPM, RAID, NFS, NIS, Samba, DNS, DHCP, LDAP, LVM.
