
Hadoop Administrator Resume - WI

PROFESSIONAL SUMMARY:

  • Over 7 years of IT experience, including Big Data ecosystem and Linux/Unix administration.
  • Experience in Installing, Configuring, Supporting and Managing Cloudera and Hortonworks Hadoop platforms.
  • Experience in Installing and Configuring Hadoop ecosystem components such as HDFS, MapReduce, Tez, Hive, YARN, Sqoop, Flume, Oozie, Pig, HBase, Spark, Scala, Mahout, Knox and Kafka.
  • Worked with Ambari and Cloudera Manager for Monitoring and Administering multi-node Hadoop clusters.
  • Expert in Administering and Maintaining Hortonworks Hadoop clusters across Production, UAT, Development and DR (Disaster Recovery) environments.
  • Experience in Administering Linux/Unix systems to deploy Hadoop clusters.
  • Expertise in Monitoring multiple Hadoop Clusters environments using Ambari Metrics and Grafana.
  • Configuring NameNode, ResourceManager and HiveServer2 High Availability and cluster service coordination using ZooKeeper.
  • Performing Backup, Disaster Recovery, Root Cause Analysis for Hadoop and Troubleshooting Hadoop Cluster issues.
  • Importing and Exporting Data between Relational Database Management Systems (RDBMS) and HDFS using Sqoop.
  • Configuring Flume for streaming data into HDFS.
  • Hands-on experience in Configuring Kerberos for Authentication (a short verification sketch follows this list).
  • Expert in Configuring Authorization using Ranger for HDFS & Hive and defining Ranger policies.
  • Good knowledge of Benchmarking Hadoop clusters using standard Hadoop benchmarking tools.
  • Experience in maintaining and monitoring Kerberized clusters.
  • Proficient in performing Minor and Major Upgrades, Commissioning and Decommissioning of data nodes on Hadoop cluster.
  • Experience in integrating Hadoop Cluster components with the LDAP, Active Directory and enabling SSL for Hadoop Cluster Components.
  • Experience in writing Hive queries using the Hive Query Language (HiveQL).
  • Strong working knowledge in writing queries in SQL.
  • Experience in monitoring Kerberized Hadoop clusters using Grafana, Nagios and Ganglia.
  • Working knowledge with NoSQL databases such as HBase, MongoDB, Cassandra.
  • Working knowledge in installing and configuring the complete Hadoop ecosystem on AWS EC2 instances.
  • Experience in Installation, Configuration, Backup, Recovery, Maintenance, Support of Sun Solaris and Linux.
  • Working knowledge in configuring Apache NiFi on Kerberized clusters.
  • Experience in building Apache Kafka cluster and integrating with Apache Storm for real time data analysis.
  • Working knowledge in analyzing large data sets using Hive, Pig Latin, HBase and Custom MapReduce programs in Java.
  • Installing, Upgrading and Configuring Red Hat Linux using Kickstart Servers and Interactive Installation.
  • Very good knowledge of Centrify. Knowledge of DevOps and CI/CD tools such as Puppet, Chef, Jenkins and Ansible.
  • Extensive knowledge on Hadoop Application Frameworks like Tez and Spark.
  • Experience with System Integration, Capacity Planning, Performance Tuning, System Monitoring, System Security and Load Balancing.
  • Excellent knowledge in YUM and RPM Package Administration for Installing, Upgrading and Checking dependencies.
  • Working knowledge on AWS cloud and familiar with Azure cloud.
  • Familiar with data analytics tools such as Cognos and Tableau.
  • Very good knowledge of Java programming; familiar with Python.
  • Knowledge of sudoers configuration and administration in both Solaris and Red Hat Linux environments.
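
A short verification sketch for the Kerberos point above: these are the standard commands for confirming that authentication works on a Kerberized cluster from an edge node. The principal, keytab path and realm are placeholders, not values from any specific environment.

      # Obtain a Kerberos ticket from a keytab (principal and keytab path are placeholders)
      kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM

      # Confirm the ticket was granted and check its expiry
      klist

      # Simple smoke test: HDFS commands succeed only with a valid ticket
      hdfs dfs -ls /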

TECHNICAL SKILLS:

Hadoop Distribution: Hortonworks, Cloudera, MapR.

Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, RabbitMQ, ZooKeeper, HBase, Spark, Hue, Apache SOLR, Knox, Kafka, Mahout, Apache NiFi, Scala.

Configuration Management: Ambari, Cloudera Manager.

Security: Kerberos, Ranger, Centrify.

Languages: Java, SQL, Shell Scripting (bash), HiveQL, HTML.

Databases: MySQL, MS SQL Server, DB2, Oracle.

NoSQL Databases: HBase, MongoDB, Cassandra.

Monitoring Tools: Grafana, Ganglia, Nagios.

Package Management: RPM, YUM.

Networking and Protocols: NFS, HTTP, FTP, NIS, LDAP, TCP/IP, DNS.

Other Relevant Tools: JIRA, Pivotal Tracker.

Operating Systems: RHEL, Ubuntu, CentOS, Solaris, Windows.

WORK EXPERIENCE:

Hadoop Administrator

Confidential, WI

Responsibilities:

  • Installing, Configuring, Maintaining and Monitoring Hortonworks Hadoop Cluster.
  • Very good knowledge of building scalable distributed data solutions using Hadoop.
  • Experience in installing all the Hadoop ecosystem components through Ambari and manually through the command-line interface.
  • Performed Major and Minor upgrades in the production environment and followed standard backup policies to ensure high availability of the cluster.
  • Imported and Exported Data between HDFS and RDBMSs such as MySQL, Oracle, Teradata and DB2 using Sqoop (see the Sqoop sketch after this list).
  • Tuned cluster configuration properties based on the volume of data being processed and the performance of the cluster.
  • Implemented Centrify AD-integrated Kerberos on the Hortonworks HDP cluster.
  • Used Ganglia and Nagios for monitoring the Hadoop cluster around the clock.
  • Proficient in Monitoring Workload, Job Performance and Capacity Planning through Ambari.
  • Extensively worked on commissioning and decommissioning of cluster nodes, file system integrity checks and maintaining cluster data replication.
  • Experience in collecting metrics and monitoring Hadoop clusters using Grafana and Ambari Metrics.
  • Worked with storage team in adding storage to production Database, backup servers (SRDF/BCV) and DR servers.
  • Commissioned and Decommissioned DataNodes in case of node failures.
  • Enabled Kerberos for Hadoop cluster Authentication and integrated it with Active Directory for managing users and application groups.
  • Capacity planning for Hadoop clusters on AWS Cloud infrastructure.
  • Explored Spark, Kafka and Storm along with other open source projects to create a real-time analytics framework.
  • Developed Hive scripts on Avro and Parquet file formats.
  • Involved in developing machine learning algorithms using Mahout for data mining on data stored in HDFS.
  • Involved in implementing multiple MapReduce programs in Java for Data Analysis.
  • Installed SOLR on the existing Hadoop cluster.
  • Worked with Systems Administrating team to setup new Hadoop Cluster and expand existing Hadoop clusters.
  • Responsible for Cluster Planning, Maintenance, Monitoring, Troubleshooting, Manage and review data backups.
  • Setting up Identity, Authentication through Kerberos and Authorization through Ranger.
  • Integrated the LDAP server and Active Directory with Ambari through the command-line interface.
  • Developed Spark programs for Batch and Real-Time Processing that consume incoming streams of data from Kafka, transform them into DataFrames and load those DataFrames into Hive and HDFS.
  • Installed, configured and administered a 6-node Elasticsearch cluster.
  • Adding, Deleting and setting up permissions for new users to Hadoop environment.
  • Migrated data across clusters using DistCp (see the DistCp sketch after this list).
  • Worked with release management technologies such as Jenkins, GitHub, GitLab and Ansible.
  • Set up and managed NameNode High Availability using the Quorum Journal Manager to avoid single points of failure in large clusters.
  • Experience in data ingestion from Oracle, SAP HANA, Teradata and enterprise data warehouses into the Hadoop ecosystem.
  • Cluster coordination services through ZooKeeper.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce to load data into HDFS.
  • Coordinated with Hortonworks support team through support portal to sort out the critical issues during upgrades.
  • Involved in data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud.
  • Installed the Oozie workflow engine to schedule multiple MapReduce, Hive and Pig jobs.
  • Analyzed the data by performing Hive queries (HiveQL) to generate reports and running Pig scripts (Pig Latin) to study customer behavior.
  • Experience in managing and reviewing Hadoop log files.
  • Developed scripts to track changes in file permissions of files and directories through the HDFS audit logs (see the audit-log sketch after this list).
  • Configured Apache NiFi on the existing Hadoop cluster.
  • Worked on smoke tests of each service and client upon installation and configuration.
  • Involved in loading data from UNIX file system to HDFS.
  • Involved in developing multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Balanced HDFS manually to decrease network utilization and increase job performance (see the balancer sketch after this list).
  • Set up automated processes to archive and clean unwanted data on the cluster, in particular on HDFS and the local file system.
  • Worked in a DevOps model with Continuous Integration and Continuous Deployment (CI/CD), automating deployments using Jenkins and Ansible.
  • Debugging and troubleshooting the issues in development and Test environments.
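
A hedged sketch of the Sqoop import/export referenced above; the JDBC URL, credentials, table names and HDFS directories are illustrative placeholders.

      # Import a table from MySQL into HDFS (connection details are placeholders)
      sqoop import \
        --connect jdbc:mysql://dbhost.example.com:3306/sales \
        --username etl_user -P \
        --table orders \
        --target-dir /data/raw/orders \
        --num-mappers 4

      # Export processed data from HDFS back into the RDBMS
      sqoop export \
        --connect jdbc:mysql://dbhost.example.com:3306/sales \
        --username etl_user -P \
        --table orders_summary \
        --export-dir /data/curated/orders_summary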
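
The cross-cluster migration mentioned above comes down to a DistCp run; a minimal example, with both NameNode addresses as placeholders.

      # Copy a directory from the source cluster to the target/DR cluster,
      # updating changed files and preserving file attributes
      hadoop distcp -update -p \
        hdfs://source-nn.example.com:8020/data/warehouse \
        hdfs://target-nn.example.com:8020/data/warehouse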
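
A simplified sketch of the audit-log tracking described above; the audit log location is typical for an HDP installation but is an assumption here, and the watched path is a placeholder.

      #!/bin/bash
      # Report permission and ownership changes on a watched HDFS path
      # by filtering the NameNode audit log.
      AUDIT_LOG=/var/log/hadoop/hdfs/hdfs-audit.log   # assumed location
      WATCH_PATH=/data/secure                         # placeholder
      REPORT=/tmp/hdfs_permission_changes_$(date +%F).log

      grep -E 'cmd=(setPermission|setOwner)' "$AUDIT_LOG" \
        | grep "src=${WATCH_PATH}" >> "$REPORT"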
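
Manual HDFS rebalancing, as noted above, is usually a two-command task; the bandwidth and threshold values below are illustrative.

      # Cap the balancer's network usage (value is bytes/second, here ~100 MB/s)
      hdfs dfsadmin -setBalancerBandwidth 104857600

      # Rebalance until every DataNode is within 10% of average utilization
      hdfs balancer -threshold 10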

Environment: Hortonworks, HDFS, Hive, NiFi, Pig, Sqoop, Spark, HBase, Ambari, ZooKeeper, Oozie, Grafana, Ganglia, Nagios, Shell Scripting, Linux/Unix, MySQL, Oracle, Mahout, HiveQL, Knox, Ansible, Storm, Centrify, AWS, Elasticsearch, MapReduce, Kafka, Java, Yarn, SOLR, Kerberos, Ranger.

Hadoop Administrator

Confidential, IL

Responsibilities:

  • Installing, Configuring and Maintaining Apache Hadoop and Cloudera Hadoop clusters.
  • Installed Hadoop components such as HDFS, Hive, Pig, HBase, ZooKeeper and Sqoop through Cloudera Manager.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions (see the health-check sketch after this list).
  • Managing and scheduling Jobs on a Hadoop cluster.
  • Provided user and application support through the Remedy ticket management system on the Hadoop infrastructure.
  • Involved in gathering business requirements and analysis of business use cases.
  • Installed and configured Hadoop cluster in Development, Testing and Production environments.
  • Performed both major and minor upgrades to the existing CDH cluster.
  • Responsible for monitoring and supporting Development activities.
  • Installation of various Hadoop Ecosystems and Hadoop Daemons.
  • Developed scripts to delete empty Hive tables from the Hadoop file system (see the cleanup sketch after this list).
  • Analyzed the existing Enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop and Pig Latin.
  • Implemented NameNode metadata backup to NFS for High Availability.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
  • Wrote shell scripts for routine day-to-day processes and automated them using crontab (see the cron sketch after this list).
  • Collected log data from web servers and ingested it into HDFS using Flume.
  • Worked on setting up high availability for major production cluster and designed automatic failover.
  • Tuned the Hadoop cluster to achieve higher performance.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data.
  • Created users and maintained user groups and ensured to provide access to the required Hadoop clients.
  • Enabled Kerberos for Hadoop cluster Authentication and integrated it with Active Directory for managing users and application groups.
  • Used Ganglia and Nagios for monitoring the cluster around the clock.
  • Developed multiple jobs to smoke test the services upon upgrades and installation.
  • Developed Hive queries and UDFs to analyze the data in HDFS.
  • Analyzed and transformed data with Hive and Pig.
  • Configured log4j.properties files to define the storage policy of all the service logs.
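
A minimal sketch of the daemon health-check scripting mentioned above; the daemon list and alert address are placeholders, and jps and mail are assumed to be on the PATH.

      #!/bin/bash
      # Alert if any expected Hadoop daemon is not running on this node.
      SERVICES="NameNode DataNode ResourceManager NodeManager"
      ALERT_TO="hadoop-ops@example.com"   # placeholder recipient

      for svc in $SERVICES; do
        if ! jps | grep -qw "$svc"; then
          echo "$(hostname): $svc is not running at $(date)" \
            | mail -s "Hadoop daemon down: $svc" "$ALERT_TO"
        fi
      done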
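
A hedged sketch of the empty-table cleanup referenced above; the database name is a placeholder and the row-count check is a simplification of the real logic.

      #!/bin/bash
      # Drop Hive tables that contain no rows (database name is a placeholder).
      DB=staging

      for tbl in $(hive -S -e "USE ${DB}; SHOW TABLES;"); do
        rows=$(hive -S -e "SELECT COUNT(*) FROM ${DB}.${tbl};")
        if [ "$rows" -eq 0 ]; then
          echo "Dropping empty table ${DB}.${tbl}"
          hive -S -e "DROP TABLE ${DB}.${tbl};"
        fi
      done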
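
Illustrative crontab entries for the routine processes mentioned above; the script paths, log paths and schedules are placeholders.

      # m h dom mon dow  command
      # Roll and compress application logs nightly at 01:30
      30 1 * * *  /opt/scripts/rotate_app_logs.sh >> /var/log/rotate_app_logs.log 2>&1
      # Weekly HDFS housekeeping on Sundays at 03:00
      0 3 * * 0   /opt/scripts/hdfs_cleanup.sh >> /var/log/hdfs_cleanup.log 2>&1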

Environment: Apache Hadoop, Cloudera Distribution, YARN, Spark, Sqoop, DB2, Hive, Nagios, Knox, AWS, bash, Pig, HBase, RHEL, Ganglia, Kerberos, Oozie, Flume, MapReduce.

Linux Administrator

Confidential

Responsibilities:

  • Installing, Configuring and Administering Red Hat Linux and Solaris servers.
  • Performing Installation, Patching and Upgrading of packages and the OS through RPM and YUM server management.
  • Administration of RHEL, including installation, testing, tuning, upgrading and loading patches, and troubleshooting both physical and virtual server issues.
  • Installation, configuration, administration of Solaris 10 on SPARC based servers using Jumpstart.
  • Managed routine system backups, scheduled jobs such as disabling and enabling cron jobs, and enabled system and network logging of servers for maintenance, performance tuning and testing.
  • Installation of Solaris 10 on Sun Fire and Enterprise server using Jumpstart, custom configuration like installing packages, configuring services.
  • Troubleshooting Linux network and security related issues, including packet capture and analysis, using tools such as iptables, firewalls, TCP wrappers and Nmap.
  • Writing shell scripts for Automated Backups and Cron Jobs using bash (see the backup sketch after this list).
  • Monitored server and application performance via various stat commands (vmstat, nfsstat, iostat, etc.) and tuned I/O, memory and related subsystems.
  • Experienced in Troubleshooting critical hardware and software issues and other day-to-day user trouble tickets.
  • Configured and installed Linux Servers on both virtual machine and bare metal Installation.
  • Created physical volumes, volume groups and logical volumes (see the LVM sketch after this list).
  • Experience with various cloud computing products such as Amazon Web Services EC2 and Microsoft Azure.
  • Troubleshot NFS, NIS, Samba, DNS, DHCP, LDAP, MySQL and network problems.
  • Creating and maintaining user accounts, profiles, security, rights, disk space and process monitoring.
  • Maintained user accounts in an NIS environment.
  • Provided support for both physical and virtual environments.
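
A minimal sketch of the automated backup scripting described above; the source directories, destination and retention period are placeholders.

      #!/bin/bash
      # Nightly tar backup of configuration directories to a backup mount,
      # with simple retention. Paths and retention are placeholders.
      SRC_DIRS="/etc /opt/app/conf"   # intentionally unquoted below so it splits
      DEST=/backup/$(hostname)
      RETENTION_DAYS=14

      mkdir -p "$DEST"
      tar -czf "${DEST}/config_$(date +%F).tar.gz" $SRC_DIRS

      # Remove archives older than the retention period
      find "$DEST" -name 'config_*.tar.gz' -mtime +"$RETENTION_DAYS" -delete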
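
The LVM work above follows the standard sequence; device names, sizes and mount points are placeholders.

      # Create a physical volume, volume group and logical volume, then mount it
      pvcreate /dev/sdb1
      vgcreate appvg /dev/sdb1
      lvcreate -L 50G -n applv appvg
      mkfs.ext4 /dev/appvg/applv
      mkdir -p /app
      mount /dev/appvg/applv /app

      # Grow the logical volume and filesystem later if needed
      lvextend -L +20G /dev/appvg/applv
      resize2fs /dev/appvg/applv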

Environment: RHEL Servers, Solaris, NFS, NIS, Samba, bash, DNS, DHCP, LDAP, YUM, RPM, LVM, TCP/IP, RAID.

Linux Administrator

Confidential

Responsibilities:

  • Installation, Maintenance and Administration of Red Hat Enterprise Linux and Solaris.
  • Managing and Supporting User Accounts, including troubleshooting user login and home directory issues, resetting passwords and unlocking user accounts.
  • Performed automated installations of Operating System using kickstart for Red Hat Enterprise Linux 5/6.
  • Remote Monitoring and Management of server hardware.
  • Installing, Updating patches to the servers using Red Hat Satellite server.
  • Involved in Server sizing and identifying, recommending optimal server hardware based on User requirements.
  • Created File systems from local storage as well as NFS partitions for application portability.
  • Worked on Volume Management, Disk Management and software RAID solutions using Veritas Volume Manager and Solaris Volume Manager.
  • File system tuning and growing using the Veritas File System (VxFS); coordinated with the SAN team for storage allocation and dynamic multipathing.
  • Installed and administered Red Hat virtual machines using the KVM hypervisor.
  • RPM and YUM package installation, patching and other server management (see the package-management sketch after this list).
  • Managed routine system backups, scheduled jobs such as disabling and enabling cron jobs, and enabled system and network logging of servers for maintenance, performance tuning and testing.
  • Installed, configured and maintained MySQL and Oracle 10g on Red Hat Linux.
  • Configured multipathing, added SAN storage and created physical volumes, volume groups and logical volumes.
  • Installing and configuring Apache and supporting them on Linux production servers.
  • Troubleshooting Linux network and security related issues, including packet capture, using tools such as iptables, firewalls and TCP wrappers.
  • Worked on resolving production issues, documenting Root Cause Analysis and updating tickets using ServiceNow.
  • Monitored server and application performance via various stat commands (vmstat, nfsstat, iostat, etc.) and tuned I/O, memory and related subsystems.
  • Coordinated with the Storage, Network and Hardware teams on server provisioning (HP ProLiant DL385/585 G5 and G6 servers, and HP BL465c G5 blade centers).
  • Experienced in Troubleshooting critical hardware and software issues and other day-to-day user trouble tickets.
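
Typical package and patch operations behind the RPM/YUM work above; the package name is a placeholder.

      # Check for and apply available updates
      yum check-update
      yum -y update

      # Install and query individual packages
      yum -y install httpd
      rpm -qa | grep -i httpd      # list installed packages matching a name
      rpm -qi httpd                # show package metadata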

Environment: Red Hat Linux Servers, YUM, RPM, RAID, NFS, NIS, Samba, DNS, DHCP, LDAP, LVM.
