Hadoop Administrator Resume

Texas

SUMMARY

  • 8+ years of professional experience, including around 5 years as a Linux Administrator and 3+ years in Big Data analytics as a Hadoop/Big Data Administrator.
  • Experience in architecting, designing, installing, configuring and managing Apache Hadoop clusters on MapR, Hortonworks and Cloudera Hadoop distributions.
  • Experience in configuring and maintaining HA for HDFS, the YARN (Yet Another Resource Negotiator) ResourceManager, MapReduce, Hive, HBase and Kafka.
  • Practical knowledge of the functions of every Hadoop daemon, the interactions between them, resource utilization and dynamic tuning to keep the cluster available and efficient.
  • Experience in managing Hadoop infrastructure like commissioning, decommissioning, log rotation, rack topology implementation.
  • Experience in understanding and managing Hadoop Log Files.
  • Configured ZooKeeper to coordinate the servers in clusters and to maintain data consistency.
  • Understanding of Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science and batch processing, which handle data stored in a single platform under YARN.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
  • Experience in collecting the logs from log collector into HDFS using Flume.
  • Experience in Kafka multi node cluster setup.
  • Experience in setting up and managing the batch scheduler Oozie.
  • Extending Hive functionalities by writing custom UDFs.
  • Experience in integrating AD/LDAP users with Ambari and Ranger.
  • Good experience in implementing Kerberos & Ranger in Hadoop Ecosystem.
  • Experience in configuring policies in Ranger to secure Hadoop services (Hive, HBase, HDFS, etc.).
  • Good understanding of rack awareness in the Hadoop cluster.
  • Experience in using monitoring tools like Cloudera Manager and Ambari.
  • Experienced in installing, configuring and removing services through Ambari.
  • Experienced in Ambari-alerts configuration for various components and managing the alerts.
  • Involved in migration of cluster to AWS.
  • Good understanding of Lambda functions.
  • Actively worked on enabling SSL for Hadoop services in EMR.
  • Analyzed and tuned performance for Spark jobs in EMR, taking into account the type and size of the input processed on specific instance types.
  • Good understanding of data ingestion pipelines.
  • Set up disks for MapR, handled disk failures, configured storage pools and worked with a Logical Volume Manager.
  • Managed Data with Volumes and worked with Snapshots, Mirror Volumes, Data protection and Scheduling in MapR.
  • Experience on UNIX commands and Shell Scripting.
  • Excellent interpersonal, communication, documentation and presentation skills.

TECHNICAL SKILLS

Hadoop /Big Data Technologies: HDFS, Map Reduce, YARN, HBase, Hive, Tez, Sqoop, Flume, Zookeeper, Spark, Storm, MongoDB, Pig, Hue, Ranger, Impala, Kafka, Oozie and Kerberos

Programming Languages: Shell Scripting, Java, Python

Monitoring Tools: Cloudera Manager, Ambari, Nagios, Ganglia

Databases: SQL Server, MySQL, Cassandra

Web Technologies: HTML, XML, JSON, JavaScript

Operating Systems: Linux, Unix, Windows, Mac, CentOS

ETL Tools: Informatica Power Center 10.1/9.6.1/9.1/8.x, Power Exchange/Power Connect, Data Analyst, Metadata Manager, IDQ, Informatica MDM HUB 9.6.1/9.7, Business Glossary, B2B DT (Data Transformation), DX, MFT.

Other Concepts: OOPS, Data Structures, Algorithms, Software Engineering

Protocols: TCP/IP, FTP, SSH, Telnet, SCP, RSH, ARP and RARP

Configuration Tools: Puppet, IBM-TEM tool

PROFESSIONAL EXPERIENCE

Confidential, Texas

Hadoop Administrator

Responsibilities:

  • Performed cluster maintenance, monitoring and troubleshooting, and managed and reviewed data backups and log files, using Hortonworks and MapR.
  • Implemented and configured High Availability Hadoop Cluster using Hortonworks Distribution and MapR.
  • Experience working on Hadoop components like HDFS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Ambari Infra, Ambari Metrics, Kafka.
  • Experience in configuring ZooKeeper to coordinate the servers in clusters and maintain data consistency.
  • Experience in using Flume to stream data into HDFS from various sources.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
  • Deployed Network file system for Name Node Metadata backup.
  • Transferred data from HDFS to MySQL databases and vice versa using Sqoop.
  • Backed up data from the active cluster to a backup cluster using distcp.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Implemented an instance of ZooKeeper for Kafka brokers.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Worked in Kerberos, Active Directory/LDAP, Unix based File System.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Commissioned and decommissioned DataNodes, killed unresponsive TaskTrackers and dealt with blacklisted TaskTrackers.
  • Tuned performance for slow YARN jobs, slow Tez jobs and slow data loads.
  • Managed alerts on the Ambari page and took corrective and preventive actions.
  • Managed HDFS disk space and generated HDFS disk utilization reports for capacity planning.
  • Managed user access by setting up new users and assigning them name quotas and space quotas.

Environment: HDFS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Ambari Infra, Ambari Metrics, Kafka, Ranger, Kerberos

Confidential, New York

Hadoop Administrator

Responsibilities:

  • Installed and configured Hadoop monitoring and administration tools like Cloudera Manager, Nagios and Ganglia.
  • Participated in setting up Rack topology in the cluster.
  • Implemented and configured a High Availability Hadoop cluster (quorum-based) using the Cloudera distribution of Hadoop.
  • Backed up data from the active cluster to a backup cluster using distcp.
  • Periodically reviewed Hadoop-related logs, fixing errors and preventing new ones by analyzing the warnings.
  • Implemented multiple high-performance MongoDB replica sets on EC2 with robust reliability.
  • Decommissioned nodes that were malfunctioning or due for maintenance, and commissioned new nodes.
  • Hands-on experience working on Hadoop ecosystem components like YARN, Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig and Flume.
  • Experience in configuring ZooKeeper to coordinate the servers in clusters and maintain data consistency.
  • Experience in using Flume to stream data into HDFS from various sources.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
  • Monitored services through ZooKeeper.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Worked on analyzing data with Hive and Pig.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed Network file system for Name Node Metadata backup.
  • Performed cluster backups using DistCp, Cloudera Manager BDR and parallel ingestion.
  • Designed and implemented scalable, secure cloud architecture based on Amazon Web Services.
  • Leveraged AWS cloud services such as EC2, auto-scaling and VPC (Virtual Private Cloud) to build secure, highly scalable and flexible systems that handled expected and unexpected load bursts and could evolve quickly during development iterations.

Environment: Hadoop Quorum Based, Oozie, Hive, Pig, Sqoop, MapReduce, HDFS, Cloudera, ZooKeeper, Nagios, Ganglia, Metadata, Flume, Yarn, Amazon Web Services, EC2, Hortonworks.

Confidential, Columbus, Ohio

Hadoop Administrator

Responsibilities:

  • Installed and configured Hadoop monitoring and administration tools like Cloudera Manager, Nagios and Ganglia.
  • Performed cluster maintenance, monitoring and troubleshooting, and managed and reviewed data backups and log files, using Hortonworks and MapR.
  • Implemented and configured High Availability Hadoop Cluster using Hortonworks Distribution and MapR.
  • Experience working on Hadoop components like HDFS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Ambari Infra, Ambari Metrics, Kafka.
  • Experience in configuring ZooKeeper to coordinate the servers in clusters and maintain data consistency.
  • Experience in using Flume to stream data into HDFS from various sources.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
  • Deployed Network file system for Name Node Metadata backup.
  • Transferred data from HDFS to MySQL databases and vice versa using Sqoop.
  • Backed up data from the active cluster to a backup cluster using distcp.
  • Periodically reviewed Hadoop-related logs, fixing errors and preventing new ones by analyzing the warnings.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Implemented an instance of ZooKeeper for Kafka brokers.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Worked in Kerberos, Active Directory/LDAP, Unix based File System.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
  • Commissioned and decommissioned DataNodes, killed unresponsive TaskTrackers and dealt with blacklisted TaskTrackers.
  • Tuned performance for slow YARN jobs, slow Tez jobs and slow data loads.
  • Managed alerts on the Ambari page and took corrective and preventive actions.
  • Managed HDFS disk space and generated HDFS disk utilization reports for capacity planning.
  • Managed user access by setting up new users and assigning them name quotas and space quotas.

Environment: HDFS, YARN, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Storm, Flume, Ambari Infra, Ambari Metrics, Kafka, Ranger, Kerberos

Confidential, Irvine CA

Hadoop/Bigdata Administrator

Responsibilities:

  • Handle the installation and configuration of a Hadoop cluster using Hortonworks Distribution.
  • Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
  • Monitor the data streaming between web sources and HDFS.
  • Worked with Kerberos and its integration with Hadoop and LDAP.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Provided input to development teams on efficient use of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
  • Worked in Kerberos, Active Directory/LDAP, Unix based File System.
  • Managed data in Amazon S3, implemented s3cmd to move data from clusters to S3.
  • Experience in Continuous Integration and expertise in Jenkins and Hudson tools.
  • Changed cluster configuration properties based on the volume of data being processed and the performance of the cluster.
  • Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
  • Experience in architecting, designing, installation, configuration and management of Apache Hadoop, Hortonworks Distribution.
  • Worked with Unix commands and shell scripting.
  • Core competencies in Java, HTTP, XML and JSON.
  • Worked on Spark, a fast, general-purpose cluster computing system.
  • Worked on Storm, a distributed real-time computation system that provides a set of general primitives.
  • Commissioned and decommissioned DataNodes from the cluster in case of problems.
  • Handled Massively Parallel Processing (MPP) databases such as Microsoft PDW.
  • Experience with GitHub, a web-based Git repository hosting service that offers all the distributed revision control and source code management (SCM) functionality of Git and adds its own features.
  • Experience in Hortonworks Distribution Platform (HDP) cluster installation and configuration.
  • Worked in statistics collection and table maintenance on MPP platforms.
  • Used Cloudera to analyze data stored in HDFS.
  • Worked on large sets of structured, semi-structured and unstructured data.
  • Used Sqoop to import and export data from HDFS to RDBMS and vice versa.
  • Involved in creating Hive tables, loading them with data and writing Hive queries, which run internally as MapReduce jobs.

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Hortonworks, Flume, HBase, ZooKeeper, CDH3, MongoDB, Oracle, NoSQL and Unix/Linux.

Confidential

Linux Administrator

Responsibilities:

  • Installing and upgrading OE & Red Hat Linux and Solaris 8/ & SPARC on servers like HP DL 380 G3, 4 and 5 & Dell PowerEdge servers.
  • Experience with LDOMs, creating sparse root and whole root zones, administering the zones for Web, Application and Database servers, and working with SMF on Solaris 10.
  • Experience working with HP LVM and Red Hat LVM.
  • Experience in implementing P2P and P2V migrations.
  • Involved in installing and configuring CentOS & SUSE 11 & 12 servers on HP x86 servers.
  • Implemented HA using Red Hat Cluster and VERITAS Cluster Server 5.0 for the WebLogic agent.
  • Managing DNS, NIS servers and troubleshooting the servers.
  • Troubleshooting application issues on Apache web servers and database servers running on Linux and Solaris.
  • Experience in migrating Oracle and MySQL data using Double-Take products.
  • Used Sun Volume Manager for Solaris and LVM on Linux & Solaris to create volumes with layouts like RAID 1, 5, 10, 51.
  • Performed performance analysis using tools like prstat, mpstat, iostat, sar, vmstat, truss and DTrace.
  • Experience working on LDAP user accounts and configuring LDAP on client machines.
  • Upgraded ClearCase from 4.2 to 6.x running on Linux (CentOS & Red Hat).
  • Worked on patch management tools like Sun Update Manager.
  • Experience supporting middleware servers running Apache, Tomcat and Java applications.
  • Worked on day-to-day administration tasks and resolved tickets using Remedy.
  • Used HP Service Center and the change management system for ticketing.
  • Worked on the administration of WebLogic 9 and JBoss 4.2.2 servers, including installation and deployments.
  • Worked on F5 load balancers to load balance and reverse proxy WebLogic servers.

Environment: Solaris 8/9/10, Veritas Volume Manager, web servers, LDAP directory, Active Directory, BEA WebLogic servers, SAN Switches, Apache, Tomcat servers, WebSphere application server.

Confidential

Linux/Systems Administrator

Responsibilities:

  • Installing, configuring and updating Solaris 7, 8, Red Hat 7.x, 8, 9 and Windows NT/2000 systems using media, JumpStart and Kickstart.
  • Installing and configuring Windows 2000 Active Directory servers and Citrix servers.
  • Published and administered applications via Citrix MetaFrame.
  • Creating system disk partitions, mirroring the root disk drive and configuring device groups in UNIX and Linux environments.
  • Working with VERITAS Volume Manager 3.5 and Logical Volume Manager for file system management, data backup and recovery.
  • Implementing a backup solution using a Dell T120 autoloader and CA Arc Server 7.0.
  • Installed and configured SSH Gate for remote and secured connections.
  • Configuration of DHCP, DNS, NFS and automounter.
  • Creating, troubleshooting and mounting NFS File systems on different OS platforms.
  • Installing, configuring and troubleshooting various software such as Windd, Citrix - Clarify, Rave, VPN, SSH Gate, Visio 2000, Star Application, Lotus Notes, mail clients, Business Objects, Oracle and Microsoft Project.
  • Working 24/7 on call for application and system support.
  • Experience in working with and supporting the SIBES database running on Linux servers.

Environment: HP ProLiant servers, SUN Servers (6500, 4500, 420, Ultra 2 Servers), Solaris 7/8, Veritas NetBackup, Veritas Volume Manager, Samba, NFS, NIS, LVM, Linux, Shell Programming.
