
Hadoop Admin Resume


NJ

SUMMARY:

  • Overall 7 years of professional IT experience, including around 2+ years of hands-on experience in Hadoop Administration using Cloudera (CDH) and Hortonworks (HDP) distributions on large distributed clusters.
  • Hands-on experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, ZOOKEEPER, HBASE) using Cloudera Manager and Hortonworks Ambari.
  • Hands-on experience in Big Data technologies/frameworks like Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, NoSQL, Flume, Oozie.
  • Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
  • Proficiency with application servers like WebSphere, WebLogic, JBoss and Tomcat.
  • Performed administrative tasks on Hadoop Clusters using Cloudera/Hortonworks.
  • Hands-on experience with Hadoop clusters using Hortonworks (HDP), Cloudera (CDH3, CDH4), Oracle Big Data and YARN distribution platforms.
  • Experience in designing, configuring and managing backup and disaster recovery for Hadoop data (a minimal backup sketch follows this list).
  • Experience in administering Tableau and Greenplum database instances in various environments.
  • Experience in administration of Kafka and Flume streaming using Cloudera Distribution.
  • Good experience in creating various database objects like tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
  • Responsible for configuring, managing and administering VPCs, EC2, RDS, CloudFront, CloudWatch, S3 and ELB, and providing application deployment support with Chef on AWS Cloud.
  • Implemented an OpenVPN solution to connect remote users to the AWS VPC and on-premise DC, and was responsible for administering and maintaining it.
  • Hands on experience on configuring a Hadoop cluster in a professional environment and on Amazon Web Services (AWS) using an EC2 instance.
  • Experience in managing the Hadoop MapR infrastructure with MCS.
  • Good understanding of deploying Hadoop clusters using automated Puppet scripts.
  • Worked on NoSQL databases including HBase, Cassandra and MongoDB.
  • Designing and implementing security for Hadoop cluster with Kerberos secure authentication.
  • Hands-on experience with Nagios and Ganglia tools for cluster monitoring.
  • Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring and Fine-tuning on Linux (RHEL) systems.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
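The backup and disaster recovery bullet above can be illustrated with a minimal sketch. The production NameNode, DR cluster and warehouse paths below are hypothetical placeholders, not values from this resume; the script simply wraps the standard hadoop distcp tool from Python:

    #!/usr/bin/env python
    # Minimal HDFS backup sketch: mirrors a source directory to a DR cluster
    # with hadoop distcp. Cluster addresses and paths are illustrative only.
    import subprocess
    import sys
    from datetime import datetime

    SOURCE = "hdfs://prod-nn:8020/data/warehouse"    # hypothetical source path
    TARGET = "hdfs://dr-nn:8020/backups/warehouse/" + datetime.now().strftime("%Y%m%d")

    def run_backup():
        # -update copies only changed files; -p preserves permissions and timestamps
        return subprocess.call(["hadoop", "distcp", "-update", "-p", SOURCE, TARGET])

    if __name__ == "__main__":
        sys.exit(run_backup())

A dated target directory keeps each run as an independent snapshot; in practice the schedule would typically be driven by cron or Oozie.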

WORK EXPERIENCE:

Confidential, NJ

Hadoop Admin

  • Successfully upgraded Hortonworks Hadoop distribution stack from 2.3.4 to 2.5.
  • Tuned the configuration of the cluster to meet the needs of analysis, whether I/O bound or CPU bound.
  • Created stored procedures in MySQL Server to perform result-oriented tasks
  • Integrated Apache Storm with Kafka to perform web analytics. Uploaded click stream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Building and maintaining scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Hadoop installation and configuration of multiple nodes using the Cloudera platform.
  • Worked on installing and configuring of CDH 5.8, 5.9 and 5.10 Hadoop Cluster on AWS using Cloudera Director, Cloudera Manager.
  • Installed and Configured Hadoop monitoring and administrating tools like Cloudera Manager, Nagios and Ganglia.
  • Responsible for cluster MapReduce maintenance tasks, including commissioning and decommissioning task trackers and MapReduce jobs.
  • Good understanding and related experience with Hadoop stack - internals, Hive, Pig and MapReduce, involved in defining job flows.
  • Upgraded Cloudera Manager from 5.8 to the latest version and the CDH stack from version 5.8 to the latest stack.
  • Built out a whole data center, which involved racking and stacking bare metal and setting up servers as well as racks.
  • Performed Ambari and HDP stack upgrades on Hadoop clusters.
  • Configured Stackdriver to monitor all projects and resources to ensure 99.99% uptime in GCP.
  • Created a Looker cluster in GCP and monitored it using Stackdriver.
  • Installed and configured Hadoop MapReduce, HDFS, developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
  • Helped the team to increase cluster size. The configuration for additional data nodes was managed using Puppet manifests.
  • Architected and implemented automated server provisioning using Puppet.
  • Experience with managing and monitoring large-scale MongoDB databases.
  • Experience with MongoDB upgrades from 3.0 to 3.2 and 3.2 to 3.4.
  • Experience managing MongoDB on the Linux platform.
  • Installed and configured a Hortonworks HDP 2.2 using Ambari and manually through command line. Cluster maintenance as well as creation and removal of nodes using tools like Ambari, Cloudera Manager Enterprise and other tools.
  • Handling the installation and configuration of a Hadoop cluster.
  • Designed and developed Datastage ETL Parallel jobs, Sequences, Datastage Routines and Containers.
  • As an ETL tester, responsible for understanding the business requirements, creating test data and designing test cases.
  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS (see the PySpark sketch after this list).
  • Successfully generated consumer group lags from Kafka using its API; Kafka was used for building real-time data pipelines between clusters (a lag-check sketch also follows this list).
  • Designed and implemented topics in the new Kafka cluster in all environments.
  • Successfully secured the Kafka cluster with Kerberos. Implemented Kafka security features using SSL, both with and without Kerberos; for more fine-grained security, set up Kerberos users and groups to enable more advanced security features.
  • Involved in developer activities of installing and configuring Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Contributed to building hands-on tutorials for the community to learn how to set up Hortonworks Data Platform (powered by Hadoop) and Hortonworks DataFlow (powered by NiFi).
  • Used Apache NiFi for ingestion of data from IBM MQ (message queues).
  • Implemented NiFi flow topologies to perform cleansing operations before moving data into HDFS.
  • Started using Apache NiFi to copy the data from the local file system to HDP.
  • Worked with NiFi for managing the flow of data from source to HDFS.
  • Experience in job workflow scheduling with tools like NiFi.
  • Ingested data into HDFS using NiFi with different processors and developed custom input adaptors.
  • Created a POC on Hortonworks and suggested best practices in terms of the HDP and HDF platforms and NiFi.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Involved in cluster-level security: perimeter security (authentication - Cloudera Manager, Active Directory and Kerberos), access (authorization and permissions - Sentry), visibility (audit and lineage - Navigator) and data (encryption at rest).
  • Handling the data exchange between HDFS and different web sources using Flume and Sqoop.
  • Monitored data streaming between web sources and HDFS through monitoring tools, with close monitoring and analysis of MapReduce job executions on the cluster at the task level.
  • Provided inputs to development regarding efficient utilization of resources like memory and CPU based on the running statistics of Map and Reduce tasks.
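The Spark Streaming bullet above can be sketched with PySpark Structured Streaming. This is a minimal illustration, not the project's actual code: the broker addresses, topic name and HDFS paths are assumptions, and it requires the spark-sql-kafka connector package on the Spark classpath:

    # Minimal Kafka -> HDFS streaming sketch (PySpark Structured Streaming).
    # Broker list, topic and paths below are placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("kafka-clickstream-to-hdfs")
             .getOrCreate())

    # Read the clickstream topic as a streaming DataFrame of raw messages
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
              .option("subscribe", "clickstream")
              .option("startingOffsets", "latest")
              .load()
              .selectExpr("CAST(value AS STRING) AS event"))

    # Persist raw events to HDFS; the checkpoint directory enables recovery
    query = (events.writeStream
             .format("parquet")
             .option("path", "hdfs:///data/clickstream/raw")
             .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
             .start())

    query.awaitTermination()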
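Likewise, the consumer group lag bullet can be sketched by wrapping the stock kafka-consumer-groups.sh CLI from Python. The bootstrap server and group name are placeholders, and since the column layout of the CLI output varies by Kafka version, the LAG column is located by name:

    # Minimal consumer-group lag check: sums the LAG column reported by the
    # standard kafka-consumer-groups.sh tool. Host and group are placeholders.
    import subprocess

    BOOTSTRAP = "broker1:9092"
    GROUP = "clickstream-consumers"

    def total_lag(bootstrap, group):
        out = subprocess.check_output([
            "kafka-consumer-groups.sh",
            "--bootstrap-server", bootstrap,
            "--describe", "--group", group,
        ]).decode()
        lines = [l for l in out.splitlines() if l.strip()]
        # Skip any notice lines and find the header row containing LAG
        header_idx = next(i for i, l in enumerate(lines) if "LAG" in l.split())
        lag_idx = lines[header_idx].split().index("LAG")
        lag = 0
        for line in lines[header_idx + 1:]:
            cols = line.split()
            if len(cols) > lag_idx and cols[lag_idx].isdigit():
                lag += int(cols[lag_idx])
        return lag

    if __name__ == "__main__":
        print("total lag for %s: %d" % (GROUP, total_lag(BOOTSTRAP, GROUP)))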

Confidential

St. Louis, Missouri

Hadoop Admin

  • Created stored procedures in MySQL Server to perform result-oriented tasks
  • Debugged and modified PL/SQL packages, procedures, and functions for resolving production issues daily, along with writing PL/SQL code from scratch for new requirements.
  • Installed and configured Hadoop, MapReduce, and HDFS.
  • Developed multiple MapReduce jobs using Java API for data cleaning and preprocessing.
  • Importing and exporting data into HDFS and Hive from an Oracle 11g database using Sqoop (a minimal import sketch follows this list).
  • Responsible for managing data coming from different sources.
  • Monitoring the running MapReduce programs on the cluster
  • Working on 4 Hadoop clusters for different teams, supporting 50+ users of the Hadoop platform, providing training to users to make Hadoop usage simple and keeping them updated on best practices.
  • Worked on ETL tool Informatica, Oracle Database and PL/SQL, Python and Shell Scripts.
  • Experience with ETL working with Hive and Map-Reduce.
  • Involved in database design, creating Tables, Views, Stored Procedures, Functions, Triggers and Indexes. Strong experience in Data Warehousing and ETL using Datastage.
  • Implementing Hadoop Security on Hortonworks Cluster using Kerberos and Two-way SSL
  • Experience with Hortonworks, Cloudera CDH4 and CDH5 distributions
  • Involved in implementing security on the Hortonworks Hadoop Cluster using Kerberos by working along with the operations team to move the non-secured cluster to a secured cluster.
  • Installed a Kerberos-secured Kafka cluster with no encryption on Dev and Prod, and also set up Kafka ACLs on it.
  • Successfully set up a no-authentication Kafka listener in parallel with the Kerberos (SASL) listener, and tested non-authenticated (anonymous) users in parallel with Kerberos users.
  • Installed and configured Confluent Kafka in R&D line. Validated the installation with HDFS connector and Hive connectors.
  • Contributed to building hands-on tutorials for the community to learn how to use Hortonworks Data Platform (powered by Hadoop) and Hortonworks Dataflow (powered by NiFi) covering categories such as Hello World, Real-World use cases, Operations.
  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Managed 350+ Nodes CDH cluster with 4 petabytes of data using Cloudera Manager and Linux RedHat 6.5.
  • Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
  • Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
  • Implemented Azure APIM modules for public-facing subscription-based authentication; implemented Circuit Breaker for system fatal errors.
  • Experience in creating and configuring Azure Virtual Networks (Vnets), subnets, DHCP address blocks, DNS settings, Security policies and routing.
  • Created Web App Services and deployed Asp.Net applications through Microsoft Azure Web App services.
  • Created Linux Virtual Machines using VMware Virtual Center.
  • Responsible for software installation, configuration, software upgrades, backup and recovery, commissioning and decommissioning data nodes, cluster setup, cluster performance and monitoring on a daily basis, and keeping the cluster healthy on different Hadoop distributions (Hortonworks & Cloudera).
  • Worked with application teams to install operating system, updates, patches, version upgrades as required.
  • Created various database objects like Tables, Views, Materialized Views, Triggers, Synonyms and Database Links as per business requirements.
  • Built a web interface using Python, HTML and SQL Server which gives the approximate number of items from vendors based on previous sales.
  • Documented software defects regarding program functionality and suggested actionable improvements to correct deficiencies.
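The Sqoop import bullet above can be illustrated with a minimal sketch driven from Python. The JDBC URL, credentials file and table names below are placeholders, not values from this resume:

    # Minimal Sqoop import sketch: pulls an Oracle 11g table into a Hive table.
    # Connection string, password file and table names are illustrative only.
    import subprocess
    import sys

    def sqoop_import(oracle_table, hive_table):
        cmd = [
            "sqoop", "import",
            "--connect", "jdbc:oracle:thin:@oradb01:1521:ORCL",
            "--username", "etl_user",
            "--password-file", "hdfs:///user/etl/.oracle.password",
            "--table", oracle_table,
            "--hive-import",
            "--hive-table", hive_table,
            "--num-mappers", "4",
        ]
        return subprocess.call(cmd)

    if __name__ == "__main__":
        sys.exit(sqoop_import("SALES.ORDERS", "staging.orders"))

Using --password-file rather than an inline password keeps credentials out of shell history and process listings.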

Confidential, PALO ALTO, CA

Hadoop Admin/ Linux Administrator

  • Installation and configuration of Linux for new build environment.
  • Day-to-day user access and permissions; installing and maintaining Linux servers.
  • Created volume groups, logical volumes and partitions on the Linux servers, and mounted file systems.
  • Experienced in installation and configuration of Cloudera CDH4 in the testing environment.
  • Resolved tickets submitted by users and P1 issues, troubleshooting and resolving the errors.
  • Validated web services manually and through groovy script automation using SOAP UI.
  • Implementing End to End automation tests by consuming the APIs of different layers.
  • Involved in using Postman tool to test SOA based architecture for testing SOAP services and REST API.
  • Used Maven to build and run the Selenium automation framework.
  • Used the framework to send the automation reports over email.
  • Balancing HDFS manually to decrease network utilization and increase job performance (a minimal balancer sketch follows this list).
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Performed major and minor upgrades to the Hadoop cluster.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Used Sqoop to import and export data between HDFS and RDBMS.
  • Performed stress and performance testing and benchmarking for the cluster.
  • Commissioned and decommissioned DataNodes in the cluster in case of problems.
  • Debugged and resolved major issues with Cloudera Manager by interacting with the Cloudera team.
  • Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers; remote installation of Linux using PXE boot.
  • Monitoring the System activity, Performance, Resource utilization.
  • Develop and optimize physical design of MySQL database systems.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Responsible for maintenance of RAID groups and LUN assignments as per the agreed design.
  • Involved in estimating and setting up a Hadoop cluster on Linux.
  • Prepared PIG scripts to validate Time Series Rollup Algorithm.
  • Responsible for support and troubleshooting of MapReduce jobs and Pig jobs, and maintaining incremental loads on a daily, weekly and monthly basis.
  • Implemented Oozie workflows for Map Reduce, Hive and Sqoop actions.
  • Channeled MapReduce outputs based on requirements using Partitioners.
  • Performed scheduled backup and necessary restoration.
  • Built and maintained scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
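The manual HDFS balancing bullet above can be illustrated with a minimal wrapper around the standard hdfs balancer command; the 10% threshold is an assumed value to tune per cluster:

    # Minimal HDFS balancer sketch: moves blocks until no DataNode's utilization
    # differs from the cluster average by more than the threshold percentage.
    import subprocess
    import sys

    THRESHOLD_PERCENT = "10"   # assumed value; lower means a more evenly balanced cluster

    def balance():
        # The balancer runs until the cluster is balanced or it is interrupted.
        return subprocess.call(["hdfs", "balancer", "-threshold", THRESHOLD_PERCENT])

    if __name__ == "__main__":
        sys.exit(balance())

Running it during off-peak hours limits the extra network utilization the block moves create.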

Confidential

Linux/Unix Administrator

  • Experience installing, upgrading and configuring RedHat Linux 4.x, 5.x, 6.x using Kickstart Servers and Interactive Installation
  • Responsible for creating and managing user accounts, security, rights, disk space and process monitoring in Solaris, CentOS and Redhat Linux.
  • Experience in writing Scripts in Bash for performing automation of various tasks.
  • Experience in writing Shell scripts using bash for process automation of databases, applications, backup and scheduling to reduce both human intervention and man hours.
  • Remote system administration via tools like SSH and Telnet
  • Extensive use of crontab for job automation.
  • Installed & Configured Selenium Web Driver, Test-NG, Maven tool and created Selenium automation scripts in java using Test-NG prior to next quarter release.
  • Developed Python scripts (automation scripts) for stability testing (a minimal sketch follows this list).
  • Experience administering, installing, configuring and maintaining Linux systems.
  • Created Linux Virtual Machines using VMware Virtual Center; administered VMware Infrastructure Client 3.5 and vSphere 4.1.
  • Installed firmware upgrades and kernel patches, and performed systems configuration and performance tuning on Unix/Linux systems.
  • Installing Red Hat Linux 5/6 using kickstart servers and interactive installation.
  • Supporting infrastructure environment comprising of RHEL and Solaris.
  • Installation, Configuration, and OS upgrades on RHEL 5.X/6.X/7.X, SUSE 11.X, 12.X.
  • Implemented and administered VMware ESX 4.x 5.x and 6 for running the Windows, Centos, SUSE and Red Hat Linux Servers on development and test servers.
  • Created, extended, reduced and administered Logical Volume Manager (LVM) volumes in the RHEL environment.
  • Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
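The Python stability-testing bullet above can be illustrated with a minimal sketch; the health-check URL, probe count and interval are illustrative assumptions:

    # Minimal stability-test sketch: repeatedly probes a service endpoint and
    # reports how many probes succeeded. URL and counts are placeholders.
    import time
    import urllib.request

    URL = "http://app01:8080/health"   # hypothetical health endpoint
    PROBES = 60
    INTERVAL_SECONDS = 5

    def run_probes():
        ok = 0
        for _ in range(PROBES):
            try:
                with urllib.request.urlopen(URL, timeout=3) as resp:
                    if resp.status == 200:
                        ok += 1
            except Exception:
                pass                   # a failed probe simply is not counted
            time.sleep(INTERVAL_SECONDS)
        print("%d/%d probes succeeded" % (ok, PROBES))

    if __name__ == "__main__":
        run_probes()

Scheduled via crontab, a script like this can log basic availability over a soak period.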
