Hadoop Admin Resume
Austin, TX
SUMMARY:
- 7 years of professional IT experience, including 2+ years of hands-on experience in Hadoop administration using Cloudera (CDH) and Hortonworks (HDP) distributions on large distributed clusters.
- Hands-on experience installing, configuring and using Hadoop ecosystem components such as HDFS, MapReduce, YARN, ZooKeeper, Sentry, Sqoop, Flume, Hive, HBase, Pig and Oozie.
- Site Reliability Engineering responsibilities for a Kafka platform that scales to 2 GB/sec and 20 million messages/sec.
- Significant experience and demonstrated proficiency in all aspects of database programming in Oracle SQL and PL/SQL and related technologies that encapsulate SQL, including Cursors, Ref Cursors, Procedures, Functions and Packages, Oracle Supplied Packages, Collections, Partitioned Tables, Triggers and Table Indexing.
- Experienced in tuning SQL and PL/SQL code for better performance with large volumes of data.
- Good knowledge in creating DDL, DML and transaction queries in Oracle and SQL Server Databases.
- Rich experience working with the DataStage ETL tool for the extraction, transformation and loading of data among different databases.
- Hands-on experience configuring Hadoop clusters in a professional environment and on Amazon Web Services (AWS) using EC2 instances.
- Good working experience with Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Experience importing and exporting data between relational databases such as MySQL and Oracle and HDFS using Sqoop.
- Good working experience with Hadoop architecture, HDFS, MapReduce and other components in the Cloudera Hadoop ecosystem.
- Experience in writing scripts for Automation.
- Experience in benchmarking, backup and disaster recovery of NameNode metadata.
- Experience in performing minor and major upgrades of Hadoop clusters (Hortonworks Data Platform 2.0 to 2.1).
- Experience with multiple Hadoop distributions such as Apache, Cloudera and Hortonworks.
- Experience in securing Hadoop clusters using Kerberos and Sentry.
- Experience with distributed computation tools such as Apache Spark and Hadoop.
- Experience as Deployment Engineer and System Administrator on Linux (CentOS, Ubuntu, Red Hat).
- Experience working with Deployment tools such as Puppet/Ansible.
- Well versed in installing, configuring and tuning Hadoop distributions: Cloudera, Hortonworks on Linux systems.
- Experience with Red Hat Package Manager (RPM) packaging and RPM deployments.
- Experience with Nagios and writing plugins for Nagios to monitor Hadoop clusters.
- Experience in supporting users in debugging their job failures.
PROFESSIONAL EXPERIENCE:
Confidential, Austin, TX
Hadoop Admin
- Created stored procedures in MySQL Server to perform result-oriented tasks
- Hadoop installation, Configuration of multiple nodes using Cloudera platform.
- Installed and configured CDH 5.8, 5.9 and 5.10 Hadoop clusters on AWS using Cloudera Director and Cloudera Manager.
- Installed and configured Hadoop monitoring and administration tools such as Cloudera Manager, Nagios and Ganglia.
- Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
- Helped the team to increase cluster size. The configuration for additional data nodes was managed using Puppet manifests.
- Architected and implemented automated server provisioning using Puppet.
- Experience managing and monitoring large-scale MongoDB databases.
- Experience with MongoDB upgrades from 3.0 to 3.2 and 3.2 to 3.4.
- Experience managing MongoDB on the Linux platform.
- Installed and configured Hortonworks HDP 2.2 using Ambari and manually through the command line. Performed cluster maintenance, including adding and removing nodes, using Ambari, Cloudera Manager Enterprise and other tools.
- Handling the installation and configuration of a Hadoop cluster.
- Designed and developed DataStage ETL Parallel jobs, Sequences, DataStage Routines and Containers.
- Translated high level requirements into ETL process.
- Tuned the developed ETL jobs for better performance.
- Performed dimensional data modeling to support data warehouse design and ETL development activities.
- As an ETL tester, was responsible for understanding business requirements, creating test data and designing test cases.
- Building and maintaining scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS.
- Generated consumer group lag metrics from Kafka using its API. Used Kafka to build real-time data pipelines between clusters.
- Designed and configured topics in the new Kafka cluster in all environments.
- Secured the Kafka cluster with Kerberos. Implemented Kafka security features using SSL without Kerberos. For finer-grained security, set up Kerberos with users and groups, which enables more advanced security features.
- Involved in installing and configuring Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Contributed to building hands-on tutorials for the community to learn how to set up Hortonworks Data Platform (powered by Hadoop) and Hortonworks DataFlow (powered by NiFi).
- Used Apache NiFi to ingest data from IBM MQ (message queues).
- Implemented NiFi flow topologies to perform cleansing operations before moving data into HDFS.
- Used Apache NiFi to copy data from the local file system to HDP.
- Worked with NiFi to manage the flow of data from source to HDFS.
- Experience in job workflow scheduling and scheduling tools such as NiFi.
- Ingested data into HDFS using NiFi with different processors; developed custom Input Adaptors.
- Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms and NiFi.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Involved in cluster-level security: perimeter security (authentication: Cloudera Manager, Active Directory and Kerberos), access (authorization and permissions: Sentry), visibility (audit and lineage: Navigator) and data (encryption at rest).
- Handling the data exchange between HDFS and different web sources using Flume and Sqoop.
- Monitored data streaming between web sources and HDFS, and its functioning, using monitoring tools.
- Close monitoring and analysis of the MapReduce job executions on cluster at task level.
- Provided input to development teams on the efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
- Installed the OS and administered the Hadoop stack with the CDH5 (with YARN) Cloudera distribution, including configuration management, monitoring, debugging and performance tuning. Scripted Hadoop package installation and configuration to support fully automated deployments.
- Day-to-day operational support of our Cloudera Hadoop clusters in lab and production, at multi-petabyte scale.
- Good troubleshooting skills across all Hadoop stack components, ETL services, and Hue and RStudio, which provide GUIs for developers and business users for day-to-day activities.
- Created queues and allocated cluster resources to prioritize jobs in Hive.
- Implemented SFTP and SCP for projects to transfer data from external servers to servers.
- Experienced in managing and reviewing log files.
- Involved in scheduling the Oozie workflow engine to run multiple Hive, Sqoop and Pig jobs.
Confidential
Hadoop Admin
- Created stored procedures in MySQL Server to perform result-oriented tasks
- Debugged and modified PL/SQL packages, procedures, and functions for resolving production issues daily, along with writing PL/SQL code from scratch for new requirements.
- Worked on 4 Hadoop clusters for different teams, supporting 50+ users of the Hadoop platform, training users to simplify Hadoop usage and keeping them updated on best practices.
- Worked with the Informatica ETL tool, Oracle Database, PL/SQL, Python and shell scripts.
- Experience with ETL using Hive and MapReduce.
- Involved in database design, creating Tables, Views, Stored Procedures, Functions, Triggers and Indexes. Strong experience in data warehousing and ETL using DataStage.
- Implementing Hadoop Security on Hortonworks Cluster using Kerberos and Two-way SSL
- Experience with Hortonworks, Cloudera CDH4 and CDH5 distributions
- Involved in implementing security on a Hortonworks Hadoop cluster using Kerberos, working with the operations team to move a non-secured cluster to a secured cluster.
- Installed a Kerberos-secured Kafka cluster with no encryption on Dev and Prod, and set up Kafka ACLs on it.
- Set up an unauthenticated Kafka listener in parallel with the Kerberos (SASL) listener, and tested an unauthenticated (anonymous) user in parallel with a Kerberos user.
- Installed and configured Confluent Kafka in the R&D line. Validated the installation with the HDFS and Hive connectors.
- Contributed to building hands-on tutorials for the community to learn how to use Hortonworks Data Platform (powered by Hadoop) and Hortonworks Dataflow (powered by NiFi) covering categories such as Hello World, Real-World use cases, Operations.
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Managed a 350+ node CDH cluster with 4 petabytes of data using Cloudera Manager on Red Hat Linux 6.5.
- Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
- Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
- Implemented Azure APIM modules for public-facing, subscription-based authentication; implemented a circuit breaker for fatal system errors.
- Experience in creating and configuring Azure Virtual Networks (VNets), subnets, DHCP address blocks, DNS settings, security policies and routing.
- Created Web App Services and deployed ASP.NET applications through Microsoft Azure Web App services.
- Created Linux virtual machines using VMware Virtual Center.
- Responsible for software installation, configuration, software upgrades, backup and recovery, commissioning and decommissioning of data nodes, cluster setup, daily cluster performance monitoring, and keeping clusters healthy on different Hadoop distributions (Hortonworks and Cloudera).
- Worked with application teams to install operating system, updates, patches, version upgrades as required.
- Created various database objects such as Tables, Views, Materialized Views, Triggers, Synonyms and Database Links as per business requirements.
- Built a web interface using Python, HTML and SQL Server that estimates the number of items from vendors based on previous sales.
- Documented software defects regarding program functionality and suggested actionable improvements to correct deficiencies.
Confidential, Chicago, IL
Hadoop Admin/ Linux Administrator
- Installation and configuration of Linux for new build environment.
- Day-to-day administration: user access, permissions, and installing and maintaining Linux servers.
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems.
- Installed and configured Cloudera CDH4 in a testing environment.
- Resolved user-submitted tickets and P1 issues, troubleshooting and fixing the underlying errors.
- Balancing HDFS manually to decrease network utilization and increase job performance.
- Responsible for building scalable distributed data solutions using Hadoop.
- Performed major and minor upgrades to the Hadoop cluster.
- Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Performed stress testing, performance testing and benchmarking of the cluster.
- Commissioned and decommissioned DataNodes in the cluster when problems occurred.
- Debugged and resolved major issues with Cloudera Manager by working with the Cloudera team.
- Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and Kickstart, including remote installation of Linux using PXE boot.
- Monitoring the System activity, Performance, Resource utilization.
- Developed and optimized the physical design of MySQL database systems.
- Deep understanding of monitoring and troubleshooting mission critical Linux machines.
- Responsible for maintaining RAID groups and LUN assignments as per agreed design documents.
- Extensive use of LVM, creating Volume Groups, Logical volumes.
- Performed Red Hat Package Manager (RPM) and YUM package installations, patching and other server management.
- Tested and performed enterprise-wide installation, configuration and support for Hadoop using the MapR distribution.
- Set up the cluster and installed all ecosystem components through MapR and manually through the command line in the lab cluster.
- Set up automated processes to archive and clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Involved in estimating and setting up the Hadoop cluster on Linux.
- Prepared Pig scripts to validate the Time Series Rollup Algorithm.
- Responsible for support and troubleshooting of MapReduce jobs and Pig jobs, and for maintaining incremental loads on a daily, weekly and monthly basis.
- Implemented Oozie workflows for MapReduce, Hive and Sqoop actions.
- Channeled MapReduce outputs based on requirements using Partitioners.
- Performed scheduled backup and necessary restoration.
- Built and maintained scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
Confidential
Linux/Unix Administrator
- Experience installing, upgrading and configuring Red Hat Linux 4.x, 5.x and 6.x using Kickstart servers and interactive installation.
- Responsible for creating and managing user accounts, security, rights, disk space and process monitoring on Solaris, CentOS and Red Hat Linux.
- Performed administration and monitored job processes using associated commands
- Managed routine system backups, scheduled jobs and enabled cron jobs.
- Maintaining and troubleshooting network connectivity
- Managed patch configuration, version control and service packs, and reviewed connectivity issues related to security problems.
- Configured DNS, NFS, FTP, remote access and security management; performed server hardening.
- Installed, upgraded and managed packages via RPM and YUM package management.
- Logical Volume Management maintenance
- Experience administering, installing, configuring and maintaining Linux
- Created Linux virtual machines using VMware Virtual Center; administered VMware Infrastructure Client 3.5 and vSphere 4.1.
- Installed firmware upgrades and kernel patches, and performed system configuration and performance tuning on Unix/Linux systems.
- Installed Red Hat Linux 5/6 using Kickstart servers and interactive installation.
- Supported an infrastructure environment comprising RHEL and Solaris.
- Installation, Configuration, and OS upgrades on RHEL 5.X/6.X/7.X, SUSE 11.X, 12.X.
- Implemented and administered VMware ESX 4.x, 5.x and 6 for running Windows, CentOS, SUSE and Red Hat Linux servers on development and test servers.
- Created, extended, reduced and administered Logical Volume Manager (LVM) volumes in RHEL environments.
- Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
