Hadoop Admin Resume
San Francisco, CA
SUMMARY:
- Overall 7 years of professional IT experience, including 2+ years of hands-on experience in Hadoop administration using Cloudera (CDH) and Hortonworks (HDP) distributions on large distributed clusters.
- Hands-on experience deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper, HBase) using Cloudera Manager and Hortonworks Ambari.
- Hands-on experience in Big Data technologies/frameworks like Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, NoSQL, Flume, and Oozie.
- Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
- Proficient with application servers such as WebSphere, WebLogic, JBoss, and Tomcat.
- Performed administrative tasks on Hadoop Clusters using Cloudera/Hortonworks.
- Hands-on experience with Hadoop clusters using Hortonworks (HDP), Cloudera (CDH3, CDH4), Oracle Big Data, and YARN distribution platforms.
- Experience designing, configuring, and managing backup and disaster recovery for Hadoop data.
- Experience in administering Tableau and Greenplum database instances in various environments.
- Experience in administration of Kafka and Flume streaming using Cloudera Distribution.
- Good experience in creating various database objects like tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
- Responsible for configuring, managing, and administering VPCs, EC2, RDS, CloudFront, CloudWatch, S3, and ELB, and providing application deployment support with Chef on AWS Cloud.
- Implemented an OpenVPN solution to connect remote users to the AWS VPC and on-premise DC; responsible for administering and maintaining it.
- Hands on experience on configuring a Hadoop cluster in a professional environment and on Amazon Web Services (AWS) using an EC2 instance.
- Experience in managing the Hadoop MapR infrastructure with MCS.
- Good understanding of deploying Hadoop clusters using automated Puppet scripts.
- Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
- Designing and implementing security for Hadoop cluster with Kerberos secure authentication.
- Hands on experience on Nagios and Ganglia tool for cluster monitoring system.
- Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring and Fine-tuning on Linux (RHEL) systems.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
PROFESSIONAL EXPERIENCE:
Confidential, San Francisco, CA
Hadoop Admin
- Designed and implemented Kafka topics in the new Kafka cluster across all environments.
- Successfully secured the Kafka cluster with Kerberos.
- Administration and monitoring of Hadoop clusters.
- Worked on Hadoop upgrade from version 4.5 to 5.2.
- Monitor Hadoop cluster job performance and capacity planning
- Removed retired nodes of particular security groups from Nagios monitoring.
- Responsible for managing and scheduling jobs on Hadoop Cluster
- Replaced retired Hadoop slave nodes through the AWS console and Nagios repositories.
- Performed dynamic updates of Hadoop Yarn and MapReduce memory settings
- Worked with DBA team to migrate Hive and Oozie meta store Database from MySQL to RDS
- Worked with fair and capacity schedulers: creating new queues, adding users to queues, increasing mapper and reducer capacity, and administering permissions to view and submit MapReduce jobs.
- Experience in administration/maintenance of source control management systems such as Git and GitHub.
- Hands on experience in installing and administrating CI tools like Jenkins
- Experience in integrating Shell scripts using Jenkins
- Installed and configured the automation tool Puppet, including installation and configuration of the Puppet master, agent nodes, and an admin control workstation. Worked with modules, classes, and manifests in Puppet.
- Experience in creating Docker images
- Used containerization technologies like Docker to build clusters and orchestrate container deployment.
- Operations - Custom Shell scripts, VM and Environment management.
- Experience working with Amazon EC2, S3, and Glacier.
- Created multiple groups and set permission policies for various groups in AWS.
- Experience creating lifecycle policies in AWS S3 for backups to Glacier.
- Experience in maintaining, executing, and scheduling build scripts to automate DEV/PROD builds.
- Configured Elastic Load Balancers with EC2 Auto scaling groups.
- Created monitors, alarms and notifications for EC2 hosts using Cloudwatch.
- Launched Amazon EC2 cloud instances using Amazon Machine Images (Linux/Ubuntu) and configured launched instances with respect to specific applications.
- Worked with the IAM service: creating new IAM users and groups, defining roles and policies, and configuring identity providers.
- Experience in assigning MFA in AWS using IAM and S3 buckets.
- Defined AWS Security Groups which acted as virtual firewalls that controlled the traffic allowed to reach one or more AWS EC2 instances.
- Used Amazon Route 53 to manage DNS zones and provide public DNS names for Elastic Load Balancer IPs.
- Using default and custom VPCs to create private cloud environments with public and private subnets
- Loaded data from Oracle, MS SQL Server, MySQL, Flat File database into HDFS, HIVE
- Fixed NameNode partition failures, fsimage rotation issues, and MR jobs failing with too many fetch failures; troubleshot common Hadoop cluster issues.
- Implemented manifest files in puppet for automated orchestration of Hadoop and Cassandra clusters
- Maintained GitHub repositories for configuration management.
- Configured distributed monitoring system Ganglia for Hadoop clusters
- Managed cluster coordination services through ZooKeeper.
- Configured and deployed a NameNode High Availability Hadoop cluster with SSL and Kerberos enabled.
- Restarted various services and killed hung processes by PID to clear alerts.
- Monitored log files of several services and cleared files in case of disk-space issues on these nodes.
- Provided 24x7 production support on a weekly schedule with the Ops team.
- Implemented a real-time log analytics pipeline using Confluent Kafka, Storm, Elasticsearch, Logstash, Kibana, and Greenplum.
- Maintained the Elasticsearch cluster and Logstash nodes processing around 5 TB of data daily from various sources like Kafka, Kubernetes, etc.
- Performed data transfer from SQL to HBase using Sqoop.
- Worked on installing and configuring of CDH 5.8, 5.9 and 5.10 Hadoop Cluster on AWS using Cloudera Director, Cloudera Manager.
- Managed, monitored, and troubleshot the Hadoop cluster using Cloudera Manager, including the flow of data from source to HDFS.
- Experience with job workflow scheduling tools like NiFi.
- Ingested data into HDFS using NiFi with different processors; developed custom input adaptors.
- Created a POC on Hortonworks and suggested best practices in terms of the HDP and HDF platforms and NiFi.
- Used Cloudera Navigator for data governance: audit and lineage.
- Configured Apache Sentry for fine-grained authorization and role-based access control of data in Hadoop.
- Created the AWS VPC network for the Installed Instances and configured the Security Groups and Elastic IP's accordingly
- Monitoring performance and tuning configuration of services in Hadoop Cluster.
- Imported the data from relational databases into HDFS using Sqoop.
- Deployed Spark Cluster and other services in AWS using console.
- Installed a Kerberos-secured Kafka cluster with no encryption on Dev and Prod and set up Kafka ACLs on it.
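The dynamic YARN and MapReduce memory updates mentioned above come down to simple per-node sizing arithmetic. A minimal sketch — the node size, OS/daemon reserve, and 80% heap ratio are illustrative assumptions, not values from this resume:

```python
# Sketch: derive common YARN/MapReduce memory settings from node hardware.
# The 128 GB node, 16 GB reserve, and 16 vcores below are hypothetical.

def yarn_memory_settings(node_ram_gb: int, reserved_gb: int, vcores: int) -> dict:
    """Return a dict of common YARN/MapReduce memory properties in MB."""
    usable_mb = (node_ram_gb - reserved_gb) * 1024
    container_mb = usable_mb // vcores            # one container per vcore
    return {
        "yarn.nodemanager.resource.memory-mb": usable_mb,
        "yarn.scheduler.minimum-allocation-mb": container_mb,
        "mapreduce.map.memory.mb": container_mb,
        "mapreduce.reduce.memory.mb": container_mb * 2,   # reducers get 2x
        # JVM heap is conventionally ~80% of the container limit
        "mapreduce.map.java.opts": f"-Xmx{int(container_mb * 0.8)}m",
    }

settings = yarn_memory_settings(node_ram_gb=128, reserved_gb=16, vcores=16)
print(settings["yarn.nodemanager.resource.memory-mb"])  # 114688
```

In practice these values would be pushed through Cloudera Manager or Ambari rather than edited by hand on each node.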
Confidential, St. Louis, Missouri
Hadoop Admin
- Created stored procedures in MySQL Server to perform result-oriented tasks
- Debugged and modified PL/SQL packages, procedures, and functions for resolving production issues daily, along with writing PL/SQL code from scratch for new requirements.
- Installed and configured Hadoop, MapReduce, and HDFS.
- Developed multiple MapReduce jobs using Java API for data cleaning and preprocessing.
- Importing and exporting data into HDFS and HIVE from an Oracle 11g database using Sqoop
- Responsible for managing data coming from different sources.
- Monitoring the running MapReduce programs on the cluster
- Working on 4 Hadoop clusters for different teams, supporting 50+ users to use Hadoop platform, provide training to users to make Hadoop usability simple and updating them for best practices.
- Worked on ETL tool Informatica, Oracle Database and PL/SQL, Python and Shell Scripts.
- Experience with ETL working with Hive and Map-Reduce.
- Involved in database design, creating Tables, Views, Stored Procedures, Functions, Triggers and Indexes. Strong experience in Data Warehousing and ETL using Datastage.
- Implementing Hadoop Security on Hortonworks Cluster using Kerberos and Two-way SSL
- Experience with Hortonworks, Cloudera CDH4 and CDH5 distributions
- Involved in implementing security on the Hortonworks Hadoop cluster using Kerberos, working with the operations team to move the non-secured cluster to a secured cluster.
- Installed a Kerberos-secured Kafka cluster with no encryption on Dev and Prod and set up Kafka ACLs on it.
- Set up a no-authentication Kafka listener in parallel with the Kerberos (SASL) listener and tested non-authenticated (anonymous) users in parallel with Kerberos users.
- Installed and configured Confluent Kafka in R&D line. Validated the installation with HDFS connector and Hive connectors.
- Contributed to building hands-on tutorials for the community to learn how to use Hortonworks Data Platform (powered by Hadoop) and Hortonworks Dataflow (powered by NiFi) covering categories such as Hello World, Real-World use cases, Operations.
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Managed 350+ Nodes CDH cluster with 4 petabytes of data using Cloudera Manager and Linux RedHat 6.5.
- Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
- Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
- Implemented Azure APIM modules for public-facing, subscription-based authentication; implemented a circuit breaker for fatal system errors.
- Experience in creating and configuring Azure Virtual Networks (VNets), subnets, DHCP address blocks, DNS settings, security policies, and routing.
- Created Web App Services and deployed Asp.Net applications through Microsoft Azure Web App services.
- Created various database objects like Tables, Views, Materialized Views, Triggers, Synonyms, and Database Links as per business requirements.
- Built Web interface using Python, HTML, SQL Server which gives approximate number of items from vendors depending on previous sales
- Documented software defects regarding program functionality and suggested actionable improvements to correct deficiencies.
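Sqoop imports like the Oracle-to-HDFS/Hive loads described above are driven by a handful of CLI flags. A sketch that assembles such a command — the JDBC URL, table, and paths are hypothetical placeholders:

```python
# Sketch: assemble a Sqoop import command for pulling an Oracle table
# into HDFS/Hive. All connection details below are hypothetical.

def sqoop_import_cmd(jdbc_url: str, username: str, table: str,
                     target_dir: str, mappers: int = 4) -> list:
    """Build the argv list for a parallel Sqoop import."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", "/user/etl/.sqoop.pwd",  # avoid plain-text passwords
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),              # parallelism across mappers
        "--hive-import",                            # load straight into Hive
    ]

cmd = sqoop_import_cmd("jdbc:oracle:thin:@db-host:1521/ORCL",
                       "etl", "ORDERS", "/data/orders")
print(" ".join(cmd))
```

On a real cluster the resulting command would be run on an edge node (or handed to `subprocess.run(cmd)`), with the split column chosen to give the mappers balanced ranges.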
Confidential, PALO ALTO, CA
Hadoop Admin/ Linux Administrator
- Installation and configuration of Linux for new build environment.
- Day-to-day user access and permissions; installing and maintaining Linux servers.
- Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems.
- Installed and configured Cloudera CDH4 in the testing environment.
- Resolved user-submitted tickets and P1 issues, troubleshooting and resolving errors.
- Validated web services manually and through groovy script automation using SOAP UI.
- Implementing End to End automation tests by consuming the APIs of different layers.
- Involved in using Postman tool to test SOA based architecture for testing SOAP services and REST API.
- Used Maven to build and run the Selenium automation framework.
- Framework used to send the automation reports over email.
- Balancing HDFS manually to decrease network utilization and increase job performance.
- Responsible for building scalable distributed data solutions using Hadoop.
- Performed major and minor upgrades to the Hadoop cluster.
- Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Performed stress and performance testing and benchmarking for the cluster.
- Commissioned and decommissioned DataNodes in the cluster as needed when problems arose.
- Debugged and resolved major Cloudera Manager issues by interacting with the Cloudera team.
- Involved in estimating and setting up the Hadoop cluster on Linux.
- Prepared PIG scripts to validate Time Series Rollup Algorithm.
- Responsible for support and troubleshooting of MapReduce jobs and Pig jobs, and maintaining incremental loads on a daily, weekly, and monthly basis.
- Implemented Oozie workflows for Map Reduce, Hive and Sqoop actions.
- Channelized MapReduce outputs based on requirements using custom Partitioners.
- Performed scheduled backup and necessary restoration.
- Build and maintain scalable data using the Hadoop ecosystem and other open source components like Hive and HBase.
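Oozie workflows for Sqoop actions, as mentioned above, typically take a shape like the following sketch. The app name, connect string, and paths are hypothetical placeholders, not values from this resume:

```xml
<!-- Sketch: Oozie workflow with a single Sqoop action; all names are illustrative. -->
<workflow-app name="daily-import" xmlns="uri:oozie:workflow:0.5">
    <start to="sqoop-import"/>
    <action name="sqoop-import">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:oracle:thin:@db-host:1521/ORCL --table ORDERS --target-dir /data/orders</command>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Sqoop import failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

For the daily/weekly/monthly incremental loads, an Oozie coordinator would wrap this workflow with a frequency and a dataset dependency.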
Confidential
Linux/Unix Administrator
- Experience installing, upgrading and configuring RedHat Linux 4.x, 5.x, 6.x using Kickstart Servers and Interactive Installation
- Responsible for creating and managing user accounts, security, rights, disk space and process monitoring in Solaris, CentOS and Redhat Linux.
- Experience in writing Scripts in Bash for performing automation of various tasks.
- Experience in writing Shell scripts using bash for process automation of databases, applications, backup and scheduling to reduce both human intervention and man hours.
- Remote system administration via tools like SSH and Telnet
- Extensive use of crontab for job automation.
- Installed and configured Selenium WebDriver, TestNG, and Maven, and created Selenium automation scripts in Java using TestNG prior to the next quarter release.
- Developed Python Scripts (automation scripts) for stability testing.
- Experience administering, installing, configuring and maintaining Linux
- Created Linux virtual machines using VMware vCenter; administered VMware Infrastructure Client 3.5 and vSphere 4.1.
- Installed firmware upgrades and kernel patches; performed systems configuration and performance tuning on Unix/Linux systems.
- Installing Red Hat Linux 5/6 using kickstart servers and interactive installation.
- Supporting infrastructure environment comprising of RHEL and Solaris.
- Installation, Configuration, and OS upgrades on RHEL 5.X/6.X/7.X, SUSE 11.X, 12.X.
- Implemented and administered VMware ESX 4.x 5.x and 6 for running the Windows, Centos, SUSE and Red Hat Linux Servers on development and test servers.
- Created, extended, reduced, and administered Logical Volume Manager (LVM) volumes in the RHEL environment.
- Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
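Puppet manifests of the kind described above declare packages, services, and cron jobs as resources. A minimal hypothetical sketch — the class name, package, and backup schedule are illustrative, not taken from this resume:

```puppet
# Sketch: baseline manifest for a Hadoop node; all names are illustrative.
class hadoop_base {
  # Keep clocks in sync -- HDFS and Kerberos both depend on it.
  package { 'ntp':
    ensure => installed,
  }
  service { 'ntpd':
    ensure  => running,
    enable  => true,
    require => Package['ntp'],
  }
  # Scheduled backup, in the spirit of the backup/restoration duties above.
  cron { 'nightly_backup':
    command => '/usr/local/bin/backup.sh',
    user    => 'root',
    hour    => 2,
    minute  => 0,
  }
}
```

Applying the class from `site.pp` (`include hadoop_base`) lets the agent nodes converge to this state on every Puppet run.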