
Hadoop Administrator Resume

San Francisco, CA

SUMMARY:

  • 8 years of total IT experience, including 3 years in the administration, installation, modification and maintenance of Hadoop on the Linux RHEL operating system and of Tableau.
  • Expertise in designing Big Data Systems with Cloud Service Models - IaaS, PaaS, SaaS and FaaS (Serverless)
  • Hands on experience in installation, configuration, supporting and managing Hadoop Clusters.
  • In-depth knowledge of the Hadoop ecosystem - HDFS, YARN, MapReduce, Hive, Hue, Sqoop, Flume, Kafka, Spark, Oozie, NiFi and Cassandra.
  • Experience with Ambari (Hortonworks) for management of the Hadoop ecosystem.
  • Expertise on setting up Hadoop security, data encryption and authorization using Kerberos, TLS/SSL and Apache Sentry respectively.
  • Extensive hands-on administration experience with Hortonworks.
  • Practical knowledge of the functionality of every Hadoop daemon, the interactions between them, resource utilization and dynamic tuning to keep the cluster available and efficient.
  • Designed and provisioned virtual networks on AWS using VPCs, subnets, network ACLs, internet gateways, route tables and NAT gateways.
  • Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
  • Experience in administering the Linux systems to deploy Hadoop cluster and monitoring the cluster using Nagios and Ganglia.
  • Experience in performing backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Architected and implemented automated server provisioning using Puppet.
  • Experience in performing minor and major upgrades.
  • Experience in performing commissioning and decommissioning of data nodes on Hadoop cluster.
  • Strong knowledge in configuring Name Node High Availability and Name Node Federation.
  • Familiar with writing Oozie workflows and job controllers for automating shell, Hive and Sqoop jobs.
  • Good working knowledge of OOA and OOD using UML and use-case design.
  • Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
  • Familiar with importing and exporting data between HDFS and RDBMSs (MySQL, Oracle, Teradata) using Sqoop, including fast loaders and connectors (see the sketch after this list).
  • Expertise in database performance tuning & data modeling.
  • Experience in publishing dashboards to Tableau Server and presenting them on web and desktop platforms.
  • Working experience with system engineering teams to plan and deploy Hadoop hardware and software environments.
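
A minimal sketch of the Sqoop import/export pattern referenced above, assuming a MySQL source; the connection string, credentials, tables and HDFS paths are illustrative placeholders:

    # import a table from MySQL into HDFS with 4 parallel mappers
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # export processed results back to the RDBMS
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /data/processed/order_summary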

TECHNICAL SKILLS:

Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka, Hortonworks, Ambari, Knox, Phoenix, Impala, Storm.

Hadoop Distributions: Cloudera Distribution of Hadoop (CDH), Hortonworks Data Platform (HDP).

Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server

Servers: WebLogic Server, WebSphere and JBoss.

Programming Languages: Java, PL/SQL, Shell Script, Perl, Python.

Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, Ranger, TestNG, JUnit.

Database: MySQL, NoSQL, Couchbase, InfluxDB, Teradata, HBase, MongoDB, Cassandra, Oracle.

Processes: Incident Management, Release Management, Change Management

PROFESSIONAL EXPERIENCE:

Hadoop Administrator

Confidential, San Francisco, CA

Responsibilities:

  • Installed the Hadoop cluster and worked with big data analysis tools including Hive.
  • Implemented multiple nodes on Cloudera.
  • Imported data from Linux file system to HDFS.
  • Performed data transfer from SQL databases to HBase using Sqoop.
  • Managed and reviewed Hadoop log files.
  • Worked on evaluating, architecting and installing the Hortonworks 2.1/1.8 Big Data ecosystem.
  • Successfully upgraded the Hortonworks Hadoop distribution stack from 2.3.4 to 2.5.
  • Currently working as admin on the Hortonworks (HDP 2.2.4.2) distribution for 4 clusters ranging from POC to PROD.
  • Upgraded Elasticsearch from 5.3.0 to 5.3.2 following the rolling upgrade process, using Ansible to deploy the new packages to the production cluster.
  • Created and wrote shell scripts (ksh, Bash), Ruby, Python and PowerShell for setting up baselines, branching, merging and automation processes across environments using SCM tools such as Git, Subversion (SVN), Stash and TFS on Linux and Windows platforms.
  • Worked on ETL tool Informatica, Oracle Database and PL/SQL, Python and Shell Scripts.
  • Experience in Python Scripting.
  • Orchestrated hundreds of Sqoop scripts, Python scripts and Hive queries using Oozie workflows and sub-workflows.
  • Adhered to change management and audit guidelines as well as security requirements for each customer's servers.
  • Oversaw installation, monitoring and change management for servers (overall change management for servers).
  • Used HP ServiceCenter and the change management system for ticketing.
  • Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
  • Involved in creating Spark clusters in HDInsight by creating Azure compute resources with Spark installed and configured.
  • Implemented Azure APIM modules for public-facing, subscription-based authentication and implemented a circuit breaker for fatal system errors.
  • Experience in creating and configuring Azure Virtual Networks (VNets), subnets, DHCP address blocks, DNS settings, security policies and routing.
  • Created Web App Services and deployed ASP.NET applications through Microsoft Azure Web App Services.
  • Used Change management and Incident management process following organization guidelines.
  • Used Apache NiFi to ingest data from IBM MQ (message queues).
  • Implemented NiFi flow topologies to perform cleansing operations before moving data into HDFS.
  • Began using Apache NiFi to copy data from the local file system to HDP.
  • Worked with NiFi to manage the flow of data from source systems to HDFS.
  • Experience in job workflow scheduling with tools like NiFi.
  • Ingested data into HDFS using NiFi with different processors and developed custom input adaptors.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms and NiFi.
  • Set up Hortonworks infrastructure, from configuring clusters down to individual nodes.
  • Installed Ambari Server in the cloud.
  • Set up security using Kerberos and AD on Hortonworks clusters.
  • Extensive experience in cluster planning, installing, configuring and administrating Hadoop cluster for major Hadoop distributions like Cloudera and Hortonworks.
  • Installing, upgrading and managing Hadoop clusters on Hortonworks.
  • Hands-on experience using the Cloudera and Hortonworks Hadoop distributions, including Apache Hadoop HDFS, Pig, Hive and Sqoop.
  • Maintained and backed up metadata.
  • Used data integration tools like Sqoop.
  • Improved a high-performance cache, leading to greater stability and improved performance.
  • Responsible for designing and developing business queries using Hive.
  • Designed and developed automation test scripts using Python.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Experience working with HiveQL on a day-to-day basis to retrieve data from HDFS.
  • Worked with developers to set up a full Hadoop system in an AWS environment (CDH3/CDH4 HDFS, HBase).
  • Designed and implemented topic configuration for the new Kafka cluster in all environments.
  • Successfully secured the Kafka cluster with Kerberos.
  • Implemented Kafka security features using SSL without Kerberos, then set up Kerberos with users and groups for finer-grained security, enabling more advanced security features (see the sketch after this list).
  • Installed and configured CDH 5.8, 5.9 and 5.10 Hadoop clusters on AWS using Cloudera Director and Cloudera Manager.
  • Managing, monitoring and troubleshooting the Hadoop cluster using Cloudera Manager.
  • Installed and configured RHEL6 EC2 instances for Production, QA and Development environment.
  • Installed Kerberos for authentication of application and Hadoop service users.
  • Implemented Rack Awareness in Production Environment.
  • Worked on disk space issues in the production environment by monitoring how quickly space filled and reviewing what was being logged, and created a long-term fix for the issue (minimizing INFO, DEBUG, FATAL and audit logs).
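
A minimal sketch of the Kerberos/SSL hardening steps described above for a Kafka broker; the principal, host, realm, keytab paths and listener ports are illustrative assumptions, not the actual cluster values:

    # create and export a service principal for the broker (MIT KDC)
    kadmin -q "addprinc -randkey kafka/broker01.example.com@EXAMPLE.COM"
    kadmin -q "xst -k /etc/security/keytabs/kafka.service.keytab kafka/broker01.example.com@EXAMPLE.COM"

    # server.properties additions to run a SASL_SSL listener alongside PLAINTEXT:
    #   listeners=PLAINTEXT://:9092,SASL_SSL://:9093
    #   security.inter.broker.protocol=SASL_SSL
    #   sasl.kerberos.service.name=kafka

    # smoke test as a Kerberos-authenticated client
    kinit -kt /etc/security/keytabs/appuser.keytab appuser@EXAMPLE.COM
    kafka-console-producer.sh --broker-list broker01.example.com:9093 \
      --topic smoke-test --producer.config client-sasl.properties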

Hadoop Administrator

Confidential, Philadelphia, PA

Responsibilities:

  • Working on 4 Hadoop clusters for different teams, supporting 50+ users on the Hadoop platform, providing training to make Hadoop usage simple and keeping users updated on best practices.
  • Implementing Hadoop Security on Hortonworks Cluster using Kerberos and Two-way SSL
  • Experience with Hortonworks, Cloudera CDH4 and CDH5 distributions
  • Installed a Kerberos-secured Kafka cluster without encryption on Dev and Prod, and set up Kafka ACLs on it (see the sketch after this list).
  • Successfully set up a no-authentication Kafka listener in parallel with the Kerberos (SASL) listener, and tested a non-authenticated (anonymous) user in parallel with a Kerberos user.
  • Involved in implementing security on the Hortonworks Hadoop cluster with Kerberos, working with the operations team to move the non-secured cluster to a secured cluster.
  • Contributed to building hands-on tutorials for the community on how to use Hortonworks Data Platform (powered by Hadoop) and Hortonworks DataFlow (powered by NiFi), covering categories such as Hello World, real-world use cases and operations.
  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Managed 350+ Nodes CDH cluster with 4 petabytes of data using Cloudera Manager and Linux RedHat 6.5.
  • Experienced with deployments, maintenance and troubleshooting applications on Microsoft Azure Cloud infrastructure.
  • Involved in creating Spark clusters in HDInsight by creating Azure compute resources with Spark installed and configured.
  • Implemented Azure APIM modules for public-facing, subscription-based authentication and implemented a circuit breaker for fatal system errors.
  • Experience in creating and configuring Azure Virtual Networks (VNets), subnets, DHCP address blocks, DNS settings, security policies and routing.
  • Created Web App Services and deployed ASP.NET applications through Microsoft Azure Web App Services.
  • Created Linux virtual machines using VMware Virtual Center.
  • Responsible for software installation and configuration, software upgrades, backup and recovery, commissioning and decommissioning of data nodes, cluster setup, daily cluster performance monitoring, and keeping clusters healthy on different Hadoop distributions (Hortonworks and Cloudera).
  • Worked with application teams to install operating systems, updates, patches and version upgrades as required.
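
A minimal sketch of the Kafka ACL setup mentioned above, assuming the ZooKeeper-backed authorizer; the principal, topic, consumer group and ZooKeeper host are illustrative:

    # allow a Kerberos principal to produce to and consume from a topic
    kafka-acls.sh --authorizer-properties zookeeper.connect=zk01.example.com:2181 \
      --add --allow-principal User:appuser --producer --topic app-events
    kafka-acls.sh --authorizer-properties zookeeper.connect=zk01.example.com:2181 \
      --add --allow-principal User:appuser --consumer --topic app-events --group app-consumers

    # verify the ACLs that were created
    kafka-acls.sh --authorizer-properties zookeeper.connect=zk01.example.com:2181 \
      --list --topic app-events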

Hadoop Admin

Confidential, Indianapolis, IN

Responsibilities:

  • Cluster administration, releases and upgrades: managed multiple Hadoop clusters (the largest with 7 PB capacity across 400+ nodes) with PAM enabled, on the Hortonworks distribution.
  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Used the Hadoop cluster as a staging environment for data from heterogeneous sources during the data import process.
  • Worked hands-on with the ETL process; handled importing data from various data sources and performed transformations.
  • Extensive experience with Informatica (ETL tool) for data extraction, transformation and loading.
  • Extensive experience in building data warehouses/data marts using the ETL tool Informatica PowerCenter (9.0/8.x/7.x).
  • Experience in developing Unix Shell Scripts for automation of ETL process.
  • Configured High Availability on the NameNode for the Hadoop cluster as part of the disaster recovery roadmap.
  • Configured Ganglia and Nagios to monitor the cluster and was on call with the EOC for support.
  • Involved in working on cloud architecture.
  • Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
  • Implemented commissioning and decommissioning of data nodes, killing unresponsive TaskTrackers and dealing with blacklisted TaskTrackers.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Maintained, audited and built new clusters for testing purposes using Ambari on Hortonworks.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms and NiFi.
  • Set up Hortonworks infrastructure, from configuring clusters down to individual nodes.
  • Installed Ambari Server in the cloud.
  • Set up security using Kerberos and AD on Hortonworks clusters.
  • Designed and allocated HDFS quotas for multiple groups (see the sketch after this list).
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Manually upgraded from HDP 2.2 to HDP 2.3 as part of software patches and upgrades.
  • Scripting Hadoop package installation and configuration to support fully automated deployments.
  • Configuring Rack Awareness on HDP.
  • Adding new nodes to an existing cluster and recovering from a NameNode failure.
  • Instrumental in building scalable distributed data solutions using the Hadoop ecosystem.
  • Adding new Data Nodes when needed and re-balancing the cluster.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and Extracted the data from MySQL into HDFS using Sqoop.
  • Involved in database backup and recovery, database connectivity and security.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Monitored utilization based on the running statistics of map and reduce tasks.
  • Changed cluster configuration properties based on the volume of data being processed and the performance of the cluster.
  • Provided input to development regarding the efficient utilization of resources such as memory and CPU.
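
A minimal sketch of the HDFS quota allocation mentioned above; the directory paths and limits are illustrative:

    # cap the number of names (files + directories) under a group's directory
    hdfs dfsadmin -setQuota 1000000 /data/marketing

    # cap the raw space consumed (replication counts toward the limit)
    hdfs dfsadmin -setSpaceQuota 10t /data/marketing

    # check quota usage and remaining headroom
    hadoop fs -count -q -h /data/marketing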

Hadoop Admin/ Linux Administrator

Confidential - CHICAGO, IL

Responsibilities:

  • Installation and configuration of Linux for new build environment.
  • Day-to-day user access and permissions; installing and maintaining Linux servers.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems.
  • Experienced in installation and configuration of Cloudera CDH4 in the testing environment.
  • Resolved tickets submitted by users and P1 issues, troubleshooting and resolving errors.
  • Balanced HDFS manually to decrease network utilization and increase job performance (see the sketch after this list).
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Performed major and minor upgrades to the Hadoop cluster.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Used Sqoop to import and export data between HDFS and RDBMSs.
  • Performed stress and performance testing and benchmarking for the cluster.
  • Commissioned and decommissioned data nodes in the cluster in case of problems.
  • Debugged and solved major issues with Cloudera Manager by interacting with the Cloudera team.
  • Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers, including remote installation of Linux using PXE boot.
  • Monitoring the System activity, Performance, Resource utilization.
  • Develop and optimize physical design of MySQL database systems.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Responsible for maintaining RAID groups and LUN assignments as per agreed design documents.
  • Extensive use of LVM, creating Volume Groups, Logical volumes.
  • Performed Red Hat Package Manager (RPM) and YUM package installations, patch and other server management.
  • Tested and performed enterprise-wide installation, configuration and support for Hadoop using the MapR distribution.
  • Set up the cluster and installed all the ecosystem components through MapR and manually through the command line in the lab cluster.
  • Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
  • Involved in estimating and setting up the Hadoop cluster on Linux.
  • Prepared Pig scripts to validate the time-series rollup algorithm.
  • Responsible for support and troubleshooting of MapReduce jobs and Pig jobs, and for maintaining incremental loads on a daily, weekly and monthly basis.
  • Implemented Oozie workflows for MapReduce, Hive and Sqoop actions.
  • Channeled MapReduce outputs based on requirements using partitioners.
  • Performed scheduled backup and necessary restoration.
  • Built and maintained scalable data solutions using the Hadoop ecosystem and other open-source components like Hive and HBase.
  • Monitored data streaming between web sources and HDFS.
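
A minimal sketch of the manual HDFS balancing referenced above; the bandwidth cap and threshold are illustrative values:

    # limit balancer traffic to roughly 100 MB/s per DataNode so jobs are not starved
    hdfs dfsadmin -setBalancerBandwidth 104857600

    # move blocks until every DataNode's utilization is within 10% of the cluster average
    hdfs balancer -threshold 10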

Linux/Unix Administrator

Confidential

Responsibilities:

  • Experience installing, upgrading and configuring RedHat Linux 4.x, 5.x, 6.x using Kickstart Servers and Interactive Installation
  • Responsible for creating and managing user accounts, security, rights, disk space and process monitoring in Solaris, CentOS and Redhat Linux
  • Performed administration and monitored job processes using associated commands
  • Manages routine system backups, scheduling jobs and enabling cron jobs
  • Maintaining and troubleshooting network connectivity
  • Manages patch configuration, version control and service packs, and reviews connectivity issues related to security problems
  • Configures DNS, NFS, FTP, remote access, security management and server hardening
  • Installs, upgrades and manages packages via RPM and YUM package management
  • Logical Volume Management maintenance
  • Experience administering, installing, configuring and maintaining Linux
  • Creates Linux virtual machines using VMware Virtual Center; administers VMware Infrastructure Client 3.5 and vSphere 4.1
  • Installs firmware upgrades and kernel patches, and performs system configuration and performance tuning on Unix/Linux systems
  • Installing Red Hat Linux 5/6 using kickstart servers and interactive installation.
  • Supporting infrastructure environment comprising of RHEL and Solaris.
  • Installation, Configuration, and OS upgrades on RHEL 5.X/6.X/7.X, SUSE 11.X, 12.X.
  • Implemented and administered VMware ESX 4.x 5.x and 6 for running the Windows, Centos, SUSE and Red Hat Linux Servers on development and test servers.
  • Creates, extends, reduces and administers Logical Volume Manager (LVM) volumes in the RHEL environment (see the sketch after this list).
  • Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
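
A minimal sketch of the LVM administration described above on RHEL; the device names, volume group, sizes and mount point are illustrative:

    pvcreate /dev/sdb                          # initialize a new disk as a physical volume
    vgcreate data_vg /dev/sdb                  # create a volume group on it
    lvcreate -L 50G -n app_lv data_vg          # carve out a 50 GB logical volume
    mkfs.ext4 /dev/data_vg/app_lv              # build a filesystem on the logical volume
    mount /dev/data_vg/app_lv /apps            # mount it (persist via /etc/fstab)
    lvextend -L +20G -r /dev/data_vg/app_lv    # later: grow by 20 GB and resize the filesystem in one step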
