Hadoop Admin Resume
SUMMARY:
- Over 7 years of experience, including 4+ years with the Hadoop ecosystem, installing and administering UNIX/Linux servers and configuring Hadoop ecosystem components in existing clusters.
- Experience in configuring, installing and managing MapR, Hortonworks & Cloudera Distributions.
- Hands-on experience installing, configuring, monitoring, and using Hadoop components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, Oozie, Apache Spark, and Impala.
- Working experience building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring.
- Experience in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios.
- Hands-on experience configuring Hadoop clusters both in professional on-premises environments and on Amazon Web Services (AWS) using EC2 instances.
- Good knowledge of data warehousing, ETL development, distributed computing, and large-scale data processing.
- Good knowledge of the design and implementation of big data pipelines.
- Worked in agile projects delivering an end-to-end continuous integration/continuous delivery pipeline by integrating tools like Jenkins, Puppet, and AWS for VM provisioning.
- Experience with the complete software development lifecycle, including design, development, testing, and implementation of moderately to highly complex systems.
- Hands-on experience in installation, configuration, support, and management of Hadoop clusters using the Apache, Hortonworks, Cloudera, and MapR distributions.
- Extensive experience in installing, configuring and administrating Hadoop cluster for major Hadoop distributions like CDH5 and HDP.
- Excellent knowledge of NoSQL databases like HBase and Cassandra.
- Experience in administration of Kafka and Flume streaming using Cloudera Distribution.
- Experience in job scheduling with the Fair, Capacity, and FIFO schedulers, and in inter-cluster data copying with the DistCp tool.
- Experience in designing and implementing secure Hadoop clusters using Kerberos (a sketch follows this summary).
- Experience in AWS CloudFront, including creating and managing distributions to provide access to S3 buckets or HTTP servers running on EC2 instances.
- Involved in implementing security on HDP and HDF Hadoop clusters, with Kerberos for authentication, Ranger for authorization, and LDAP integration for Ambari, Ranger, and NiFi.
- Responsible for support of Hadoop Production environment which includes Hive, YARN, Spark, Impala, Kafka, SOLR, Oozie, Sentry, Encryption, HBase, etc.
- Migrating applications from existing systems like MySQL, Oracle, DB2 and Teradata to Hadoop.
- Good hands-on experience in Linux administration and in troubleshooting network- and OS-level issues.
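A minimal sketch of the Kerberos workflow behind the secure-cluster work above, assuming an MIT KDC; the principal, realm, and keytab path are placeholders:

  # Obtain a ticket from a service keytab, then verify authenticated HDFS access
  kinit -kt /etc/security/keytabs/svc.keytab svc@EXAMPLE.COM
  klist                  # confirm the ticket was granted
  hdfs dfs -ls /user     # succeeds only with a valid Kerberos ticket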
TECHNICAL SKILLS:
Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka, Hortonworks, Ambari, Knox, Phoenix, Impala, Storm.
Hadoop Distribution: Cloudera Distribution of Hadoop (CDH), Hortonworks HDP (Ambari 2.6.5)
Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server
Servers: WebLogic, WebSphere, and JBoss application servers; Tomcat and Nginx web servers.
Programming Languages/Scripting: Java, PL/SQL, Shell Script, Perl, Python.
Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, Ranger, TestNG, JUnit.
Databases: MySQL, Oracle, Teradata, InfluxDB; NoSQL: HBase, MongoDB, Cassandra, Couchbase.
Processes: Systems Administration, Incident Management, Release Management, Change Management.
WORK EXPERIENCE:
Hadoop Admin
Confidential
Responsibilities:
- Maintenance of the cluster: adding and removing nodes using cluster monitoring tools (Cloudera Manager and Ganglia), configuring NameNode high availability, and tracking all running jobs (node removal is sketched at the end of this section).
- Implementation, management, and administration of the overall ecosystem.
- Day-to-day running of the clusters, ensuring availability on a continuous basis.
- Capacity planning and estimating requirements for increasing or decreasing cluster capacity.
- Management and review of log files.
- Implementation of recovery and backup tasks.
- Implementation of resource and security management.
- Built new Dev/QA environments on a new Cloudera 5.15.2 Hadoop cluster.
- Enabled Kerberos and TLS security.
- Created policies for role-based access control (RBAC).
- Created AD groups for the respective teams.
- Built logical and physical diagrams for Dev/QA and provided hardware recommendations.
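A hedged sketch of the node-removal step referenced above; the hostname and exclude-file path are placeholders, and on Cloudera-managed clusters the same flow is usually driven from Cloudera Manager:

  # Mark the DataNode for decommissioning in the HDFS exclude file
  echo "dn07.example.com" >> /etc/hadoop/conf/dfs.exclude
  # Ask the NameNode to re-read its host lists and start draining blocks
  hdfs dfsadmin -refreshNodes
  # Watch progress until the node reports Decommissioned
  hdfs dfsadmin -report | grep -A 4 dn07.example.com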
Hadoop Admin
Confidential, Chandler, AZ
Responsibilities:
- Deployed HDInsight clusters in Microsoft Azure, ranging from Dev clusters to the Prod cluster.
- Implemented the Enterprise Security Package (ESP) on the HDInsight cluster.
- Involved in creating role-based access policies.
- Created keytabs for all application-specific service accounts (sketched at the end of this section).
- Involved in cluster tuning for Hive, Tez, YARN, HDFS, and Spark.
- Configured Hive LLAP for better query performance.
- Worked closely with the developer team to troubleshoot all ongoing issues.
- Worked on RStudio connectivity to a secured HDP 3.1 cluster.
- Worked on Zeppelin with the Spark (Livy) and Hive interpreters.
- Installed and configured Kerberos clients on user machines to connect to the secured Hadoop cluster.
- Installed ggplot2 libraries on all R nodes so the analytics team could develop graphs in Zeppelin.
- Involved in performance tuning of the spark.livy2 interpreter while troubleshooting ongoing issues.
- Worked on Spark SQL and Hive query execution.
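A minimal sketch of the keytab-creation step above, assuming an MIT KDC with admin rights; principals and paths are placeholder assumptions:

  # Export a keytab for an application service account
  kadmin -p admin/admin@EXAMPLE.COM -q "ktadd -k /etc/security/keytabs/appsvc.keytab appsvc@EXAMPLE.COM"
  # Lock the keytab down to the service account and verify its contents
  chown appsvc /etc/security/keytabs/appsvc.keytab
  chmod 400 /etc/security/keytabs/appsvc.keytab
  klist -kt /etc/security/keytabs/appsvc.keytab
  # Smoke-test authentication with the new keytab
  kinit -kt /etc/security/keytabs/appsvc.keytab appsvc@EXAMPLE.COM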
Hadoop Admin
Confidential, Denver, CO
Responsibilities:
- Installed a Hadoop cluster on Hortonworks distribution HDP 2.6.
- Implemented LDAP integration with Ambari.
- Responsible for cluster maintenance: commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, and managing and reviewing Hadoop log files.
- Working as a Hadoop administrator on the Hortonworks distribution of Hadoop.
- Planned and installed Hadoop clusters from PoC through Prod environments.
- Implemented HA for NameNode, YARN, Hive, and Spark.
- Upgraded the Big Data DEV & PROD clusters from HDP 2.3.x to HDP 2.6.x.
- Upgraded the Big Data DEV & PROD clusters from Ambari 2.1.x to 2.6.x.x.
- Installed an Elasticsearch cluster using Ansible.
- Installation, configuration, and administration experience with the Hortonworks Ambari big data platform and Apache Hadoop on Red Hat and CentOS as data storage, retrieval, and processing systems.
- Experience in designing, installing, configuring, supporting, and managing Hadoop clusters using the Apache and Hortonworks distributions.
- Created an Ansible playbook for Elasticsearch configuration.
- Troubleshot and debugged cluster problems on RHEL-based Linux infrastructure.
- Configured the YARN Capacity Scheduler based on infrastructure needs and performed YARN tuning (sketched at the end of this section).
- Reviewed the NameNode web UI for information about DataNode volume failures.
- Purged older log files.
- Implemented ACLs and audits with Ranger to meet compliance specifications.
- Installed Kafka Manager to manage and monitor the Kafka cluster, including monitoring consumer lag (a lag-check sketch follows below).
Environment: Hortonworks, Hive, HDFS, Impala, Spark, YARN, Kafka, Solr, Oozie, Sentry, Encryption, HBase.
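A hedged sketch of the Capacity Scheduler tuning above; the queue names and percentages are illustrative assumptions:

  # Illustrative capacity-scheduler.xml queue settings (edited via Ambari on HDP):
  #   yarn.scheduler.capacity.root.queues          = etl,adhoc
  #   yarn.scheduler.capacity.root.etl.capacity    = 70
  #   yarn.scheduler.capacity.root.adhoc.capacity  = 30
  # Apply the queue changes without restarting the ResourceManager
  yarn rmadmin -refreshQueues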
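A sketch of the consumer-lag check noted above; the broker host and group name are placeholders:

  # Show per-partition offsets and lag for one consumer group
  kafka-consumer-groups.sh --bootstrap-server broker01.example.com:9092 \
    --describe --group etl-consumers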
Hadoop Admin
Confidential, St. Louis, MO
Responsibilities:
- Deployed a Cloudera Distribution Hadoop cluster and installed ecosystem components on Linux servers: HDFS, YARN, ZooKeeper, HBase, Hive, MapReduce, Pig, Kafka, Confluent Kafka, Storm, and Spark.
- Configured the Capacity Scheduler on the ResourceManager to provide a way to share large cluster resources.
- Deployed NameNode high availability for the major production cluster.
- Installed Kafka cluster with separate nodes for brokers.
- Experienced in writing automatic scripts for monitoring the file systems and key MapR services.
- Configured Oozie for workflow automation and coordination.
- Troubleshot production-level issues in the cluster and its functionality.
- Backed up data on a regular basis to a remote cluster using DistCp (sketched at the end of this section).
- Involved in installing and configuring Confluent Kafka in the R&D line and validating the installation with the HDFS and Hive connectors.
- Set up the lab cluster and installed all ecosystem components, both through MapR and manually through the command line.
- Implemented high-availability and automatic-failover infrastructure, utilizing ZooKeeper services, to remove the NameNode as a single point of failure.
- Used Sqoop to connect to Oracle, MySQL, and Teradata and move data into Hive/HBase tables (sketched at the end of this section).
- Used Apache Kafka to import real-time network log data into HDFS.
- Performed disk space management for users and groups in the cluster.
- Used Storm and Kafka services to push data to HBase and Hive tables.
- Documented slides and presentations on the Confluence page.
- Used Kafka to allow a single cluster to serve as the central data backbone for a large organization.
- Added nodes to the cluster and decommissioned nodes from it whenever required.
- Used the Sqoop and DistCp utilities for data copying and migration.
- Worked on end-to-end data flow management from sources to a NoSQL database (MongoDB) using Oozie.
- Worked with the Continuous Integration team to set up GitHub for scheduling automatic deployments of new and existing code in production.
- Monitored multiple Hadoop cluster environments using Nagios; monitored workload, job performance, and capacity planning using MapR Control System.
- Monitored Hadoop cluster connectivity and security.
- Managed and reviewed Hadoop log files.
- Performed file system management and monitoring.
- Used the Avro SerDe packaged with Hive for serialization and deserialization, to parse the contents of streamed log data.
Environment: Kafka, HBase, Hive, Pig, Sqoop, Yarn, Apache Oozie workflow scheduler, Flume, Zookeeper, Kerberos, Nagios, MapR.
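A minimal sketch of the DistCp backup flow above; the NameNode hosts and paths are placeholder assumptions:

  # Copy a day's data to the remote cluster; -update skips files that are
  # already present and unchanged, -pbp preserves block size and permissions
  hadoop distcp -update -pbp \
    hdfs://prod-nn.example.com:8020/data/events/2019-06-01 \
    hdfs://dr-nn.example.com:8020/backup/events/2019-06-01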
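A sketch of the Sqoop ingestion pattern above; the JDBC URL, credentials handling, and table names are placeholders:

  # Import an Oracle table into Hive; -P prompts for the password
  sqoop import \
    --connect jdbc:oracle:thin:@db01.example.com:1521/ORCL \
    --username etl_user -P \
    --table CUSTOMERS \
    --hive-import --hive-table stage.customers \
    --num-mappers 4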
Hadoop Admin
Confidential, Atlanta, GA
Responsibilities:
- Experience in managing scalable Hadoop cluster environments.
- Involved in managing, administering and monitoring clusters in Hadoop Infrastructure.
- Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Collaborating with application teams to install operating system and Hadoop updates, patches, and version upgrades when required.
- Experience in HDFS maintenance and administration.
- Managing nodes on Hadoop cluster connectivity and security.
- Experience in commissioning and decommissioning of nodes from cluster.
- Experience in Name Node HA implementation.
- Working on architecting solutions that process massive amounts of data on corporate and AWS cloud-based servers.
- Working with data delivery teams to setup new Hadoop users.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
- Configured the Hive Metastore for the Hadoop ecosystem and management tools.
- Hands-on experience with the Nagios and Ganglia monitoring tools.
- Experience in HDFS data storage and support for running MapReduce jobs.
- Performing tuning and troubleshooting of MR jobs by analyzing and reviewing Hadoop log files.
- Installing and configuring Hadoop ecosystem components like Sqoop, Pig, Flume, and Hive.
- Maintaining and monitoring clusters; loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop (a Flume sketch follows this section).
- Experience in using DistCp to migrate data within and across clusters.
- Installed and configured ZooKeeper.
- Hands-on experience in analyzing log files for Hadoop ecosystem services.
- Coordinate root cause analysis efforts to minimize future system issues.
- Troubleshooting of hardware issues, working closely with various vendors on hardware, OS, and Hadoop issues.
Environment: Cloudera 4.2, HDFS, Hive, Pig, Sqoop, HBase, Chef, RHEL, Mahout, Tableau, MicroStrategy, Shell Scripting, Red Hat Linux.
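A hedged sketch of the Flume flow above (local files spooled into HDFS); the agent name, directories, and paths are assumptions:

  # Contents of /etc/flume/conf/spool-to-hdfs.conf (placeholder path):
  #   a1.sources  = src1
  #   a1.channels = ch1
  #   a1.sinks    = sink1
  #   a1.sources.src1.type     = spooldir
  #   a1.sources.src1.spoolDir = /var/data/incoming
  #   a1.sources.src1.channels = ch1
  #   a1.channels.ch1.type     = memory
  #   a1.sinks.sink1.type      = hdfs
  #   a1.sinks.sink1.hdfs.path = /data/landing/%Y-%m-%d
  #   a1.sinks.sink1.hdfs.useLocalTimeStamp = true
  #   a1.sinks.sink1.channel   = ch1
  # Start the agent with that configuration
  flume-ng agent -c /etc/flume/conf -f /etc/flume/conf/spool-to-hdfs.conf -n a1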
Linux Systems Administrator
Confidential
Responsibilities:
- Installation, configuration, and troubleshooting of Red Hat 4.x and 5.x, Ubuntu 12.x, and HP-UX 11.x on various hardware platforms.
- Installed and configured JBoss application server on various Linux servers
- Configured Kickstart for RHEL (4 and 5), Jumpstart for Solaris, and NIM for AIX to perform image installations over the network.
- Involved in configuring and troubleshooting multipathing on Solaris and bonding on Linux servers.
- Configured DNS Servers and Clients, and involved in troubleshooting DNS issues
- Worked with Red Hat Linux tools like RPM to install packages and patches on Red Hat Linux servers.
- Developed scripts to automate administration tasks such as customizing user environments, and performed monitoring and tuning with nfsstat, netstat, iostat, and vmstat (a monitoring sketch follows this section).
- Performed network and system troubleshooting activities under day to day responsibilities.
- Created Virtual server on VMware ESX/ESXi based host and installed operating system on Guest Servers.
- Installed and configured RPM packages using the YUM package manager.
- Involved in developing custom scripts using Shell (bash, ksh) to automate jobs.
- Defined and developed plans for Change, Problem, and Incident Management processes based on ITIL.
- Networking communication skills and protocols such as TCP/IP, Telnet, FTP, NDM, SSH, and rlogin.
- Installed, configured, maintained, and administered the network servers DNS, NIS, NFS, and Sendmail, and the application servers Apache and JBoss, on Linux.
- Extensive use of Logical Volume Manager (LVM): creating volume groups and logical volumes, and disk mirroring, on HP-UX, AIX, and Linux (an LVM sketch follows this section).
- Worked with clusters, adding multiple IP addresses to servers via virtual network interfaces to minimize network traffic (load-balancing and failover clusters).
- User and security administration: adding users to the system and password management.
- System monitoring and log management on UNIX and Linux servers, including crash and swap management, password recovery, and performance tuning.
Environment: Red hat, Ubuntu, JBoss, RPM, VMware, ESXi, LVM, DNS, NIS, HP-UX.
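A minimal sketch of the kind of performance-monitoring script described above; the log path and sampling intervals are illustrative:

  #!/bin/bash
  # Append a timestamped CPU/memory/disk/NFS snapshot to a log for trend review
  LOG=/var/log/sys-snapshot.log    # placeholder path
  {
    date
    vmstat 1 3         # CPU and memory, three one-second samples
    iostat -dx 1 3     # per-device I/O utilization
    nfsstat -c         # NFS client call statistics
  } >> "$LOG" 2>&1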
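A sketch of the LVM mirroring workflow above; device names, sizes, and the mount point are placeholders:

  # Build a mirrored logical volume from two new disks, then mount it
  pvcreate /dev/sdb /dev/sdc
  vgcreate appvg /dev/sdb /dev/sdc
  lvcreate -L 50G -m1 -n applv appvg    # -m1 keeps one mirror copy
  mkfs.ext4 /dev/appvg/applv
  mount /dev/appvg/applv /app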