
Hadoop Administrator Resume

Westborough, MA

PROFESSIONAL SUMMARY:

  • Highly skilled Hadoop Administrator with 12+ years of overall experience in the IT industry, including 4+ years in Hadoop administration and big data technologies, with expertise in Linux/system administration.
  • Hands-on experience installing, configuring, supporting and managing Hadoop clusters using the Cloudera (CDH 5.x) and Hortonworks (HDP 2.x) distributions, both on bare metal and in the cloud (AWS & Rackspace).
  • Supported and maintained Hadoop ecosystem components such as HDFS, YARN, MapReduce, HBase, Oozie, Hive, Sqoop, Pig, Flume, KTS, KMS, Sentry, SmartSense, Storm, Kafka, Ranger, Falcon and Knox
  • Experienced in Hadoop cluster capacity planning, performance tuning, monitoring and troubleshooting
  • Experienced in setting up High Availability (HA) for diverse services in the ecosystem
  • Experienced in setting up backup and recovery policies to ensure high availability of clusters
  • Strong understanding of Hadoop security concepts and implementations - Authentication (Kerberos/AD/LDAP/KDC), Authorization (Sentry/Ranger) and Encryption (KTS/KMS)
  • Monitored workload and job performance and performed capacity planning using Cloudera Manager/Ambari
  • Involved in enforcing standards and guidelines for the big data platform during production move
  • Responsible for onboarding new tools/technologies onto the platform
  • Experience in designing and developing data ingestion using Apache NiFi, Apache Camel, Spark Streaming, Kafka, Flume, Sqoop and shell scripts
  • Experienced in analyzing log files for Hadoop ecosystem services and determining root causes using Splunk
  • Experienced with various tools like Clarify, JIRA, ServiceNow, New Relic, Splunk and Serena TeamTrack, and test management tools like the Rational suite of products, MS Test Manager and HP ALM
  • Experience in monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network.
  • Proficient in OS upgrades and patch loading as and when required.
  • Expert in setting up SSH, SCP and VSFTP connectivity between UNIX hosts (see the sketch following this summary).
  • Work closely with different teams such as application, networking, Linux system administration and enterprise service monitoring teams
  • Excellent communication, analytical, interpersonal, presentation and leadership skills
  • Excellent problem resolution and root cause analysis skills and techniques
  • Good team player with excellent communication & Client interface skills.
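
As an illustration of the SSH/SCP setup noted above, a minimal sketch for enabling passwordless connectivity between two UNIX hosts; the user and host names are placeholders:

    # Generate a key pair on the source host (no passphrase, for automation)
    ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""

    # Copy the public key to the target host to enable passwordless SSH/SCP
    ssh-copy-id admin@remote-host

    # Verify the connection, then transfer a file with SCP
    ssh admin@remote-host 'hostname'
    scp /tmp/healthcheck.log admin@remote-host:/tmp/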

TECHNICAL SKILLS:

BIG Data Ecosystem: HDFS, MapReduce, Spark, Pig, Hive, HBase, Sqoop, ZooKeeper, Sentry, Ranger, Storm, Kafka, Oozie, Flume, Hue, Knox, NiFi

BIG Data Security: Kerberos, AD, LDAP, KTS, KMS, Redaction, Sentry, Ranger, NavEncrypt, SSL/TLS, Cloudera Navigator

Cloud: AWS, EC2, S3, ELB, VPC, EMR

Languages/Scripting: Java, Python, VBScript, Shell, Batch, SQL, .NET, JavaScript

Other Tools: GitHub, Maven, Puppet, Chef, Clarify, JIRA, Quality Center, Rational suite of products, MS Test Manager, TFS, Jenkins, Confluence, Splunk, New Relic

Operating Systems: Windows, Linux, Unix, Mac

Databases: Oracle (9i, 10g), MS SQL Server, MySQL

Tools: MS Office, Cygwin, SVN, Eclipse

PROFESSIONAL EXPERIENCE:

Confidential, Westborough, MA

Hadoop Administrator

Responsibilities:

  • Maintained and supported Cloudera Hadoop clusters in production, development & sandbox environments
  • Experienced in installing, upgrading and managing Cloudera distribution Hadoop clusters
  • Managed and reviewed Hadoop and other ecosystem log files using Splunk
  • Provisioned, installed, configured, monitored and maintained HDFS, YARN, HBase, Flume, Sqoop, Sentry, Oozie, Pig, Hive and Kafka
  • Installed and configured multiple versions of Spark (1.6 and 2.0) used by different use cases
  • Worked along with developers on tuning Spark jobs for optimal configuration
  • Involved in resolving and troubleshooting cluster related issues
  • Installed new services and tools using Cloudera packages as well as parcels
  • Implemented authentication using Kerberos with AD/LDAP on the Cloudera Hadoop cluster
  • Implemented authorization and enforced security policies using Apache Sentry (see the sketch after this list)
  • Implemented encryption for data at rest using KTS/KMS and NavEncrypt
  • Implemented log-level masking of sensitive data using redaction
  • Developed a POC to evaluate whether Apache NiFi was a fit for importing data for the QFRM project
  • Involved in upgrades of Cloudera Manager and CDH
  • Resolved node failure issues and troubleshooting of common Hadoop cluster issues
  • Derived operational insights across the cluster to identify anomalies and perform health checks
  • Configured alert mechanism using SNMP traps for Cloudera Hadoop distribution
  • Involved in onboarding new users and use cases onto the CDH platform
  • Involved in deploying use cases onto production environment
  • Worked with development teams to triage issues and implement fixes on Hadoop environment and associated applications
  • Continuous monitoring and managing the Hadoop cluster through Cloudera Manager
  • Governed data on the CDH cluster using Navigator for continuous optimization, audit, metadata management and policy enforcement
  • Provisioned various HDInsight clusters for differentiated workloads (Spark, H2O, R, HBase) on the Azure portal
  • Provisioned an HDInsight + H2O cluster for data science prediction modeling.
  • Built a customized data-science HDInsight cluster with all requisite Python libraries
  • Created HDInsight clusters with different storage backends - WASB (Blob Storage) and ADLS (Azure Data Lake Store)
  • Explored template-based HDInsight cluster deployments for commissioning and decommissioning.
  • Explored runbook-based HDInsight cluster deployments for automated commissioning and decommissioning.
  • Implemented encryption on Azure WASB storage (data at rest)
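
As an illustration of the Sentry work above, a minimal sketch of role-based authorization granted through beeline on a Kerberos-secured cluster; the host, realm, database and group names are placeholders:

    # Connect to HiveServer2 with a Kerberos principal (hypothetical host and realm)
    beeline -u "jdbc:hive2://hs2-host:10000/default;principal=hive/hs2-host@EXAMPLE.COM"

    -- Inside beeline: Sentry maps privileges to roles, and roles to AD/LDAP groups
    CREATE ROLE analyst_role;
    GRANT ROLE analyst_role TO GROUP analysts;
    GRANT SELECT ON DATABASE sales TO ROLE analyst_role;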

Environment: CDH 5.x, Azure HDInsight 3.5/3.6, H2O, Spark, MapReduce, Hive, Pig, ZooKeeper, NiFi, HBase, Flume, Sqoop, Kerberos, Sentry, CentOS

Confidential, Beaverton, OR

Hadoop Administrator

Responsibilities:

  • Responsible for support, implementation and ongoing administration of Hadoop infrastructure on HDP platform
  • Coordinated with use case teams on new tools and services for deployment on the HDP platform
  • Involved in setting up Kerberos security for authentication
  • Involved in setting up and configuring Apache Ranger for authorization
  • Cluster maintenance as well as creation and removal of nodes using Ambari
  • Performance tuning of Hadoop clusters and jobs related to Hive and Spark
  • Monitored Hadoop cluster connectivity and security
  • Collaborated with application teams such as LSA and networking teams to install operating system and Hadoop updates, patches and version upgrades as required
  • Involved in upgrades of Ambari and HDP
  • Involved in onboarding of new users and use cases to the HDP Platform
  • Involved in enabling HA on various services on HDP
  • Planned requirements for migrating users to production in advance to avoid last-minute access issues
  • Planned and implemented data migration from the existing development cluster to the production cluster
  • Installed and configured Hadoop ecosystem components like PySpark, Hive, Sqoop, ZooKeeper, Oozie, Ranger, Falcon
  • Prepared multi-cluster test harness to exercise the system for performance, failover and upgrades
  • Involved in migrating data from production to development clusters using distcp (see the sketch after this list)
  • Configured Ganglia, which included installing the gmond and gmetad daemons that collect metrics across the distributed cluster and present them in real time
  • Continuous monitoring and managing the Hadoop cluster through Ganglia and Nagios.
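
As an illustration of the distcp migration above, a minimal sketch; the principal, NameNode hosts and paths are placeholders:

    # Obtain a Kerberos ticket before copying between secured clusters
    kinit hdfsadmin@EXAMPLE.COM

    # Copy a dataset from the production cluster to the development cluster;
    # -update skips files that already exist, -p preserves file attributes
    hadoop distcp -update -p \
        hdfs://prod-nn:8020/data/projects/sales \
        hdfs://dev-nn:8020/data/projects/sales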

Environment: HDP 2.5, HDFS, MapReduce, YARN, Spark, Hive, Pig, Flume, Oozie, Sqoop, Ambari.

Confidential, Dallas, TX

Hadoop Admin

Responsibilities:

  • Worked as admin on the Hortonworks (HDP) distribution for 4 clusters ranging from POC to PROD.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning of data nodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Experienced in adding/installing new components and removing them through Ambari.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Designed and implemented deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Hands-on experience with cluster upgrades and patch upgrades without data loss, backed by proper backup plans.
  • Changed configurations based on user requirements to improve job performance.
  • Experienced in configuring Ambari alerts for various components and managing the alerts.
  • Provided security and authorization with Ranger, where Ranger Admin handles administration and Ranger Usersync adds new users to the cluster.
  • Good troubleshooting skills on Hue, which provides a GUI for developers/business users for day-to-day activities.
  • Experience configuring Ranger and Knox to provide security for Hadoop services.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schemas for analysis.
  • Implemented NameNode HA in all environments to provide high availability of clusters.
  • Created queues and allocated cluster resources to prioritize jobs.
  • Experienced in setting up projects and volumes for new projects.
  • Involved in snapshots and mirroring to maintain backups of cluster data, including remotely (see the sketch after this list).
  • Implemented SFTP for projects to transfer data from external servers to cluster servers.
  • Experienced in managing and reviewing log files.
  • Working experience creating and maintaining MySQL databases, setting up users, and backing up cluster metadata databases with cron jobs.
  • Set up MySQL master-slave replication and helped business applications maintain their data in MySQL servers.
  • Managed and reviewed log files as part of administration for troubleshooting purposes. Communicated and escalated issues appropriately.
  • Involved in analyzing system failures, identifying root causes, and recommending courses of action.
  • Monitored multiple cluster environments using Ambari alerts, Ambari Metrics and Nagios.
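
As an illustration of the snapshot-based backups above, a minimal sketch using HDFS snapshots; the paths, dates and snapshot names are placeholders:

    # Allow snapshots on the project directory (run as the HDFS superuser)
    hdfs dfsadmin -allowSnapshot /data/projects/finance

    # Take a dated snapshot before a risky change
    hdfs dfs -createSnapshot /data/projects/finance backup_$(date +%Y%m%d)

    # List available snapshots and restore a file from one if needed
    hdfs dfs -ls /data/projects/finance/.snapshot
    hdfs dfs -cp /data/projects/finance/.snapshot/backup_20170301/report.csv \
        /data/projects/finance/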

Environment: HDFS, MapReduce, Hive, Scala, Pig, Flume, Nagios, Ranger, Knox, Oozie, Sqoop, Eclipse, Hortonworks, Ambari

Confidential

System Admin

Responsibilities:

  • Racked servers, loaded the Red Hat Enterprise Linux operating system (via Kickstart) and maintained HP, IBM and Dell servers.
  • Helped run long-term Hadoop MapReduce workload batches.
  • Helped automate Hadoop batches with shell scripts and scheduled them using the cron utility (see the sketch after this list).
  • Changed permissions, ownership and groups of files/folders.
  • Managed disk space using Logical Volume Management (LVM) and monitored virtual memory, swap space, disk utilization and CPU utilization. Monitored system performance using Nagios.
  • Generated monthly performance reports and updated procedural & process documents.
  • Experience in system builds, server installs, upgrades, patches, migration, troubleshooting, security, backup, disaster recovery, performance monitoring and fine-tuning of systems.
  • Experience in deploying virtual machines using templates and cloning, and taking backups with snapshots. Moved VMs and datastores using vMotion and Storage vMotion in a VMware environment.
  • Set up, configured and maintained UNIX and Linux servers, RAID subsystems, and desktop/laptop machines, including installation/maintenance of all operating system and application software.
  • Installed, configured and maintained Ethernet hubs/switches/cables, new machines, hard drives, memory, and network interface cards.
  • Managed software licenses, monitored network performance and application usage, and made software purchases.
  • Provided user support including troubleshooting, repairs and documentation, and developed a support website.
  • Configured and upgraded large disk volumes (SAS/SATA)
  • Planned and executed network security and emergency contingency programs.
  • Responsible for meeting with clients and gathering business requirements for projects.
  • Created, configured and managed standard virtual switches.
  • Created and configured vSwitch port groups and NIC teaming.
  • Knowledge of resource handling, memory management techniques, Fault Tolerance and Update Manager.
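
As an illustration of the cron-scheduled Hadoop batches above, a minimal sketch; the script path, jar, driver class and HDFS paths are hypothetical:

    #!/bin/bash
    # /opt/scripts/run_batch.sh - hypothetical wrapper for a nightly MapReduce batch
    set -euo pipefail
    LOG=/var/log/hadoop-batch/run_$(date +%Y%m%d).log

    # Submit the packaged MapReduce job and capture its output in the log
    hadoop jar /opt/jobs/etl.jar com.example.etl.Driver \
        /data/incoming /data/processed >> "$LOG" 2>&1

    # Crontab entry to run the batch every night at 2:30 AM:
    # 30 2 * * * /opt/scripts/run_batch.sh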

Environment: Red Hat Linux 5.x, HP & Dell servers, Oracle, open-source Hadoop, VMware ESX 4.x, VMware vSphere, Bash, shell scripting, Nagios.

Confidential

Linux Admin

Responsibilities:

  • Administration of RHEL, which includes installation, testing, tuning, upgrading and loading patches, troubleshooting both physical and virtual server issues.
  • Created and cloned Linux virtual machines.
  • Installed Red Hat Linux using Kickstart and applied security policies for hardening the servers based on company policies.
  • Performed RPM and YUM package installations, patching and other server management.
  • Managed routine system backups, scheduled jobs (such as disabling and enabling cron jobs), and enabled system logging and network logging of servers for maintenance, performance tuning and testing.
  • Performed tech and non-tech refreshes of Linux servers, including new hardware, OS upgrades, application installation and testing.
  • Set up user and group login IDs, printing parameters, network configuration, passwords, and user and group quotas, and resolved permission issues.
  • Installed MySQL on Linux and customized MySQL database parameters.
  • Worked with the ServiceNow incident tool.
  • Created physical volumes, volume groups and logical volumes (see the sketch after this list).
  • Configured Samba servers with Samba clients.
  • Knowledge of iptables and SELinux.
  • Migrated existing Linux file systems to standard ext3.
  • Configured and administered NFS, FTP, Samba and NIS.
  • Maintained DNS, DHCP and Apache services on Linux machines.
  • Installed, configured and supported Apache on Linux production servers.
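
As an illustration of the LVM work above, a minimal sketch; the device name, volume names and mount point are placeholders:

    # Create a physical volume on a new disk
    pvcreate /dev/sdb

    # Create a volume group and a 50G logical volume for application data
    vgcreate appvg /dev/sdb
    lvcreate -L 50G -n appdata appvg

    # Build an ext3 filesystem (matching the ext3 standard noted above) and mount it
    mkfs.ext3 /dev/appvg/appdata
    mkdir -p /appdata
    mount /dev/appvg/appdata /appdata

    # Later, extend the volume and grow the filesystem
    lvextend -L +10G /dev/appvg/appdata
    resize2fs /dev/appvg/appdata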

Environment: Red Hat Enterprise Linux servers (HP ProLiant DL585, BL/ML series), SAN (NetApp), VERITAS Cluster Server 5.0, Windows 2003 Server, shell programming.
