
Hadoop Admin Resume


San Francisco, CA

SUMMARY:

  • Proven experience in Hadoop administration on Cloudera (CDH), Hortonworks (HDP), vanilla Apache Hadoop and MapR distributions, along with experience in AWS, Kafka, Elasticsearch, DevOps and Linux administration.
  • Good experience as a Software Engineer across IT technologies, with solid working knowledge of Java and the Big Data Hadoop ecosystem.
  • Good experience with Hadoop infrastructure including MapReduce, Hive, Oozie, Sqoop, HBase, Pig, HDFS, YARN, Spark and Impala configuration projects in direct client-facing roles.
  • Good knowledge of data structures, algorithms, object-oriented design and data modelling.
  • Good knowledge of core Java programming, including Collections, Generics, exception handling and multithreading.
  • Good knowledge of data warehousing, ETL development, distributed computing and large-scale data processing.
  • Experience integrating Kafka with Spark Streaming for real-time data processing.
  • Good knowledge on implementation and design of big data pipelines.
  • Knowledge of installing, configuring and administering Hadoop clusters for major Hadoop distributions like CDH5 and HDP.
  • Knowledge of implementing ETL/ELT processes with MapReduce, Pig and Hive.
  • Hands-on experience with major components of the Hadoop ecosystem, including Hive, HBase, HBase/Hive integration, Sqoop and Flume, plus knowledge of the MapReduce/HDFS framework.
  • Extensive experience in installing, configuring and administering Hadoop clusters for major Hadoop distributions like CDH5 and HDP.
  • Worked on NoSQL databases including HBase, Cassandra and MongoDB.
  • Strong knowledge of creating and monitoring Hadoop clusters on VMs, Hortonworks Data Platform 2.1 and 2.2, and CDH3/CDH4 with Cloudera Manager on Linux and Ubuntu.
  • Knowledge of MS SQL Server … and Oracle …; knowledge of developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Strong knowledge in Software Development Life Cycle (SDLC)
  • Strong understanding of Agile and Waterfall SDLC methodologies.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
  • Good knowledge of creating reports using QlikView / Qlik Sense.
  • Experienced in installing, configuring and administering Hadoop clusters.

TECHNICAL SKILLS:

Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka, Hortonworks, Ambari, Knox, Phoenix, Impala, Storm.

Hadoop Distribution: Cloudera Distribution of Hadoop (CDH).

Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server

Servers: WebLogic Server, WebSphere and JBoss.

Programming Languages: Java, PL/SQL, Shell Script, Perl, Python.

Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, Ranger, TestNG, JUnit.

Database: MySQL, NoSQL, Couchbase, InfluxDB, Teradata, HBase, MongoDB, Cassandra, Oracle.

Processes: Incident Management, Release Management, Change Management.

PROFESSIONAL EXPERIENCE:

Confidential, San Francisco, CA

Hadoop Admin

Responsibilities:

  • Administration and monitoring of Hadoop.
  • Worked on the Hadoop upgrade from 4.5 to 5.2.
  • Monitored Hadoop cluster job performance and carried out capacity planning.
  • DevOps configuration management with Ansible.
  • Installed Ansible 2.3.0 in the production environment.
  • Upgraded Elasticsearch from 5.3.0 to 5.3.2 following the rolling upgrade process, using Ansible to deploy the new packages in the production cluster.
  • Implemented and managed the DevOps infrastructure architecture with Terraform, Jenkins, Puppet and Ansible; responsible for CI and CD infrastructure, processes and deployment strategy.
  • Applied DevOps and Continuous Delivery methodologies, including release process implementation, to existing build and deployment strategies.
  • Used Apache NiFi to ingest data from IBM MQ (message queues).
  • Implemented NiFi flow topologies to perform cleansing operations before moving data into HDFS.
  • Used Apache NiFi to copy data from the local file system into HDP.
  • Worked with NiFi to manage the flow of data from source to HDFS.
  • Experience with job workflow scheduling and flow-management tools like NiFi.
  • Ingested data into HDFS using NiFi with different processors and developed custom input adaptors.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms and NiFi.
  • Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS.
  • Generated consumer group lag reports from Kafka using its APIs; Kafka was used for building real-time data pipelines between clusters (a sample lag check is sketched after this list).
  • Ran log aggregation, website activity tracking and commit logs for distributed systems using Apache Kafka.
  • Integrated Apache Kafka for data ingestion.
  • Experience integrating Kafka with Spark Streaming for real-time data processing.
  • Removed retired nodes in particular security groups from Nagios monitoring.
  • Designed and implemented topic configurations in the new Kafka cluster across all environments.
  • Secured the Kafka cluster with Kerberos and also implemented Kafka security features using SSL without Kerberos; later added finer-grained security by setting up Kerberos users and groups to enable more advanced security features.
  • Responsible for managing and scheduling jobs on the Hadoop cluster.
  • Replaced retired Hadoop slave nodes through the AWS console and Nagios repositories.
  • Performed dynamic updates of Hadoop YARN and MapReduce memory settings.
  • Worked with the DBA team to migrate the Hive and Oozie metastore databases from MySQL to RDS.
  • Worked with the Fair and Capacity Schedulers: created new queues, added users to queues, increased mapper and reducer capacity, and administered permissions to view and submit MapReduce jobs.
  • Experience in administration and maintenance of source control management systems such as Git and GitHub.
  • Hands-on experience in installing and administering CI tools like Jenkins.
  • Experience in integrating Shell scripts using Jenkins
  • Installed and configured an automated tool Puppet that included the installation and configuration of the Puppet master, agent nodes and an admin control workstation.
  • Working with Modules, Classes, Manifests in Puppet.
  • Experience in creating Docker images
  • Used containerization technologies like Docker to build clusters and orchestrate container deployments.
  • Operations - Custom Shell scripts, VM and Environment management.
  • Experience in working with Amazon EC2, S3 and Glacier.
  • Created multiple groups and set permission policies for various groups in AWS.
  • Experience in creating lifecycle policies in AWS S3 for backups to Glacier (a sample policy is sketched after this list).
  • Experience in maintaining, executing, and scheduling build scripts to automate DEV/PROD builds.
  • Configured Elastic Load Balancers with EC2 Auto scaling groups.
  • Created monitors, alarms and notifications for EC2 hosts using CloudWatch.
  • Launching Amazon EC2 Cloud Instances using Amazon Images (Linux/Ubuntu) and configuring launched instances with respect to specific applications
  • Worked with IAM service creating new IAM users & groups, defining roles and policies and Identity providers
  • Experience in assigning MFA in AWS using IAM and S3 buckets.
  • Defined AWS Security Groups which acted as virtual firewalls that controlled the traffic allowed to reach one or more AWS EC2 instances.
  • Used Amazon Route 53 to manage DNS zones and to provide public DNS names for Elastic Load Balancer IPs.
  • Using default and custom VPCs to create private cloud environments with public and private subnets
  • Loaded data from Oracle, MS SQL Server, MySQL and flat file databases into HDFS and Hive.
  • Fixed NameNode partition failures, fsimage rotation issues and MR jobs failing with too many fetch failures, and troubleshot other common Hadoop cluster issues.
  • Implemented manifest files in Puppet for automated orchestration of Hadoop and Cassandra clusters.
  • Maintaining GitHub repositories for configuration management.
  • Configured distributed monitoring system Ganglia for Hadoop clusters
  • Managing cluster coordination services through ZooKeeper.
  • Configured and deployed a NameNode High Availability Hadoop cluster, SSL-enabled and Kerberized.
  • Dealt with restarts of several services and killed hung processes by PID to clear alerts.
  • Monitored log files of several services and cleared files in case of disk-space issues on the nodes.
  • Successfully upgraded Hortonworks Hadoop distribution stack from 2.3.4 to 2.5.
  • Currently working as an admin on the Hortonworks (HDP 2.2.4.2) distribution across 4 clusters ranging from POC to PROD.
  • Handled cluster administration, releases and upgrades; managed multiple Hadoop clusters on the Hortonworks distribution, the largest with a capacity of 7 PB (400+ nodes) and PAM enabled.
  • Experience in Python Scripting.
  • Orchestrated hundreds of Sqoop scripts, Python scripts and Hive queries using Oozie workflows and sub-workflows.
  • Used Change management and Incident management process following organization guidelines.
  • Set up Hortonworks infrastructure, from cluster configuration down to node-level setup.
  • Extensive experience in cluster planning, installing, configuring and administering Hadoop clusters for major Hadoop distributions like Cloudera and Hortonworks.
  • Installing, Upgrading and Managing Hadoop Cluster on Hortonworks
  • Hands on experience using Cloudera and Hortonworks Hadoop Distributions.
  • Worked on installing and configuring CDH 5.8, 5.9 and 5.10 Hadoop clusters on AWS using Cloudera Director and Cloudera Manager.
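
A minimal sketch of the consumer-group lag check referenced above, using Kafka's stock CLI; the broker address, port, group name and client properties path are hypothetical placeholders, since the resume does not name them:

    # Describe a consumer group to report per-partition CURRENT-OFFSET, LOG-END-OFFSET and LAG.
    kafka-consumer-groups.sh \
      --bootstrap-server broker01.example.com:9092 \
      --describe \
      --group clickstream-consumers

    # On an SSL/Kerberos-secured cluster, point the tool at a client properties file as well.
    kafka-consumer-groups.sh \
      --bootstrap-server broker01.example.com:9093 \
      --command-config /etc/kafka/client-ssl.properties \
      --describe --group clickstream-consumers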
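
A minimal sketch of an S3 lifecycle policy that transitions backups to Glacier, as mentioned in the AWS bullets above; the bucket name, prefix and day counts are hypothetical:

    # Transition objects under backups/ to Glacier after 30 days and expire them after 365 days.
    aws s3api put-bucket-lifecycle-configuration \
      --bucket example-backup-bucket \
      --lifecycle-configuration '{
        "Rules": [
          {
            "ID": "archive-backups-to-glacier",
            "Status": "Enabled",
            "Filter": { "Prefix": "backups/" },
            "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ],
            "Expiration": { "Days": 365 }
          }
        ]
      }'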

Confidential, Oak Book, IL

Hadoop Administrator

Responsibilities:

  • Working on 4 Hadoop clusters for different teams, supporting 50+ users of the Hadoop platform, providing training to keep Hadoop usage simple and keeping users up to date on best practices.
  • Implemented an instance of Zookeeper for Kafka Brokers.
  • Implementing Hadoop Security on Hortonworks Cluster using Kerberos and Two-way SSL
  • Experience with Hortonworks, Cloudera CDH4 and CDH5 distributions
  • Contributed to building hands-on tutorials for the community to learn how to use Hortonworks Data Platform (powered by Hadoop) and Hortonworks Dataflow (powered by NiFi) covering categories such as Hello World, Real-World use cases, Operations.
  • Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
  • Responsible for building a cluster on HDP 2.3 with Hadoop 2.2.0 using Ambari.
  • Responsible for implementation and ongoing administration of the Hadoop infrastructure.
  • Involved in Performance testing of the Production Cluster using TERAGEN, TERASORT and TERAVALIDATE.
  • Implemented commissioning and decommissioning of data nodes.
  • Involved in importing and exporting data into HDFS and Hive using Sqoop (a sample import/export is sketched after this list).
  • Experienced in managing and reviewing Hadoop log files.
  • Supported MapReduce programs running on the cluster.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Managed a 350+ node HDP 2.3 cluster with 4 petabytes of data using Ambari 2.0 on Linux CentOS 7.
  • Implemented Fair scheduler on the Resource Manager to allocate the fair amount of resources to small jobs.
  • Installed and configured Hive using the Hive Metastore, HiveServer2 and HCatalog.
  • Created a method of procedure for the Kerberos KDC cluster setup.
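
A minimal sketch of the Sqoop import/export flow referenced above; the JDBC URL, credentials file, and table and database names are hypothetical:

    # Import a MySQL table into a Hive table in a staging database.
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user \
      --password-file /user/etl/.mysql.password \
      --table orders \
      --hive-import \
      --hive-database staging \
      --hive-table orders \
      --num-mappers 4

    # Export processed results from HDFS back to the relational database.
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com:3306/sales \
      --username etl_user \
      --password-file /user/etl/.mysql.password \
      --table order_summary \
      --export-dir /user/hive/warehouse/staging.db/order_summary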

Confidential, St. Louis, MO

Bigdata Operations Engineer - Consultant

Responsibilities:

  • Responsible for implementation and ongoing administration of Hadoop infrastructure.
  • Used the Hadoop cluster as a staging environment for data from heterogeneous sources during the data import process.
  • Configured High Availability on the name node for the Hadoop cluster - part of the disaster recovery roadmap.
  • Configured Ganglia and Nagios to monitor the cluster and on-call with EOC for support.
  • Involved in work on the cloud architecture.
  • Performed both major and minor upgrades to the existing cluster, including rollbacks to the previous version.
  • Implemented Commissioning and Decommissioning of data nodes, killing the unresponsive task tracker and dealing with blacklisted task trackers.
  • Set up Hortonworks infrastructure, from cluster configuration down to node-level setup.
  • Installed the Ambari server on cloud instances.
  • Set up security using Kerberos and AD on Hortonworks clusters.
  • Designed and allocated HDFS quotas for multiple groups (sample quota commands are sketched after this list).
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log Data from Many different sources to the HDFS.
  • Involved in database backup and recovery, database connectivity and security.
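
Sample HDFS quota commands for the allocation work referenced above; the directory path and limits are hypothetical:

    # Cap the number of names (files and directories) a group's area may hold.
    hdfs dfsadmin -setQuota 1000000 /user/analytics

    # Cap the raw disk space (replication included) for the same area.
    hdfs dfsadmin -setSpaceQuota 10t /user/analytics

    # Verify current usage against both quotas.
    hdfs dfs -count -q -h /user/analytics

    # Clear the quotas if the allocation changes.
    hdfs dfsadmin -clrQuota /user/analytics
    hdfs dfsadmin -clrSpaceQuota /user/analytics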

Confidential, Chicago, IL

Hadoop Admin/ Linux Administrator

Responsibilities:

  • Installation and configuration of Linux for new build environment.
  • Handled day-to-day user access and permissions, and installed and maintained Linux servers.
  • Created volume groups, logical volumes and partitions on the Linux servers, and mounted file systems.
  • Experienced in installation and configuration of Cloudera CDH4 in the testing environment.
  • Resolved tickets submitted by users and P1 issues, troubleshooting and resolving the underlying errors.
  • Balanced HDFS manually to decrease network utilization and increase job performance (sample balancer commands are sketched after this list).
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Done major and minor upgrades to the Hadoop cluster.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Used Sqoop to import and export data between HDFS and RDBMS in both directions.
  • Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers, including remote installation of Linux via PXE boot.
  • Monitoring the System activity, Performance, Resource utilization.
  • Develop and optimize physical design of MySQL database systems.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Performed Red Hat Package Manager (RPM) and YUM package installations, patch and other server management.
  • Set up automated processes to archive/clean the unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
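
The manual HDFS balancing referenced above typically comes down to two commands; the bandwidth and threshold values below are illustrative:

    # Throttle balancer traffic per DataNode (here ~100 MB/s) so running jobs are not starved.
    hdfs dfsadmin -setBalancerBandwidth 104857600

    # Rebalance until every DataNode is within 10% of the cluster's average utilization.
    hdfs balancer -threshold 10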

Confidential

Linux/Unix Administrator

Responsibilities:

  • Experience installing, upgrading and configuring RedHat Linux 4.x, 5.x, 6.x using Kickstart Servers and Interactive Installation
  • Responsible for creating and managing user accounts, security, rights, disk space and process monitoring in Solaris, CentOS and Redhat Linux
  • Performed administration and monitored job processes using associated commands
  • Manages routine system backups, schedules jobs and enables cron jobs.
  • Maintaining and troubleshooting network connectivity
  • Manages patch configuration, version control and service packs, and reviews connectivity issues related to security problems.
  • Configures DNS, NFS, FTP, remote access and security management, and performs server hardening.
  • Installs, upgrades and manages packages via RPM and YUM package management
  • Logical Volume Management maintenance
  • Experience administering, installing, configuring and maintaining Linux
  • Creates Linux Virtual Machines using VMware Virtual Center
  • Administers VMware Infrastructure Client 3.5 and vSphere 4.1
  • Installs Firmware Upgrades, kernel patches, systems configuration, performance tuning on Unix/Linux systems
  • Installing Red Hat Linux 5/6 using kickstart servers and interactive installation.
  • Supporting infrastructure environment comprising of RHEL and Solaris.
  • Installation, Configuration, and OS upgrades on RHEL 5.X/6.X/7.X, SUSE 11.X, 12.X.
  • Implemented and administered VMware ESX 4.x, 5.x and 6 for running Windows, CentOS, SUSE and Red Hat Linux servers on development and test servers.
  • Creates, extends, reduces and administers Logical Volume Manager (LVM) volumes in the RHEL environment (sample commands are sketched after this list).
  • Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
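
Sample LVM commands for the create/extend/administer work referenced above; the disk, volume group, logical volume and mount point names are hypothetical:

    # Create a physical volume, a volume group and a 50 GB logical volume, then mount it.
    pvcreate /dev/sdb1
    vgcreate datavg /dev/sdb1
    lvcreate -L 50G -n datalv datavg
    mkfs.xfs /dev/datavg/datalv
    mkdir -p /data && mount /dev/datavg/datalv /data

    # Later, grow the volume group, the logical volume and the filesystem online.
    vgextend datavg /dev/sdc1
    lvextend -L +20G /dev/datavg/datalv
    xfs_growfs /data    # for ext4, use resize2fs /dev/datavg/datalv instead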
