IT Professional with 6+ years of experience as a Unix/Linux administrator and Hadoop administrator, with prowess in build and configuration management. An effective communicator and swift autodidact, constantly striving for the growth of an innovative, technologically-driven organization.
CORE SKILLS AND KNOWLEDGE:
Hadoop Eco-system components: Hadoop, Map-Reduce, Yarn, Hive, Sqoop, Flume, Impala, Oozie, Spark, Kafka, Cassandra
Security tools: Kerberos, Sentry, HDFS Encryption, Navigator Encryption, KTS & KT KMS, Ranger, Knox
Operating Systems: Linux (CentOS 6/7, Red Hat 6/7)
Source Control: GitLab, Bitbucket
Programming Languages: Shell scripting, Python, HiveQL, SparkSQL
Monitoring and Alerting: Nagios, Ganglia, Ambari, Cloudera Manager, MapR Spyglass
Automation: Ansible, Jenkins
Confidential, Salt Lake City, UT
- Involved in collecting requirements from developers, listing the sources of data and deploying data pipelines using StreamSets accordingly
- Installing, configuring and upgrading StreamSets Data Collector per developers' requirements, and configuring Data Collectors to work with DPM and the MapR cluster
- Involved with the developers' team to analyze the slow performance of MapReduce jobs and debug the issues
- Design and configure the Data Lake with technologies in Hadoop ecosystem including but not limited to HDFS, YARN, Spark, Hive, Kafka, Flume, Sqoop, Hue and Sentry.
- Worked on upgrading the MapR core and ecosystem components, volumes and mirroring setup, along with upgrading client/edge and POSIX nodes
- Debugging MapR-FS core issues with gdb, along with mirroring and volume issues; working with the MapR support team and applying patches to mitigate them
- Prepared the Production and DR environments to be DR ready by upgrading volumes and configuring with proper snapshot policy
- Used network monitoring daemons such as Ganglia and set up Nagios alerts for cluster utilization
- Configure and build solutions using tools such as the ELK (Elasticsearch, Logstash, Kibana) stack to monitor, alert and report on metrics and logs, using MapR's Spyglass along with the clush tool for admin tasks
- Installing, configuring and load-testing Kafka following best practices, with monitoring via Kafka Manager; set up customized Grafana dashboards using JMX metrics from the Kafka brokers
- Setting up NFS mounts for archival and Data storage to other teams in Enterprise
- Revamping old hardware by provisioning it to build new beta-level clusters for different teams
- Automating OS installation with Kickstart configuration
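A minimal Kickstart fragment of the kind used for that sort of automated OS install might look like the following (all values here are illustrative, not the actual site configuration):

```
# Illustrative ks.cfg fragment -- hypothetical values
install
text
lang en_US.UTF-8
keyboard us
timezone --utc America/Denver
rootpw --iscrypted $6$examplehash
bootloader --location=mbr
clearpart --all --initlabel
autopart
%packages
@core
%end
```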
- Automating core Hadoop installation using MapR Stanzas (Ansible) and Spyglass for monitoring
- Setting up an advanced analytics environment for data scientists with RHadoop and integrating it with the existing MapReduce framework; installing and configuring Spark 2.1.0 and integrating it with R, RStudio Server, sparklyr, SparkR and Hive along with Hue, and improving job performance on Spark
- Setting up the Simility stack environment for fraud detection (test and production environments)
- Installing and configuring Webserver, Cassandra, OrientDB, Redis, Kafka
- Setting up the Google Stackdriver agent for monitoring
- Setting up Dashboards using Druid by consuming data from different topics of Kafka in the back-end
- Participate in the PoC of Hadoop distributions (Cloudera, Hortonworks and MapR) to select solutions that meet business requirements.
- Assist in the exploration of newer technologies such as Docker, Mesos, Kubernetes and Jenkins to support and enhance the Big Data environment
- Setup a Testing Environment using Docker by an Ansible role to leverage for Playbook Development.
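A sketch of an Ansible play that launches such a throwaway Docker test node (container name and image are hypothetical) using the `docker_container` module:

```yaml
# Hypothetical play: spin up a disposable CentOS container for playbook testing
- hosts: localhost
  tasks:
    - name: Start a CentOS 7 test container
      docker_container:
        name: ansible-test-node    # illustrative name
        image: centos:7
        command: /usr/sbin/init
        state: started
```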
- Setup of Data Lake using Cloudera Distribution, end to end from meeting cluster node pre-requisites, Automated Shell script Validations to Services setup and Security config (Kerberos, LDAP Auth, TLS, HDFS Encryption, KTS, KT-KMS)
- Worked with the network, AD, firewall and F5 teams on different requirements for the Data Lake setup
- Automate some of the cluster setup, developer requirements, maintenance processes, validation reports and other routine Data Lake tasks by developing Ansible playbooks and Python and shell scripts accordingly
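One of the shell validation helpers mentioned above could be sketched as follows; the function name and the swappiness limit are illustrative, not the actual site checks:

```shell
#!/bin/sh
# check_max: print PASS when a measured value is at or below a limit, FAIL otherwise.
# Usage: check_max <label> <actual> <limit>
check_max() {
  if [ "$2" -le "$3" ]; then
    echo "PASS: $1 ($2 <= $3)"
  else
    echo "FAIL: $1 ($2 > $3)"
  fi
}

# Hadoop nodes usually want low swappiness; 10 is an illustrative limit.
check_max "vm.swappiness" "$(cat /proc/sys/vm/swappiness)" 10
```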
Confidential, Irving, TX
- Worked with the admin & operations team to add new nodes to the production cluster before the freeze period
- Involved firewall validation and pre-installation verification and validation checks to ensure the new nodes ran smoothly alongside the current environment
- Worked with admin & operations team to apply and test new patch by working with developer's team and preparing relevant documentation
- Worked with the developers' team to enable their unsecured Flafka data pipelines to run in a secured environment by assisting with creating topics, certificates, keystores, truststores and Kerberos tickets, along with the relevant Flume configuration, documenting each stage
Confidential, Santa Clara, CA
- Involved in collecting requirements from developers, listing the sources of data and implement different data pipelines accordingly
- Involved with website development team and platform team, to send click stream data to Kafka cluster and Hadoop
- Involved with developer’s team to integrate Omniture products in the whole data pipeline
- Involved with the platform team on issues in transferring data to different teams' Hadoop clusters for further processing using Sqoop
- Loaded data into the cluster from dynamically generated files using Flume
- Planning cluster capacity for the data, analytics and platform teams for efficient use
- Providing maintenance support and upgrades to all Production, QA and Dev clusters
- Flume configuration for the transfer of data from the webservers to the HDFS
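A representative Flume agent configuration for that webserver-to-HDFS flow (agent name, log path and NameNode hostname are illustrative):

```
# Illustrative flume.conf: tail web server access logs into HDFS
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/data/weblogs/%Y/%m/%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
```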
- Providing Spark support for data scientists on Analytics cluster and improving job performance on Spark
- Create Hive queries to perform data completeness, correctness, data transformation and data quality testing
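Data completeness and quality checks of that kind can be sketched in HiveQL (the table and column names below are hypothetical):

```sql
-- Illustrative checks against a hypothetical target table
-- Completeness: row count of the loaded target
SELECT COUNT(*) AS target_rows FROM clickstream_target;

-- Correctness: nulls in a required column
SELECT COUNT(*) AS missing_user_ids
FROM clickstream_target
WHERE user_id IS NULL;

-- Quality: duplicates on the business key
SELECT event_id, COUNT(*) AS dup_count
FROM clickstream_target
GROUP BY event_id
HAVING COUNT(*) > 1;
```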
- Involved with the developers' team to analyze the slow performance of MapReduce jobs and debug the issues
- Used network monitoring daemons such as Ganglia, with cluster utilization and service monitoring using Nagios
- Knowledge in creating shell scripts for automating routine tasks that pave the way for the installation process
- Monitor System health and logs and respond accordingly to any warning or failure conditions
- Provided system administration services like backup, monitoring, installation, configuration and user permissions
- Monitoring System Metrics and logs for any problems
- Applied Operating System updates, patches and configuration changes
- Supporting In-house trading products built on Java
- Managed file space and created logical volumes, extended file systems using LVM
- Automating certain tasks by writing shell scripts
- Performed daily maintenance of servers and tuned systems for optimum performance by turning off unwanted peripherals and vulnerable services
- Managed RPM packages for Linux distributions
- Perform regular security monitoring to identify any possible intrusions