Senior Hadoop Administrator Resume

Phoenix, AZ

SUMMARY

  • 6+ years of professional IT experience across all phases of the Software Development Life Cycle, including hands-on experience in Big Data analytics.
  • 6+ years of experience installing, configuring, and testing Hadoop ecosystem components across distributions.
  • 6+ years of comprehensive experience as a techno-functional Hadoop data analyst in the finance sector.
  • Skilled in writing business and system specifications and in designing Use Case, Activity, and Interaction (Sequence & Collaboration) diagrams.
  • Capable of processing huge amounts of structured, semi-structured and unstructured data.
  • Experience with Sequence files, Parquet and JSON formats and compression.
  • Expertise in Hive Query Language, debugging Hive issues, and Hive and Hadoop security.
  • Very good understanding on NoSQL databases like MongoDB, Cassandra and HBase.
  • Skilled in creating Oozie workflows and coordinators to schedule recurring jobs in place of cron (see the coordinator sketch after this list).
  • Experienced in writing custom UDFs and UDAFs to extend Hive and Pig core functionality.
  • Strong in HBase-related architecture design, such as batch data analysis and near-real-time data systems.
  • Experience in using various network protocols like HTTP, UDP, POP, FTP, TCP/IP, and SMTP.
  • Familiar with various Hadoop distributions: MapR, Cloudera, and Apache.
  • Experience in managing Hadoop clusters using Cloudera Manager.
  • Installed and monitored Hadoop cluster resources using Grafana, Splunk, Ganglia, and Nagios.
  • Expertise in cluster coordination services through Zookeeper.
  • Managing the cluster resources by implementing Fair scheduler and Capacity scheduler.
  • Working knowledge with ETL application architecture, including data ingestion/transformation pipeline design, data modeling and data mining, machine learning, and advanced data processing.
  • Experienced in implementing Puppet, Salt, and Chef; used JIRA for bug and issue tracking.
  • Used the HBase bulk-load API to load pre-generated HFiles into HBase for fast access to a large customer base without degrading performance (see the bulk-load sketch after this list).
  • Excellent analytical, problem solving, communication and interpersonal skills with ability to interact with individuals at all levels and can work as a part of a team as well as independently.
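
Below is a minimal sketch of the Oozie coordinator approach referenced above: submitting a cron-style coordinator instead of a crontab entry. The server URL, property values, and job.properties contents are illustrative assumptions, not details from the clusters described here.

    #!/usr/bin/env bash
    # Submit an Oozie coordinator that replaces a cron entry.
    # OOZIE_URL is a hypothetical placeholder.
    OOZIE_URL="http://oozie-host:11000/oozie"

    # job.properties would set oozie.coord.application.path to the coordinator
    # definition in HDFS and a cron-style frequency, e.g. "0 2 * * *".
    oozie job -oozie "$OOZIE_URL" -config job.properties -run

    # Confirm the coordinator is running:
    oozie jobs -oozie "$OOZIE_URL" -jobtype coordinator -filter status=RUNNING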
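
And a sketch of the HBase bulk load mentioned above: HFiles produced by a MapReduce job are moved directly into the table's regions, bypassing the normal write path. The HDFS path and table name are hypothetical.

    #!/usr/bin/env bash
    # Bulk-load pre-generated HFiles into an HBase table.
    # HFILE_DIR and TABLE are hypothetical placeholders.
    HFILE_DIR="hdfs:///tmp/bulkload/customer_hfiles"
    TABLE="customer_profile"

    # completebulkload moves HFiles into the region directories instead of
    # replaying every write through the WAL and memstore.
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles "$HFILE_DIR" "$TABLE"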

TECHNICAL SKILLS

Hadoop Ecosystem Components: MapR, HDFS, MapReduce, Pig, Hive, ZooKeeper, HBase, YARN, Spark, Storm, Kafka, Rev R, Splunk

Languages and Technologies: Core Java, C, C++

OS, Tools & Methodologies: Windows, UNIX, Linux (Ubuntu, Fedora), Mac OS, MS Office 2010, NetBeans, JIRA, Jenkins, Eclipse, Adobe Professional, Rational Rose (RR), MS Visio, Agile, Waterfall, Scrum, RUP, RAD

Scripting Languages: Bash, Python and Shell scripting

Networking: TCP/IP Protocol, Switches & Routers, OSI Architecture, HTTP, NTP & NFS

Databases: MySQL; NoSQL (MongoDB, Cassandra, HBase)

IDE: Eclipse

Other: Puppet, Salt, Chef, Stack IQ, Nagios, Ganglia, Splunk

PROFESSIONAL EXPERIENCE

Senior Hadoop Administrator

Confidential, Phoenix, AZ

Responsibilities:

  • Responsible for supporting and maintaining 3000+ infrastructure servers hosting critical real-time applications with zero application downtime on MapR clusters.
  • Responsible for cluster maintenance, monitoring, and troubleshooting; managed and reviewed data backups, logs, and snapshots to ensure high availability of clusters.
  • Good knowledge of MapR architecture components: Warden, node labels, MCS, CLDB, storage pools, volumes, NFS, DataXfer, snapshots, and mirrors.
  • Upgraded and patched clusters and ecosystem components (MapR plus Hadoop ecosystem components including Hive, Spark, Pig, Sqoop, Oozie, and Tez) at periodic intervals or as recommended to keep the environment current, secure, and capable.
  • Proactively analyzed infrastructure failures, identified root causes, recommended courses of action, and monitored application trends via BI tools (MicroStrategy, Splunk, Datameer, and Kibana) to avoid potential impacts that could cause global outages.
  • Ensured critical application data and configurations were mirrored and snapshotted to a remote Disaster Recovery (DR) cluster for business continuity of tier-1 applications during a disaster event.
  • Coordinated with various critical use case teams to test their applications in Disaster Recovery cluster in case of Production outages.
  • Actively performed benchmark tests on MapR clusters to analyze compatibility with various system hardware configurations and cluster properties, making performance recommendations based on the results.
  • Involved in annual Hadoop cluster capacity planning and expansion, adding multiple Cisco UCS servers to each cluster.
  • Resolved tickets, escalations, and incidents created in ServiceNow and JIRA through root cause analysis, in adherence to SLA, quality, process, and security standards, to meet business requirements.
  • Debugged user issues across Hadoop ecosystem tools such as Hive, Spark, HBase, Kafka, Oozie, and Tez.
  • Implemented custom scripts to automate day-to-day BAU activities, such as auto-healing alarms in MCS.
  • Automated collection of the system stats needed to identify node issues and avoid cluster outages.
  • Used shell scripts to alert on long-running jobs and jobs writing heavy logging to HDFS, with emails sent automatically to the responsible use-case teams (see the alert sketch after this list).
  • Used a Python script to automate monthly CISO security patching, applied to Hadoop clusters in a rolling fashion each month to fix vulnerabilities listed by the security team (a rolling-patch sketch follows this list).
  • Implemented custom Puppet scripts to replicate the configuration of existing node sets onto future nodes.
  • Performed a POC and implemented high availability for the History Server and Spark History Server.
  • Deployed multiple Spark versions in the same clusters to provide different feature sets to use cases.
  • Set up high availability for the Oozie and Tez UIs.
  • Implemented CGroups and CAS on data nodes to increase node performance.
  • Commissioned and decommissioned queues per business requirements (a queue-refresh sketch follows this list).
  • Actively engaged with third-party tools in the Big Data and MapR ecosystem, e.g., Jethro, R, Unravel, Pepperdata, and Dr. Elephant.
  • Performed cluster upgrades from MapR 4.0.2 to 5.2, 5.2 to 5.2.2, and 5.2.2 to 6.1.
  • Experience creating and managing VIPs with F5 BIG-IP and HAProxy.
  • Actively engaged in and managed ITSM processes: Major Incident Management, Release Management, Event Management, and Configuration Management. Handled inevitable IT service disruptions across a heterogeneous environment including MapR Hadoop, VMware, Cisco UCS systems, and Cisco enterprise networks, driving technical teams toward quicker service restoration.
  • Performed regular ongoing cluster maintenance, health checks, and data node commissioning and balancing.
  • Performed knowledge transfer for offshore Hadoop L1 support, including documentation on environments, monitoring requirements, and access and communication processes.
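
As referenced in the list above, a minimal sketch of the long-running-job alert. The threshold, recipient address, and the parsing of the YARN CLI output are assumptions for illustration.

    #!/usr/bin/env bash
    # Alert on YARN applications running longer than MAX_HOURS.
    # MAX_HOURS and the recipient address are hypothetical placeholders.
    MAX_HOURS=6
    NOW_MS=$(($(date +%s) * 1000))

    # Skip the two header lines of `yarn application -list`, keep the app IDs.
    yarn application -list -appStates RUNNING 2>/dev/null | awk 'NR>2 {print $1}' |
    while read -r app_id; do
        # Start-Time is reported in epoch milliseconds.
        start_ms=$(yarn application -status "$app_id" 2>/dev/null |
                   awk -F' : ' '/Start-Time/ {print $2}')
        [ -z "$start_ms" ] && continue
        elapsed_h=$(( (NOW_MS - start_ms) / 3600000 ))
        if [ "$elapsed_h" -ge "$MAX_HOURS" ]; then
            echo "Application $app_id has been running for ${elapsed_h}h" |
                mail -s "Long-running job: $app_id" hadoop-oncall@example.com
        fi
    done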
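
A bash sketch of the rolling monthly patch loop (the production automation was a Python script); the host list, yum invocation, and readiness check are placeholders.

    #!/usr/bin/env bash
    # Apply security patches node by node so the cluster keeps serving.
    # cluster_nodes.txt and the patch command are hypothetical placeholders.
    while read -r node; do
        echo "Patching $node ..."
        ssh "$node" "sudo yum -y update --security" || {
            echo "Patch failed on $node; stopping roll-out" >&2
            exit 1
        }
        # The connection drops as the node reboots, so ignore ssh's exit code.
        ssh "$node" "sudo reboot" || true
        sleep 60
        # Wait for the node to come back before moving to the next one.
        until ssh -o ConnectTimeout=5 "$node" true 2>/dev/null; do
            sleep 30
        done
    done < cluster_nodes.txt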
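
Finally, a sketch of the queue commissioning flow mentioned above: Capacity Scheduler queues are added or retired by editing capacity-scheduler.xml and refreshing the ResourceManager in place, no restart needed. The queue names and capacities are illustrative.

    #!/usr/bin/env bash
    # capacity-scheduler.xml (hypothetical values) would gain entries such as:
    #   yarn.scheduler.capacity.root.queues           = default,etl
    #   yarn.scheduler.capacity.root.etl.capacity     = 30
    #   yarn.scheduler.capacity.root.default.capacity = 70

    # Apply the edited file without restarting the ResourceManager:
    yarn rmadmin -refreshQueues

    # Verify the queues and their capacities:
    mapred queue -list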

Environment: Cloak Hive, MapR 4.0.2 to 6.2, Spark, MySQL, MCS, Pig, Oozie, HBase, Flume, Hive.
