
Hadoop Administrator Resume


Round Rock, TX

PROFESSIONAL SUMMARY:

  • 3+ years of professional IT experience, including Big Data ecosystem technologies.
  • Around 3 years of hands-on experience with Hadoop, HDFS, the MapReduce framework, and ecosystem tools such as Hive, HBase, Sqoop, and Oozie.
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on experience installing, configuring, and using Hadoop components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, and Flume.
  • Extensive experience writing MapReduce jobs, Hive queries, and Pig scripts, and working with HDFS.
  • In-depth understanding of data structures and algorithms.
  • Hands-on experience setting up 100+ node Hortonworks, MapR, and PHD clusters.
  • Experience in managing and reviewing Hadoop log files.
  • Installed and configured a 15-node Apache Solr cluster.
  • Experience working on Greenplum Database.
  • Knowledge of designing, implementing, and managing a secure authentication mechanism for Hadoop clusters with Kerberos.
  • Implemented the Capacity Scheduler in Hortonworks and Cloudera.
  • Installed and configured Tomcat, the HTTPD web server, SSL, LDAP, and SSO for Collibra (a data governance tool).
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Extensive experience in data lake implementation.
  • Excellent understanding and knowledge of NoSQL databases like MongoDB, HBase, and Cassandra.
  • Used ZooKeeper for various types of centralized configuration.
  • Involved in setting up standards and processes for Hadoop-based application design and implementation.
  • Good knowledge of using Spark for real-time streaming of data into the cluster.
  • Experience importing and exporting data with Sqoop between HDFS and relational database systems (a short sketch follows this list).
  • Experience in Object Oriented Analysis and Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
  • Experience in managing Hadoop clusters using Cloudera Manager tool.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Experience in administration, installation, configuration, troubleshooting, security, backup, performance monitoring, and fine-tuning of Red Hat Linux.
  • Hands-on experience as a Linux system administrator.
  • Performed Linux admin activities like patching, maintenance and software installations.
  • Experience managing groups of RHEL/CentOS hosts at a scale of 100+ nodes, including installation and configuration for Hadoop clusters.
  • Knowledge of Oracle, PostgreSQL, SQL Server, and MySQL databases.
  • Hands-on experience with VPN, PuTTY, WinSCP, VNC Viewer, etc.
  • Wrote scripts to deploy monitors and checks and to automate critical system administration functions.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Ability to adapt to evolving technology and a strong sense of responsibility.
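
A minimal shell sketch of the Sqoop import/export work referenced in this list; the JDBC URL, credentials, table names, and HDFS paths are hypothetical placeholders, not values from any actual engagement.

  #!/usr/bin/env bash
  # Hypothetical Sqoop round trip: pull an Oracle table into HDFS,
  # then push processed results back out. All identifiers are placeholders.

  # Import: -P prompts for the password; --num-mappers sets parallelism.
  sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username etl_user -P \
    --table SALES.ORDERS \
    --target-dir /data/raw/orders \
    --num-mappers 4

  # Export: write processed results from HDFS back into a relational table.
  sqoop export \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username etl_user -P \
    --table SALES.ORDER_SUMMARY \
    --export-dir /data/processed/order_summary \
    --num-mappers 4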

TECHNICAL SKILLS:

Big Data Ecosystem: HDFS, HBase, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Kafka, Cassandra, Spark

Hadoop Distributions: Cloudera, MapR, Hortonworks, AWS EMR, and PHD

Languages: C, C++, Java, SQL/PLSQL

Methodologies: Agile, Waterfall

Database: Oracle 10g, DB2, MySQL, MongoDB, CouchDB, MS SQL Server

IDE / Testing Tools: Eclipse, STS.

Operating System: Windows, UNIX, Linux

Scripts: JavaScript, Shell Scripting

PROFESSIONAL EXPERIENCE:

Hadoop Administrator

Confidential, Round Rock, TX

Responsibilities:

  • Engineer on the Big Data team, working with Hadoop and its ecosystem.
  • Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig, Hive, Oozie, and Sqoop.
  • Developed Hive queries to do analysis of the data and to generate the end reports to be used by business users.
  • Managed 100+ node HDP and Cloudera clusters on a daily basis.
  • Worked on shell scripts for automation.
  • Implemented YARN ACL checks, Ranger, and Kerberos for HDP and Cloudera Hadoop clusters.
  • Scheduled daily DistCp data transfer jobs from the production cluster to the analytics cluster for the analytics team's reports (see the sketch after this list).
  • Wrote Sqoop jobs to migrate data from Oracle and SQL Server to HDFS.
  • Investigated and implemented Hortonworks SmartSense recommendations.
  • Implemented bug fixes on Hive and Tez per Hortonworks recommendations.
  • Involved in LDAP configuration for Cloudera Manager, Ambari, and Hue.
  • Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
  • Managed and reviewed Hadoop log files.
  • Used the Oozie scheduler extensively, with a clear understanding of Oozie workflows, coordinators, and bundles.
  • Worked extensively with Sqoop to import metadata from Oracle.
  • Installed and configured a large-scale Kafka cluster in Cloudera.
  • Responsible for smooth, error-free configuration of the DWH ETL solution and its integration with Hadoop.
  • Designed a data warehouse using Hive.
  • Used the Control-M scheduling tool to schedule daily jobs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Configured Splunk to generate alerts on system/service failures.
  • Worked as a production code deployment engineer on a sprint basis.
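
A minimal shell sketch of the daily DistCp transfer described above; the NameNode addresses, ports, and HDFS paths are assumptions for illustration only.

  #!/usr/bin/env bash
  # Hypothetical daily copy job, the kind of thing scheduled via Control-M or cron.
  DAY=$(date -d "yesterday" +%Y-%m-%d)
  SRC="hdfs://prod-nn:8020/data/events/dt=${DAY}"
  DST="hdfs://analytics-nn:8020/data/events/dt=${DAY}"

  # -update skips files already present with the same size and checksum;
  # -m caps the number of copy mappers so the job does not saturate the network.
  hadoop distcp -update -m 20 "${SRC}" "${DST}"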

Environment: Hadoop, HDP-2.8, Hortonworks, Cloudera 5.15, MapReduce, HDFS, Hive, HBase, Kafka, Java 7 & 8, MongoDB, Pig, Informatica, Oracle, Informatica BDM, Linux, Eclipse, ZooKeeper, Apache Solr, R and RStudio, Control-M, Redis, Tableau, QlikView, DataStax, Spark, Splunk

Hadoop Administrator

Confidential

Responsibilities:

  • Responsible for cluster maintenance, monitoring, managing, commissioning and decommissioning DataNodes, troubleshooting, reviewing data backups, and managing and reviewing log files for Cloudera and Hortonworks.
  • Adding/installing new components and removing them through Cloudera and Hortonworks.
  • Monitoring workload, job performance, and capacity planning using Cloudera Manager.
  • Performing major and minor upgrades and patch updates.
  • Creating and managing cron jobs.
  • Installed Hadoop ecosystem components like Pig, Hive, HBase, and Sqoop in a cluster.
  • Experience in setting up tools like Nagios for monitoring the Hadoop cluster.
  • Handling the data movement between HDFS and different web sources using Flume and Sqoop.
  • Extracted data from SQL databases such as MSSQL and Oracle through Sqoop and placed it in HDFS for processing.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Building and maintaining scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Installed and configured high availability for Hue pointing at the Hadoop cluster in Cloudera Manager.
  • Deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment while supporting and managing Hadoop clusters.
  • Installed and configured MapReduce and HDFS, and developed multiple MapReduce jobs in Hive for data cleaning and pre-processing.
  • Used Kafka for building real-time data pipelines between clusters.
  • Ran log aggregation, website activity tracking, and commit logs for distributed systems using Apache Kafka.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Experience in Python and shell scripts.
  • Commissioned DataNodes as data grew and decommissioned DataNodes when hardware degraded (see the sketch after this list).
  • Worked with data delivery teams to set up new Hadoop users and Linux users, setting up Kerberos principals and testing HDFS and Hive.
  • Held regular discussions with other technical teams regarding upgrades, process changes, special processing, and feedback.
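
A minimal sketch of the DataNode decommissioning flow described above, assuming the dfs.hosts.exclude property in hdfs-site.xml points at /etc/hadoop/conf/dfs.exclude; the hostname is a placeholder.

  # Add the node to the exclude file read by the NameNode.
  echo "worker-17.example.com" >> /etc/hadoop/conf/dfs.exclude

  # Ask the NameNode to re-read its include/exclude lists; blocks on the
  # excluded node are re-replicated before it is marked Decommissioned.
  hdfs dfsadmin -refreshNodes

  # Watch progress until the node reports "Decommissioned".
  hdfs dfsadmin -report | grep -A 3 "worker-17"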

Environment: Hadoop, Cloudera, Hortonworks, MapReduce, HDFS, Hive, HBase, Java 6 & 7, MongoDB, Pig, Informatica, Oracle, Informatica BDM, Linux, Eclipse, ZooKeeper, Apache Solr, R and RStudio, Control-M, Redis, Tableau, Spark, Splunk
