Sr. Hadoop Administrator Resume

Minneapolis, MN

SUMMARY

  • Over 8 years of experience, including 4 years in Hadoop and related technologies.
  • Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, MapReduce, and the YARN programming paradigm.
  • Knowledge of installing, configuring, and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Oozie, Hive, HCatalog, Hue, Impala, Mahout, Sentry, ZooKeeper, Sqoop, Pig, Flume, and Whirr.
  • Experience in managing and reviewing Hadoop log files.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Extending Hive and Pig core functionality by writing custom UDFs.
  • Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
  • Experience in installing, configuring, supporting, and managing Cloudera Hadoop Distribution (CDH3, CDH4, and CDH5) clusters and Hortonworks HDP 1.3 and HDP 2.1 clusters.
  • Experience in configuring ZooKeeper to provide cluster coordination services.
  • Loading logs from multiple sources directly into HDFS using tools like Flume.
  • Good experience in performing minor and major upgrades.
  • Experience in benchmarking and in performing backup and recovery of NameNode metadata and data residing in the cluster.
  • Familiar with commissioning and decommissioning nodes on a Hadoop cluster.
  • Adept at configuring NameNode High Availability.
  • Worked on Disaster Management with Hadoop Cluster.
  • Worked with Puppet for application deployment.
  • Hands-on experience in application development using Java and RDBMS, with knowledge of Linux shell scripting.
  • Worked on Amazon Web Services (AWS).
  • Experience with distributed computation tools such as Apache Spark and Hadoop.
  • Capturing data from existing databases that provide SQL interfaces using Sqoop.
  • Experience in shell (bash, sh), Hive, Sqoop, and Pig Latin scripting.
  • Extensively worked on database applications using DB2, Oracle, SQL*Plus, PL/SQL, and SQL*Loader.
  • Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with problem-solving and leadership skills.

SKILLS

Hadoop Ecosystem: Spark, Hive, Pig, Sqoop, Flume, HBase, HCatalog, Hue, Impala, Mahout, Oozie, Whirr, Sentry, and ZooKeeper

Operating System: Linux (RHEL, Ubuntu, CentOS), Windows (XP/7/8), VMware

Languages: Java, C / C++, Shell scripting (bash)

Databases: MySQL, Cassandra, HBase

Tools: Rational Rose, Eclipse, NetBeans, BI & EDW

Web development: HTML, XML, CSS

Monitoring Tools: Cloudera Manager, Ganglia, Nagios

WORK EXPERIENCE:

Confidential, Minneapolis, MN

SR. HADOOP ADMINISTRATOR

Responsibilities:

  • Performed a major upgrade of the cluster from CDH3u6 to CDH4.2.0.
  • Implemented NameNode High Availability on the Hadoop cluster to eliminate the single point of failure.
  • Installed Cloudera Manager on an already existing Hadoop cluster.
  • Involved in efficiently collecting and aggregating large amounts of streaming log data into Hadoop Cluster using Apache Flume.
  • Analyzed user behavior and patterns in the data stored in HDFS using Hive.
  • Launched the R statistical tool for statistical computing and graphics.
  • Worked with data delivery teams to set up new Hadoop users, including setting up Linux users and testing HDFS and Hive.
  • Supported developers in deploying their jobs.
  • Installed Redis and configured high availability (HA) for it. The team previously used RabbitMQ for mirroring the data and then switched to Redis.
  • Worked with the IBM BigInsights distribution used by Geico.
  • Installed and configured GDE for encrypting the data.
  • Upgraded BigInsights cluster from 2.1 to 4.0.
  • Cluster maintenance as well as creation and removal of nodes.
  • Monitor Hadoop cluster connectivity and security.
  • Manage and review Hadoop log files.
  • File system management and monitoring.
  • Rewrote existing SQL queries as Hive queries using HiveQL.
  • Exported analyzed data, mined from large volumes of data, to MySQL using Sqoop.
  • Developed custom MapReduce programs and custom User Defined Functions (UDFs) in Hive to transform large volumes of data according to business requirements.
  • Involved in installing and configuring Kerberos to secure the Hadoop cluster and provide authentication for users.
  • Worked on installation of DataStax Cassandra cluster.
  • Worked with Big Data analysts, designers, and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig, Flume, etc.

Environment: Hadoop, Cloudera, Hive, Oozie, Sqoop, Flume, Cloudera Manager, Shell Script.

Confidential, Charlotte, NC

HADOOP ENGINEER

Responsibilities:

  • Upgraded the Hadoop cluster from CDH3 to CDH4 and set up a high-availability cluster.
  • Integrated Hive with existing applications.
  • Gathered business requirements from the business partners and subject matter experts.
  • Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Ganglia.
  • Involved in installing Hadoop ecosystem components and NoSQL (HBase).
  • Managed and monitored Hadoop and ecosystem tools.
  • Responsible for managing data coming from different sources.
  • Supported applications running on production clusters.
  • Involved in HDFS maintenance and administration.
  • Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Performed operating system installation, Hadoop version updates using automation tools.
  • Configured Oozie for workflow automation and coordination.
  • Implemented rack aware topology on the Hadoop cluster.
  • Imported and exported structured data between relational databases and HDFS and Hive using Sqoop.
  • Configured ZooKeeper to provide node coordination for clustering support.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume, and defined channel selectors to multiplex data into different sinks.
  • Worked on developing scripts for benchmarking with TeraSort/TeraGen.
  • Implemented Kerberos Security Authentication protocol for existing cluster.
  • Troubleshot production-level issues in the cluster and its functionality.
  • Backed up data on a regular basis to a remote cluster using distcp.
  • Wrote UDFs and MapReduce jobs using Java and Pig Latin.
  • Used Sqoop to load data from RDBMS into HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop jobs.
  • Automated jobs using the Oozie workflow engine to chain together shell scripts, Flume, MapReduce jobs, and Hive and Pig scripts.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created partitioned Hive tables and worked on them using HiveQL.

Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK1.6), Pig, Sqoop, Impala, HCatalog, XML, MySQL

Confidential, West Chester, PA

BIG DATA ANALYST

Responsibilities:

  • Involved in review of functional and non-functional requirements.
  • Designed and implemented MapReduce jobs for analyzing the data collected by the Flume server.
  • Actively worked with the Hadoop Administration team to debug various slow-running MR jobs and make the necessary optimizations.
  • Designed and implemented APIs to retrieve the data from Hadoop Platform to Web Application.
  • Created Hive tables and worked on them using HiveQL.
  • Used Apache Log4J for logging.
  • Fixed bugs in various modules of the application that were raised by the testing teams during the integration testing phase.
  • Facilitated knowledge transfer sessions.

Environment: Hadoop 0.23, Cloudera CDH3, Java 1.6, Eclipse, Log4J, Pig 0.12, Hive 0.13.1, Ubuntu 12.04.x, shell scripting

Confidential

SYSTEM ADMIN

Responsibilities:

  • Installed and maintained Red Hat and CentOS Linux servers.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method.
  • Responsible for performance tuning and troubleshooting Linux servers.
  • Scheduled data backups with crontab.
  • Added, removed, and updated user account information, reset passwords, etc.
  • Maintained the SQL server and granted database authentication to required users.
  • Applied Operating System updates, patches and configuration changes.
  • Used different methodologies to increase the performance and reliability of the IT infrastructure.
  • Responsible for System performance tuning and successfully engineered a virtual private network (VPN).
  • Installed, configured, and maintained SAN and NAS storage.

Confidential, PA

SYSTEM ANALYST/ADMIN

Responsibilities:

  • Worked on installation, configuration, and upgrading of Oracle server software and related products.
  • Responsible for installation, administration and maintenance of Linux servers.
  • Established and maintained sound backup and recovery policies and procedures.
  • Handled database design and implementation.
  • Implemented and maintained database security (created and maintained users and roles, assigned privileges).
  • Performed database tuning and performance monitoring.
  • Planned for growth and changes (capacity planning).
  • Worked as part of a team and provided 24x7 support when required.
  • Performed general technical troubleshooting on trouble tickets to bring them to resolution.
