Hadoop Administrator Resume

Irvine, CA

SUMMARY

  • Around 8 years of administration experience including 4 years on Big Data Technologies like Hadoop, Cassandra, Hive, Sqoop, Flume and Pig.
  • Involved in capacity planning for the production Hadoop cluster using different distributions of Apache Hadoop, Cloudera and Hortonworks.
  • Set up automated 24x7 monitoring and escalation infrastructure for Hadoop clusters using Nagios and Ganglia.
  • Analyzed clients' existing Hadoop infrastructure to identify performance bottlenecks and provide performance tuning accordingly.
  • Worked with Sqoop and Flume to import and export data from databases such as MySQL and Oracle into HDFS and Hive.
  • Defined job flows in the Hadoop environment using tools like Oozie for data scrubbing and processing.
  • In-depth knowledge of DataStax Cassandra and experience installing, configuring and monitoring clusters using DataStax OpsCenter.
  • Excellent knowledge of CQL (Cassandra Query Language) for querying the data stored in Cassandra.
  • Used DataStax OpsCenter for maintenance operations and keyspace and table management.
  • Experience in building scalable and fault-tolerant Cassandra production database systems.
  • Strong knowledge of Cassandra architecture, with a good understanding of the read and write processes including SSTable, memtable and commit log.
  • Experience in configuring Zookeeper to provide Cluster coordination services.
  • Loading logs from multiple sources directly into HDFS using tools like Flume.
  • Excellent working knowledge of CDH 4.0, including high availability, YARN, data streaming, security and application deployment.
  • Performed stress and performance testing and benchmarking for the cluster.
  • Familiar with commissioning and decommissioning nodes on a Hadoop cluster.
  • Worked on Disaster Management with Hadoop Cluster.
  • Worked with Puppet for application deployment.
  • Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realm/domain, managing.
  • Worked on setting up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, Cassandra, MapReduce, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Oozie, Flume

Security: Kerberos

UNIX tools: Apache, Yum, RPM

Databases: Oracle, MySQL, SQL Server

NoSQL Databases: Cassandra

Cluster management tools: OpsCenter, Cloudera Manager, Ambari, Ganglia, Nagios

Scripting language: Shell scripting, Puppet, Chef

ETL Tools: Informatica, SSIS

BI Reporting Tools: Cognos, OBIEE

Operating Systems: Windows, Linux (Red Hat, CentOS, Ubuntu), WS

PROFESSIONAL EXPERIENCE

Confidential, Irvine CA

Hadoop Administrator

Responsibilities:

  • Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
  • Installed and configured a fully distributed, multi-node Hadoop cluster with a large number of nodes.
  • Provided Hadoop, OS and hardware optimizations.
  • Set up the machines with network control, static IPs, disabled firewalls and swap memory.
  • Used TC for Network Bandwidth Control.
  • Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
  • Performed upgrade from
  • Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Performed operating system installation, Hadoop version updates using automation tools.
  • Configured Oozie for workflow automation and coordination.
  • Implemented rack-aware topology on the Hadoop cluster.
  • Imported and exported structured data from different relational databases into HDFS and Hive using Sqoop.
  • Configured ZooKeeper to provide node coordination for clustering support.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Worked on developing scripts for benchmarking with TeraSort/TeraGen.
  • Implemented the Kerberos security authentication protocol for the existing cluster.
  • Good experience in troubleshooting production-level issues in the cluster and its functionality.
  • Backed up data on a regular basis to a remote cluster using DistCp.
  • Regularly commissioned and decommissioned nodes depending on the amount of data.
  • Monitored and configured a test cluster on Confidential for further testing process.

Confidential, Columbus, OH

Hadoop/Cassandra Administrator

Responsibilities:

  • Responsible for architecting Hadoop clusters, translating functional and technical requirements into detailed architecture and design, and maintaining both Hadoop and Cassandra clusters.
  • Installed a multi-data-center cluster consisting of Cassandra rings.
  • Worked on creating the data model for Cassandra from the current Oracle data model.
  • Evaluated, benchmarked and tuned the data model by running endurance tests using JMeter, Cassandra Stress Tool and OpsCenter.
  • Administered and maintained the multi-data-center Cassandra cluster using OpsCenter.
  • Worked closely with developers to choose the right compaction strategies and consistency levels.
  • Implemented consistency levels for reads and writes based on the use case.
  • Installed and configured a fully distributed, multi-node Hadoop cluster with a large number of nodes.
  • Provided Hadoop, OS and hardware optimizations.
  • Used TC for Network Bandwidth Control.
  • Installed and configured Cloudera Manager for easy management of existing Hadoop cluster.
  • Upgraded the Hadoop cluster from CDH4 to CDH5.
  • Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Configured Oozie for workflow automation and coordination.
  • Implemented rack-aware topology on the Hadoop cluster.
  • Imported and exported structured data from different relational databases into HDFS and Hive using Sqoop.
  • Configured ZooKeeper to provide node coordination for clustering support.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Implemented the Kerberos security authentication protocol for the existing cluster.
  • Backed up data on a regular basis to a remote cluster using DistCp.
  • Regularly commissioned and decommissioned nodes depending on the amount of data.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status.

Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, Puppet, ZooKeeper, CDH4, Cassandra, Nagios, NoSQL and Unix/Linux.

Confidential, Santa Clara CA

Linux & Hadoop Administrator

Responsibilities:

  • Installed NameNode, Secondary NameNode, JobTracker, DataNode and TaskTracker.
  • Deployed a Hadoop cluster using CDH3 integrated with Nagios and Ganglia.
  • Extensively involved in cluster capacity planning, hardware planning, installation and performance tuning of the Hadoop cluster.
  • Performed installation and configuration of a Hadoop cluster with the Cloudera distribution (CDH3).
  • Implemented commissioning and decommissioning of data nodes, killing unresponsive task trackers and dealing with blacklisted task trackers.
  • Implemented Rack Awareness for data locality optimization.
  • Dumped the data from a MySQL database to HDFS and vice versa using Sqoop.
  • Created a local YUM repository for installing and updating packages.
  • Dumped the data from one cluster to another using DistCp, and automated the dumping procedure using shell scripts.
  • Implemented NameNode backup using NFS.
  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and SSH keyless login.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Implemented the Capacity Scheduler on the JobTracker to share the resources of the cluster among the MapReduce jobs submitted by the users.
  • Worked on importing and exporting data into HDFS and Hive using Sqoop.
  • Worked on analyzing data with Hive and Pig.
  • Helped in setting up rack topology in the cluster.
  • Worked on performing a minor upgrade from CDH3-u4 to CDH3-u6.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Deployed Network file system for Name Node Metadata backup.
  • Designed and allocated HDFS quotas for multiple groups.

Environment: Cloudera Manager, CDH3, MapReduce, HDFS, Sqoop, Flume, Linux, Oozie, Hadoop, Pig, Hive, HBase, Nagios, Ganglia

Confidential

Linux Administrator

Responsibilities:

  • Installing and updating packages using YUM.
  • Installing and maintaining the Linux servers.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Deep understanding of monitoring and troubleshooting mission critical Linux machines.
  • Improved system performance by working with the development team to analyze, identify and resolve issues quickly.
  • Ensured data recovery by implementing system and application level backups.
  • Performed various configurations, including networking and iptables, hostname resolution and SSH keyless login.
  • Managed disk file systems, server performance, user creation, file access permissions and RAID configurations.
  • Automated administration tasks through scripting and job scheduling using cron.
  • Monitoring System Metrics and logs for any problems.
  • Running crontab jobs to back up data.
  • Adding, removing, or updating user account information, resetting passwords, etc.
  • Using Java JDBC to load data into MySQL.
  • Maintaining the MySQL server and authentication of required users for databases.
  • Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
