Hadoop Administrator Resume
Irvine, CA
SUMMARY
- Around 8 years of administration experience, including 4 years with Big Data technologies such as Hadoop, Cassandra, Hive, Sqoop, Flume, and Pig.
- Involved in capacity planning for the Hadoop cluster in production, using different Hadoop distributions: Apache Hadoop, Cloudera, and Hortonworks.
- Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Ganglia.
- Analyzing clients' existing Hadoop infrastructure to identify performance bottlenecks and provide performance tuning accordingly.
- Worked with Sqoop and Flume to import and export data between HDFS/Hive and databases such as MySQL and Oracle (a sample Sqoop invocation follows this summary).
- Defining job flows in Hadoop environment using tools like Oozie for data scrubbing and processing.
- In-depth knowledge of DataStax Cassandra and experience with installing, configuring, and monitoring clusters using DataStax OpsCenter.
- Excellent knowledge of CQL (Cassandra Query Language) for querying data stored in Cassandra (a cqlsh example follows this summary).
- Used DataStax OpsCenter for maintenance operations and for keyspace and table management.
- Experience in building scalable and fault-tolerant Cassandra production database systems.
- Superior knowledge of Cassandra architecture, with a solid understanding of the read and write paths, including the SSTable, Memtable, and commit log.
- Experience in configuring Zookeeper to provide Cluster coordination services.
- Loading logs from multiple sources directly into HDFS using tools like Flume.
- Excellent working knowledge of CDH4.0, including High Availability, YARN, data streaming, security, and application deployment.
- Performed stress and performance testing and benchmarking for the cluster.
- Familiar with commissioning and decommissioning of nodes on a Hadoop cluster.
- Worked on disaster recovery for Hadoop clusters.
- Worked with Puppet for application deployment.
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Experience in understanding security requirements for Hadoop and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, and ongoing management.
- Worked on setting up NameNode high availability for a major production cluster and designed automatic failover using ZooKeeper and quorum journal nodes.
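A minimal sketch of the Sqoop imports described above; the hostname, database, credentials, and table names are placeholders:

```
# Hypothetical MySQL source; import a table into HDFS with 4 parallel mappers
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /user/etl/orders \
  --num-mappers 4

# Same source, imported straight into a Hive table instead
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/sales \
  --username etl_user -P \
  --table orders \
  --hive-import --hive-table sales.orders
```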
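And a small cqlsh example of the kind of CQL querying mentioned above; the node, keyspace, and table names are illustrative:

```
# Run an ad-hoc CQL query against a node
cqlsh cass-node1 9042 -e "SELECT user_id, last_login FROM app_ks.users WHERE user_id = 42;"

# Inspect the schema of a keyspace
cqlsh cass-node1 -e "DESCRIBE KEYSPACE app_ks;"
```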
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, Cassandra, MapReduce, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Oozie, Flume
Security: Kerberos
UNIX tools: Apache, Yum, RPM
Databases: Oracle, MySQL, SQL Server
NoSQL Databases: Cassandra
Cluster management tools: OpsCenter, Cloudera Manager, Ambari, Ganglia, Nagios
Scripting languages: Shell scripting, Puppet, Chef
ETL Tools: Informatica, SSIS
BI Reporting Tools: Cognos, OBIEE
Operating Systems: Windows, Linux (RedHat, CentOS, Ubuntu)
PROFESSIONAL EXPERIENCE
Confidential, Irvine CA
Hadoop Administrator
Responsibilities:
- Responsible for architecting Hadoop clusters; translated functional and technical requirements into detailed architecture and design.
- Installed and configured a multi-node, fully distributed Hadoop cluster spanning a large number of nodes.
- Provided Hadoop, OS, and hardware optimizations.
- Set up the machines with network control, static IPs, disabled firewalls, and swap memory configuration.
- Used tc (Linux traffic control) for network bandwidth control.
- Installed and configured Cloudera Manager for easy management of the existing Hadoop cluster.
- Performed Hadoop cluster version upgrades.
- Worked on setting up high availability for a major production cluster and designed automatic failover using ZooKeeper and quorum journal nodes.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Performed operating system installations and Hadoop version updates using automation tools.
- Configured Oozie for workflow automation and coordination.
- Implemented rack-aware topology on the Hadoop cluster.
- Imported and exported structured data between relational databases and HDFS/Hive using Sqoop.
- Configured ZooKeeper to provide node coordination and clustering support.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
- Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
- Developed scripts for benchmarking the cluster with TeraGen/TeraSort (sample commands follow this list).
- Implemented the Kerberos security authentication protocol for the existing cluster.
- Troubleshot production-level issues in the cluster and its functionality.
- Backed up data on a regular basis to a remote cluster using DistCp (sample command follows this list).
- Regularly commissioned and decommissioned nodes depending on the amount of data.
- Monitored and configured a test cluster on Confidential for further testing.
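A sketch of the TeraGen/TeraSort benchmarking referenced above; the examples-jar path varies by distribution, and the data size is illustrative:

```
# Generate ~100 GB of synthetic rows (10^9 rows x 100 bytes each)
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
  teragen 1000000000 /benchmarks/teragen

# Sort it, then validate that the output is globally ordered
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
  terasort /benchmarks/teragen /benchmarks/terasort
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
  teravalidate /benchmarks/terasort /benchmarks/teravalidate
```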
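And the DistCp-style remote backup mentioned above, sketched with placeholder NameNode hostnames:

```
# Mirror a production path to the backup cluster; -update copies only changed files
hadoop distcp -update \
  hdfs://prod-nn.example.com:8020/data/warehouse \
  hdfs://backup-nn.example.com:8020/data/warehouse
```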
Confidential, Columbus, OH
Hadoop/Cassandra Administrator
Responsibilities:
- Responsible for architecting Hadoop clusters; translated functional and technical requirements into detailed architecture and design, and maintained both Hadoop and Cassandra clusters.
- Installed a multi-datacenter cluster consisting of Cassandra rings.
- Created the data model for Cassandra from the existing Oracle data model.
- Evaluated, benchmarked, and tuned the data model by running endurance tests using JMeter, the Cassandra Stress Tool, and OpsCenter (sample stress runs follow this section).
- Administered and maintained a multi-datacenter Cassandra cluster using OpsCenter.
- Worked closely with developers on choosing the right compaction strategies and consistency levels.
- Implemented consistency levels for reads and writes based on the use case.
- Installed and configured a multi-node, fully distributed Hadoop cluster spanning a large number of nodes.
- Provided Hadoop, OS, and hardware optimizations.
- Used tc (Linux traffic control) for network bandwidth control.
- Installed and configured Cloudera Manager for easy management of the existing Hadoop cluster.
- Upgraded the Hadoop cluster from CDH4 to CDH5.
- Worked on setting up high availability for a major production cluster and designed automatic failover using ZooKeeper and quorum journal nodes.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Configured Oozie for workflow automation and coordination.
- Implemented rack-aware topology on the Hadoop cluster.
- Imported and exported structured data between relational databases and HDFS/Hive using Sqoop.
- Configured ZooKeeper to provide node coordination and clustering support.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
- Implemented the Kerberos security authentication protocol for the existing cluster.
- Backed up data on a regular basis to a remote cluster using DistCp.
- Regularly commissioned and decommissioned nodes depending on the amount of data.
- Wrote custom monitoring scripts for Nagios to monitor the daemons and cluster status (a minimal plugin sketch follows the environment line below).
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, Puppet, ZooKeeper, CDH4, Cassandra, Nagios, NoSQL and Unix/Linux.
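A sketch of the stress-tool endurance runs mentioned above, using cassandra-stress syntax from later releases; the node name, operation counts, and thread counts are illustrative:

```
# Write-heavy run: one million inserts across 50 client threads
cassandra-stress write n=1000000 -rate threads=50 -node cass-node1

# Mixed workload at a 3:1 read/write ratio against the same schema
cassandra-stress mixed ratio\(write=1,read=3\) n=1000000 -node cass-node1
```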
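And a minimal sketch of a custom Nagios plugin like those mentioned above, assuming the standard Nagios exit-code convention (0 OK, 2 CRITICAL); the script name and process pattern are illustrative:

```
#!/bin/bash
# check_datanode.sh (hypothetical name): alert when the DataNode JVM is down
if pgrep -f 'org.apache.hadoop.hdfs.server.datanode.DataNode' >/dev/null; then
    echo "OK - DataNode process running"
    exit 0
else
    echo "CRITICAL - DataNode process not found"
    exit 2
fi
```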
Confidential, Santa Clara CA
Linux & Hadoop Administrator
Responsibilities:
- Installed the NameNode, Secondary NameNode, JobTracker, DataNodes, and TaskTrackers.
- Deployed a Hadoop cluster using CDH3, integrated with Nagios and Ganglia.
- Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
- Performed installation and configuration of the Hadoop cluster with the Cloudera distribution, CDH3.
- Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers, and dealt with blacklisted TaskTrackers.
- Implemented Rack Awareness for data locality optimization.
- Dumped data from a MySQL database to HDFS and vice versa using Sqoop.
- Created a local YUM repository for installing and updating packages (a setup sketch follows this section).
- Dumped data from one cluster to another using DistCp, and automated the dumping procedure with shell scripts.
- Implemented NameNode backup using NFS.
- Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and SSH key-based (passwordless) login.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems on the created partitions.
- Implemented the Capacity Scheduler on the JobTracker to share cluster resources among the users' MapReduce jobs.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Worked on analyzing data with Hive and Pig.
- Helped in setting up rack topology in the cluster.
- Performed a minor upgrade from CDH3u4 to CDH3u6.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented Kerberos for authenticating all the services in the Hadoop cluster.
- Deployed a Network File System (NFS) mount for NameNode metadata backup.
- Designed and allocated HDFS quotas for multiple groups (sample quota commands follow this section).
Environment: Cloudera Manager, CDH3, MapReduce, HDFS, Sqoop, Flume, Linux, Oozie, Hadoop, Pig, Hive, HBase, Nagios, Ganglia
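A sketch of the local YUM repository setup mentioned above; hostnames and paths are placeholders:

```
# On the repo host: generate repo metadata over a directory of RPMs and serve it
yum install -y createrepo httpd
createrepo /var/www/html/repos/cdh3
service httpd start

# On each client: point YUM at the local mirror
cat > /etc/yum.repos.d/local-cdh.repo <<'EOF'
[local-cdh]
name=Local CDH mirror
baseurl=http://repohost.example.com/repos/cdh3
enabled=1
gpgcheck=0
EOF
yum clean all && yum makecache
```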
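And the HDFS quota allocation mentioned above, sketched with placeholder paths and limits using CDH3-era `hadoop dfsadmin` syntax:

```
# Name quota caps the number of files/directories; space quota caps raw bytes
# (the space quota counts all replicas)
hadoop dfsadmin -setQuota 100000 /user/analytics
hadoop dfsadmin -setSpaceQuota 10t /user/analytics

# Check current usage against both quotas
hadoop fs -count -q /user/analytics
```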
Confidential
Linux Administrator
Responsibilities:
- Installing and updating packages using YUM.
- Installing and maintaining the Linux servers.
- Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems on the created partitions (see the LVM sketch at the end of this section).
- Deep understanding of monitoring and troubleshooting mission-critical Linux machines.
- Improved system performance by working with the development team to analyze, identify, and resolve issues quickly.
- Ensured data recovery by implementing system and application level backups.
- Performed various configurations, including networking and iptables, hostname resolution, and SSH key-based (passwordless) login.
- Managed disk file systems, server performance, user creation, granting of file access permissions, and RAID configurations.
- Automated administration tasks through scripting and job scheduling using cron.
- Monitored system metrics and logs for any problems.
- Ran crontab jobs to back up data (a sample cron entry follows this list).
- Adding, removing, or updating user account information, resetting passwords, etc.
- Used Java JDBC to load data into MySQL.
- Maintained the MySQL server and granted database access to required users.
- Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
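A sketch of the LVM work described above; the device, volume names, and sizes are placeholders:

```
# Initialize a disk for LVM, create a volume group and a 50G logical volume
pvcreate /dev/sdb1
vgcreate datavg /dev/sdb1
lvcreate -L 50G -n datalv datavg

# Make a filesystem on it and mount it
mkfs.ext4 /dev/datavg/datalv
mkdir -p /data
mount /dev/datavg/datalv /data
```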
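And a sample cron entry for the scheduled backups mentioned above; paths and timing are illustrative (note that % must be escaped in crontab):

```
# /etc/cron.d/nightly-backup (hypothetical file): 02:30 daily tarball of /etc and /home
30 2 * * * root tar czf /backup/$(hostname)-$(date +\%F).tar.gz /etc /home
```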