Hadoop Administrator Resume
SUMMARY
- 7+ years of professional experience in analysis, design, development, implementation, integration and testing of client-server applications using Object-Oriented Analysis and Design (OOAD), with 5+ years of experience in deploying, maintaining, monitoring and upgrading Hadoop clusters (Apache Hadoop, Cloudera).
- Experience in managing, troubleshooting and securing networks.
- Experience working with, extending, and enhancing monitoring systems like Nagios.
- Storage experience with JBOD, NFS, SAN and RAID
- Hands-on experience using Hadoop ecosystem components like Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop and Pig.
- Worked in multi-cluster environments and set up the Cloudera Hadoop ecosystem.
- In-depth understanding of data structures and algorithms.
- Experience in setting up monitoring tools like Nagios and Ganglia for Hadoop.
- Experience in importing and exporting data from different databases like SQL Server and Oracle into HDFS and Hive using Sqoop (see the sketch after this list).
- Experience in configuring ZooKeeper to provide cluster coordination services.
- Experience in providing security for Hadoop clusters with Kerberos.
- Experience in setting up high-availability Hadoop clusters.
- Experience in writing UDFs for Hive and Pig.
- Experience in upgrading existing Hadoop clusters to the latest releases.
- Working experience with Amazon Web Services (AWS) offerings like S3.
- Experience in developing Shell Scripts for system management.
- Setting up and maintaining NoSQL databases like Cassandra and MongoDB.
- Experience in deploying Cassandra clusters (Apache Cassandra, DataStax).
- Monitoring Cassandra clusters using OpsCenter.
- Ability to diagnose network problems
- Understanding of TCP/IP networking and its security considerations
- Excellent at communicating with clients, customers, managers and other teams in the enterprise at all levels.
- Effective problem-solving skills and outstanding interpersonal skills; able to work independently as well as within a team environment, and driven to meet deadlines.
- Motivated to produce robust, high-performance software.
- Experience in Software Development Lifecycle (SDLC), application design, functional and technical specs, and use case development using UML.
- Client interaction for requirements gathering, analysis and testing.
- Excellent programming skills in writing and maintaining Oracle PL/SQL procedures, triggers, cursors, views and complex SQL queries.
- Ability to learn and use new technologies quickly.
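A minimal sketch of the kind of Sqoop import referenced above, assuming a hypothetical Oracle host, credentials and table; all names are placeholders:

    # Import an Oracle table into HDFS and expose it as a Hive table
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
      --username etl_user -P \
      --table SALES.ORDERS \
      --target-dir /user/etl/orders \
      --hive-import --hive-table orders \
      --num-mappers 4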
TECHNICAL SKILLS
Big Data Technologies: MapReduce, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, HBase
Big Data Frameworks: HDFS, YARN, Spark
Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks
Programming Languages: Core Java, Shell Scripting, PowerShell
Operating Systems: Windows 98/2000/XP/Vista/NT/8.1, Red Hat Linux/CentOS 4, 5, Unix
Database: Oracle 10g/11g, T-SQL, MongoDB, PL/SQL
ETL Stack: Informatica 8x/9x, SSIS, SSRS, SSAS, IBM BigInsights
Business Modeling Tools: UML, MS Office, Remedy, ServiceNow, MS Visio
PROFESSIONAL EXPERIENCE
Confidential, OK
Hadoop Administrator
Responsibilities:
- Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Installed and configured a fully distributed, multi-node Hadoop cluster with a large number of nodes.
- Provided Hadoop, OS and hardware optimizations.
- Set up machines with network controls, static IPs, disabled firewalls and swap memory.
- Installed and configured Hadoop ecosystem components like MapReduce, Hive, Pig, Sqoop, HBase, ZooKeeper and Oozie.
- Involved in testing HDFS, Hive, Pig and MapReduce access for the new users.
- Cluster maintenance as well as creation and removal of nodes using Cloudera Manager Enterprise and Hortonworks tooling.
- Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes (setup commands are sketched after this list).
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Performed operating system installation, Hadoop version updates using automation tools.
- Configured Oozie for workflow automation and coordination.
- Implemented rack-aware topology on the Hadoop cluster.
- Imported and exported structured data from different relational databases into HDFS and Hive using Sqoop.
- Configured ZooKeeper to implement node coordination in clustering support.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
- Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
- Developed scripts for benchmarking with TeraSort/TeraGen (see the sketch after this list).
- Implemented Kerberos Security Authentication protocol for existing cluster.
- Good experience troubleshooting production-level issues in the cluster and its functionality.
- Involved with statistical computing tools like IBM InfoSphere BigInsights for data visualization, sampling, analysis, model selection, estimator bias, variance calculation and significance testing.
- Backed up data on regular basis to a remote cluster using DistCp.
- Regularly commissioned and decommissioned nodes depending on the amount of data.
- Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
- Installed and maintained a Puppet-based configuration management system.
- Deployed Puppet, Puppet Dashboard, and PuppetDB for configuration management to existing infrastructure.
- Used Puppet configuration management to manage the cluster.
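A hedged sketch of the commands used to bring up automatic NameNode failover with ZooKeeper and quorum journal nodes, assuming the HA properties (nameservice, JournalNode quorum, ZKFC settings) are already defined in hdfs-site.xml and core-site.xml:

    # On each JournalNode host: start the JournalNode daemon
    hadoop-daemon.sh start journalnode
    # On the primary NameNode: format (new cluster only) and start
    hdfs namenode -format
    hadoop-daemon.sh start namenode
    # On the standby NameNode: pull over the metadata, then start
    hdfs namenode -bootstrapStandby
    hadoop-daemon.sh start namenode
    # Initialize failover state in ZooKeeper, then start a ZKFC on each NameNode
    hdfs zkfc -formatZK
    hadoop-daemon.sh start zkfc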
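A minimal sketch of the TeraGen/TeraSort benchmarking script mentioned above; the examples jar path and row count are assumptions that vary by distribution:

    #!/bin/bash
    # Generate ~100 GB of input (10^9 rows x 100 bytes), sort it, then validate
    JAR=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
    hadoop jar "$JAR" teragen 1000000000 /benchmarks/teragen
    hadoop jar "$JAR" terasort /benchmarks/teragen /benchmarks/terasort
    hadoop jar "$JAR" teravalidate /benchmarks/terasort /benchmarks/teravalidate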
Environment: Hadoop HDFS, Hortonworks, MapReduce, Hive, Pig, Oozie, Sqoop, Cloudera Manager
Confidential, FL
Hadoop Administrator
Responsibilities:
- Hands-on experience with Apache and Hortonworks Hadoop ecosystem components such as Sqoop, HBase and MapReduce.
- Installation, configuration and troubleshooting of an Apache Hadoop cluster (48+ nodes) for application development.
- Installation, monitoring, managing, troubleshooting, applying patches in different environments such as Development Cluster, Test Cluster and Production environments.
- Excellent understanding of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
- Experience in managing Hadoop infrastructure, including commissioning, decommissioning and rack topology implementation.
- Experience in managing the cluster resources by implementing fair scheduler and capacity scheduler.
- Experience in implementing NameNode high availability and Hadoop cluster capacity planning.
- Developed automated Unix shell scripts for running the balancer, file system health checks and user/group creation on HDFS (see the sketch after this list).
- Experience in managing and reviewing Hadoop log files.
- Experience in upgrading Hadoop clusters through both minor and major version upgrades.
- Experience in designing and building disaster recovery planning across the data centers to provide business continuity.
- Experience in the complete software development life cycle (SDLC), from requirements gathering through testing, implementation and post-implementation support.
- Expertise in the design and implementation of Slowly Changing Dimensions (SCD) Type 1, Type 2 and Type 3.
- Demonstrated understanding of the concepts, best practices and functions needed to implement a Big Data solution in a corporate environment.
- Helped design scalable Big Data clusters and solutions.
- Implemented the Fair Scheduler to share cluster resources among users' MapReduce jobs.
- Experience writing automation scripts and setting up cron jobs to keep the cluster stable and healthy (a sample cron entry appears in the sketch after this list).
- Monitored and controlled local file system disk space usage and local log files, cleaning log files with automated scripts.
- As a Hadoop admin, monitored cluster health on a daily basis, tuned performance-related configuration parameters and backed up configuration XML files.
- Worked with Hadoop developers and designers to troubleshoot MapReduce job failures and other issues.
- Worked with network and Linux system engineers to define optimal network configurations, server hardware and operating systems.
- Worked with expert consulting resources on Hadoop integration points with ETL, BI and EDW teams.
- Evaluated and proposed new tools and technologies to meet the needs of the organization.
- Provided analytic support to marketing, finance, product development, fraud, customer service, engineering and operations using IBM InfoSphere BigInsights.
- Production support responsibilities included cluster maintenance and 24/7 on-call support on a weekly rotation.
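A hedged sketch of the maintenance script and cron entry referenced above; thresholds, paths and the log location are assumptions (CDH3-era `hadoop` subcommands shown):

    #!/bin/bash
    # nightly_hdfs_maint.sh - balancer run, health check and report, all logged
    LOG=/var/log/hadoop/nightly_maint.log
    {
      date
      hadoop balancer -threshold 10        # rebalance blocks across DataNodes
      hadoop fsck / | tail -20             # filesystem health summary
      hadoop dfsadmin -report | head -20   # capacity and live-node summary
      # per-user HDFS home creation, e.g.:
      # hadoop fs -mkdir /user/jdoe && hadoop fs -chown jdoe:analysts /user/jdoe
    } >> "$LOG" 2>&1

    # crontab entry: run every night at 01:30
    # 30 1 * * * /opt/scripts/nightly_hdfs_maint.sh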
Environment: RHEL, CentOS, Ubuntu, CDH3, Apache Hadoop, HDFS, MapReduce, HBase, Shell Scripts
Confidential, IA
Hadoop Administrator
Responsibilities:
- Provisioned and administered, from the ground up, Apache Hadoop and Cloudera clusters for BI and other product/systems development.
- Worked on administration activities, providing installation, upgrades, patching and configuration for all hardware and software Hadoop components.
- Work closely with other administrators (database, storage, Windows) and developers to keep the Hadoop environment running efficiently and effectively.
- Configured and implemented Apache Hadoop technologies, i.e., the Hadoop Distributed File System (HDFS), the MapReduce framework, Pig, Hive, HCatalog, Sqoop and Flume.
- Helped in setting up Rack topology in the cluster.
- Helped with day-to-day operational support.
- Performed a minor upgrade from CDH3u4 to CDH3u6.
- Upgraded the Hadoop cluster from CDH3 to CDH4.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Implemented Kerberos for authenticating all the services in Hadoop Cluster.
- Deployed a Network File System (NFS) mount for NameNode metadata backup.
- Moved data from a MySQL database to HDFS and vice versa using Sqoop.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload, job performance and capacity planning.
- Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
- Created a local YUM repository for installing and updating packages (see the sketch after this list).
- Copied data from one cluster to another using DistCp and automated the procedure with shell scripts (also sketched after this list).
- Configured and deployed the Hive metastore using MySQL and the Thrift server.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
- Worked with various tools and technologies to improve cluster performance, ETL & BI environments.
- Designed and allocated HDFS quotas for multiple groups (example commands appear after this list).
- Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and passwordless SSH login.
- Designed and implemented solutions with NoSQL databases such as MongoDB and Cassandra.
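A minimal sketch of setting up the local YUM repository mentioned above; paths, the web-server layout and the repo name are assumptions:

    # Build repo metadata over a directory of downloaded RPMs served by httpd
    yum install -y createrepo httpd
    mkdir -p /var/www/html/cdh-local
    cp /tmp/rpms/*.rpm /var/www/html/cdh-local/
    createrepo /var/www/html/cdh-local

    # Client-side repo definition, pushed to every cluster node
    cat > /etc/yum.repos.d/cdh-local.repo <<'EOF'
    [cdh-local]
    name=Local CDH mirror
    baseurl=http://repohost.example.com/cdh-local
    enabled=1
    gpgcheck=0
    EOF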
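A hedged sketch of the DistCp copy between clusters and its shell automation; NameNode URIs, paths and the alert address are placeholders:

    #!/bin/bash
    # distcp_backup.sh - copy a dated snapshot of /data to the remote cluster
    SRC=hdfs://prod-nn.example.com:8020/data
    DST=hdfs://dr-nn.example.com:8020/backup/$(date +%F)
    if ! hadoop distcp -update "$SRC" "$DST"; then
      echo "DistCp to $DST failed on $(hostname)" \
        | mail -s "DistCp backup FAILED" hadoop-ops@example.com
    fi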
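Example quota commands for the group directories mentioned above; directory names and limits are hypothetical, and the `10t` suffix form depends on the installed Hadoop version (raw byte counts also work):

    # Cap /user/marketing at 1M names and 10 TB of raw space (replicas included)
    hadoop dfsadmin -setQuota 1000000 /user/marketing
    hadoop dfsadmin -setSpaceQuota 10t /user/marketing
    hadoop fs -count -q /user/marketing    # verify quota and current usage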
Environment: Linux, HDFS, Pig, Hive, Sqoop, MapReduce, KDC, Nagios, Ganglia, Oozie, Apache Hadoop, Cloudera
Confidential
Hadoop Administrator
Responsibilities:
- Worked on administration, monitoring and fine-tuning of an existing Cloudera Hadoop cluster used by internal and external users as a Data and Analytics as a Service platform.
- Worked on Cloudera cluster backup and recovery, performance monitoring, load balancing, rebalancing and tuning (a balancer sketch follows this list), capacity planning and disk space management.
- Assisted in the design, development and architecture of Hadoop and HBase systems.
- Coordinated with technical teams to install Hadoop and related third-party applications on systems.
- Formulated procedures for planning and executing system upgrades for all existing Hadoop clusters.
- Supported technical team members for automation, installation and configuration tasks.
- Suggested improvement processes for all process automation scripts and tasks.
- Provided technical assistance for the configuration, administration and monitoring of Hadoop clusters.
- Conducted detailed analysis of system and application architecture components as per functional requirements.
- Participated in evaluation and selection of new technologies to support system efficiency.
- Assisted in creation of ETL processes for transformation of data sources from existing RDBMS systems.
- Designed and developed scalable, custom Hadoop solutions as per dynamic data needs.
- Coordinated with technical team for production deployment of software applications for maintenance.
- Provided operational support services relating to Hadoop infrastructure and application installation.
- Supported technical team members in the management and review of Hadoop log files and data backups.
- Participated in development and execution of system and disaster recovery processes.
- Formulated procedures for installing Hadoop patches, updates and version upgrades.
- Automated processes for the troubleshooting, resolution and tuning of Hadoop clusters.
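A small sketch of the rebalancing step referenced above; the bandwidth cap and threshold are assumptions:

    # Cap balancer network usage per DataNode (bytes/sec), then rebalance
    hdfs dfsadmin -setBalancerBandwidth 104857600   # ~100 MB/s
    hdfs balancer -threshold 5   # move blocks until each node is within 5% of average use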
Environment: Cloudera 4.x, Java, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, HBase, ETL
Confidential
UNIX Administrator
Responsibilities:
- Installing and upgrading OE and Red Hat Linux and Solaris 8 SPARC on servers like HP DL380 G3, G4 and G5 and Dell PowerEdge servers.
- Experience with LDOMs; created sparse-root and whole-root zones, administered the zones for web, application and database servers, and worked with SMF on Solaris 10 (SunOS 5.10).
- Experience working in AWS Cloud Environment like EC2 & EBS.
- Implemented and administered VMware ESX 3.0 for running Windows, CentOS, SUSE and Red Hat Linux servers on development and test servers.
- Installed and configured Apache on Linux and Solaris, configured virtual hosts and applied SSL certificates.
- Implemented JumpStart for Solaris and Kickstart for Red Hat environments.
- Experience working with HP LVM and Red Hat LVM.
- Experience in implementing P2P and P2V migrations.
- Involved in installing and configuring CentOS and SUSE 11 and 12 servers on HP x86 servers.
- Implemented HA using Red Hat Cluster and VERITAS Cluster Server 4.0 for the WebLogic agent.
- Managing DNS and NIS servers and troubleshooting those servers.
- Troubleshooting application issues on Apache web servers as well as database servers running on Linux and Solaris.
- Experience migrating Oracle and MySQL data using Double-Take products.
- Used Sun Volume Manager on Solaris and LVM on Linux and Solaris to create volumes with layouts like RAID 1, 5, 10 and 15 (see the LVM sketch after this list).
- Recompiling the Linux kernel to remove services and applications that are not required.
- Performed performance analysis using tools like prstat, mpstat, iostat, sar, vmstat, truss and DTrace (a quick triage sketch follows this list).
- Experience working on LDAP user accounts and configuring LDAP on client machines.
- Worked on patch management tools like Sun Update Manager.
- Experience supporting middleware servers running Apache, Tomcat and Java applications.
- Worked on day-to-day administration tasks and resolved tickets using Remedy.
- Used HP Service Center and the change management system for ticketing.
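A minimal sketch of the Linux LVM work described above, creating a mirrored (RAID 1-style) logical volume; disk names, sizes and the mount point are placeholders:

    # Create physical volumes, a volume group, and a mirrored logical volume
    pvcreate /dev/sdb /dev/sdc
    vgcreate datavg /dev/sdb /dev/sdc
    lvcreate -m 1 --mirrorlog core -L 100G -n datalv datavg   # RAID 1-style mirror
    mkfs.ext3 /dev/datavg/datalv
    mkdir -p /data && mount /dev/datavg/datalv /data
    echo '/dev/datavg/datalv /data ext3 defaults 0 2' >> /etc/fstab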
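A hedged sketch of a quick performance-triage pass using the Linux-side tools listed above; intervals, counts and the output path are arbitrary:

    #!/bin/bash
    # perf_snapshot.sh - capture a short system-load snapshot for triage
    OUT=/var/tmp/perf_$(date +%Y%m%d_%H%M).log
    {
      uptime
      vmstat 2 5       # run queue, memory, CPU over 10 seconds
      iostat -x 2 5    # per-device utilization and await times
      sar -n DEV 2 5   # NIC throughput
      ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -15
    } > "$OUT" 2>&1
    echo "snapshot written to $OUT"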
Environment: Red Hat Linux/CentOS 4, 5, Logical Volume Manager, Hadoop, VMware ESX 3.0, Apache and Tomcat web servers, Oracle 9i, Oracle RAC, HPSM, HPSA.
