
Hadoop Administrator Resume



SUMMARY

  • 7+ years of professional experience in analysis, design, development, implementation, integration and testing of client-server applications using Object Oriented Analysis and Design (OOAD), with 5+ years of experience in deploying, maintaining, monitoring and upgrading Hadoop clusters (Apache Hadoop, Cloudera).
  • Experience with managing, troubleshooting and securing networks.
  • Experience working with, extending, and enhancing monitoring systems like Nagios.
  • Storage experience with JBOD, NFS, SAN and RAID
  • Hands-on experience using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop and Pig.
  • Worked on multi-clustered environments and set up the Cloudera Hadoop ecosystem.
  • In-depth understanding of data structures and algorithms.
  • Experience in setting up monitoring tools like Nagios and Ganglia for Hadoop.
  • Experience in importing and exporting data between HDFS/Hive and relational databases such as SQL Server and Oracle using Sqoop (see the sketch following this summary).
  • Experience in configuring ZooKeeper to provide cluster coordination services.
  • Experience in securing Hadoop clusters with Kerberos.
  • Experience in setting up high-availability Hadoop clusters.
  • Experience in writing UDFs for Hive and Pig.
  • Experience in upgrading existing Hadoop clusters to the latest releases.
  • Working experience with Amazon Web Services (AWS), such as S3.
  • Experience in developing Shell Scripts for system management.
  • Set up and maintained NoSQL databases such as Cassandra and MongoDB.
  • Experience in deploying Cassandra clusters (Apache Cassandra, DataStax).
  • Monitored Cassandra clusters using OpsCenter.
  • Ability to diagnose network problems.
  • Understanding of TCP/IP networking and its security considerations.
  • Excellent at communicating with clients, customers, managers and other teams in the enterprise at all levels.
  • Effective problem-solving skills and outstanding interpersonal skills; able to work independently as well as within a team environment, and driven to meet deadlines.
  • Motivated to produce robust, high-performance software.
  • Experience in Software Development Lifecycle (SDLC), application design, functional and technical specs, and use case development using UML.
  • Client interaction for requirement gathering/analysis and Testing.
  • Excellent programming skills in writing and maintaining Oracle PL/SQL procedures, triggers, cursors, views and complex SQL queries.
  • Ability to learn and use new technologies quickly.
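
As a representative sketch of the Sqoop imports described above, the following moves a relational table into Hive. It is a minimal sketch only: the JDBC URL, credentials and table names are illustrative placeholders, not values from any engagement.

    #!/usr/bin/env bash
    # Hypothetical Sqoop import from Oracle into a Hive table.
    # Host, user and table names are placeholders.
    sqoop import \
      --connect jdbc:oracle:thin:@//db-host:1521/ORCL \
      --username etl_user -P \
      --table SALES.ORDERS \
      --hive-import \
      --hive-table analytics.orders \
      --num-mappers 4

The reverse direction uses sqoop export, with --export-dir pointing at the HDFS data to be written back to the RDBMS.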

TECHNICAL SKILLS

Big Data Technologies: MapReduce, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, HBase

Big Data Frameworks: HDFS, YARN, Spark

Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks

Programming Languages: Core Java, Shell Scripting, PowerShell

Operating Systems: Windows 98/2000/XP/Vista/NT/8.1, Red Hat Linux/CentOS 4, 5, Unix

Databases: Oracle 10g/11g, T-SQL, MongoDB, PL/SQL

ETL Stack: Informatica 8.x/9.x, SSIS, SSRS, SSAS, IBM BigInsights

Business Modeling Tools: UML, MS Office, Remedy, ServiceNow, MS Visio

PROFESSIONAL EXPERIENCE

Confidential, OK

Hadoop Administrator

Responsibilities:

  • Responsible for architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
  • Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
  • Provided Hadoop, OS and hardware optimizations.
  • Set up machines with network controls, static IPs, disabled firewalls and configured swap memory.
  • Installed and configured Hadoop ecosystem components like MapReduce, Hive, Pig, Sqoop, HBase, ZooKeeper and Oozie.
  • Involved in testing HDFS, Hive, Pig and MapReduce access for the new users.
  • Performed cluster maintenance as well as creation and removal of nodes using Cloudera Manager and Hortonworks management tooling.
  • Worked on setting up high availability for a major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Performed operating system installation, Hadoop version updates using automation tools.
  • Configured Oozie for workflow automation and coordination.
  • Implemented rack aware topology on the Hadoop cluster.
  • Imported and exported structured data between different relational databases and HDFS/Hive using Sqoop.
  • Configured ZooKeeper to implement node coordination in clustering support.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Developed scripts for benchmarking with TeraSort/TeraGen (see the sketch following this list).
  • Implemented Kerberos Security Authentication protocol for existing cluster.
  • Troubleshot production-level issues in the cluster and its functionality.
  • Worked with statistical computing tools such as IBM InfoSphere BigInsights for data visualization, sampling, analysis, model selection, estimator bias and variance calculation, and significance testing.
  • Backed up data on a regular basis to a remote cluster using DistCp (a DistCp sketch follows this section).
  • Regularly commissioned and decommissioned nodes depending on the amount of data.
  • Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
  • Installed and maintained a Puppet-based configuration management system.
  • Deployed Puppet, Puppet Dashboard, and PuppetDB for configuration management to existing infrastructure.
  • Used Puppet configuration management to manage the cluster.
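
A minimal sketch of the TeraGen/TeraSort benchmarking scripts referenced above; the examples-jar path and row count are assumptions that vary by distribution and cluster size.

    #!/usr/bin/env bash
    # Generate ~1 GB of input (10M rows x 100 bytes), sort it, then validate the result.
    EXAMPLES_JAR=/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
    ROWS=10000000

    hadoop jar "$EXAMPLES_JAR" teragen "$ROWS" /benchmarks/terasort-input
    hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/terasort-input /benchmarks/terasort-output
    hadoop jar "$EXAMPLES_JAR" teravalidate /benchmarks/terasort-output /benchmarks/terasort-validate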

Environment: Hadoop HDFS, Hortonworks, MapReduce, Hive, Pig, Oozie, Sqoop, Cloudera Manager
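
The remote-cluster backups above were driven by DistCp; a minimal sketch follows, with cluster hostnames and paths as placeholders.

    #!/usr/bin/env bash
    # Hypothetical nightly DistCp of /data to a disaster-recovery cluster.
    # -update copies only changed files; -delete removes files no longer on the source.
    SRC=hdfs://prod-nn:8020/data
    DST=hdfs://dr-nn:8020/backups/data
    hadoop distcp -update -delete "$SRC" "$DST"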

Confidential, FL

Hadoop Administrator

Responsibilities:

  • Hands-on experience with Apache & Hortonworks Hadoop ecosystem components such as Sqoop, HBase and MapReduce.
  • Installation, configuration and troubleshooting of an Apache Hadoop cluster (48+ nodes) for application development.
  • Installation, monitoring, management, troubleshooting and patch application across development, test and production cluster environments.
  • Excellent understanding/knowledge of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce.
  • Experience in managing Hadoop infrastructure, including commissioning, decommissioning and rack topology implementation.
  • Experience in managing cluster resources by implementing the Fair Scheduler and the Capacity Scheduler.
  • Experience in implementing NameNode High Availability and in Hadoop cluster capacity planning.
  • Developed automated Unix shell scripts for running the Balancer, file system health checks and user/group creation on HDFS (see the housekeeping sketch following this list).
  • Experience in managing and reviewing Hadoop log files.
  • Experience in upgrading Hadoop clusters across both minor and major versions.
  • Experience in designing and building disaster recovery planning across the data centers to provide business continuity.
  • Experience in the complete software development life cycle (SDLC), from requirements gathering through testing, implementation and post-implementation support.
  • Expertise in the design and implementation of Slowly Changing Dimensions (SCD) types 1, 2 and 3.
  • Demonstrated an understanding of concepts, best practices and functions needed to implement a Big Data solution in a corporate environment.
  • Helped design scalable Big Data clusters and solutions.
  • Implemented the Fair Scheduler to share cluster resources among users' MapReduce jobs.
  • Wrote automation scripts and set up cron jobs to maintain cluster stability and health (a log-cleanup sketch follows this section).
  • Monitored and controlled local file system disk space usage and local log files, cleaning log files with automated scripts.
  • As a Hadoop admin, monitored cluster health status on a daily basis, tuned system performance-related configuration parameters and backed up configuration XML files.
  • Worked with Hadoop developers and designers to troubleshoot MapReduce job failures and other issues.
  • Worked with network and Linux system engineers/admins to define optimal network configurations, server hardware and operating systems.
  • Worked with expert consulting resources on Hadoop integration points with ETL, BI and EDW teams.
  • Evaluate and propose new tools and technologies to meet the needs of the organization.
  • Provided analytic support to marketing, finance, product development, fraud, customer service, engineering and operations using IBM InfoSphere BigInsights.
  • Production support responsibilities included cluster maintenance and 24/7 on-call support on a weekly rotation.
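
A sketch of the HDFS housekeeping script mentioned in the list above (Balancer run, health check, user provisioning). The threshold and username are placeholders; on the CDH3-era cluster listed below, these subcommands hang off the hadoop binary.

    #!/usr/bin/env bash
    # Rebalance blocks so no DataNode deviates more than 10% from average utilization.
    hadoop balancer -threshold 10

    # Print the tail of an HDFS health report.
    hadoop fsck / | tail -n 20

    # Provision an HDFS home directory for a new (hypothetical) user.
    NEWUSER=jdoe
    hadoop fs -mkdir /user/$NEWUSER
    hadoop fs -chown $NEWUSER:$NEWUSER /user/$NEWUSER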

Environment: RHEL, CentOS, Ubuntu, CDH3, Apache Hadoop, HDFS, MapReduce, HBase, Shell Scripts
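
The cron-driven log cleanup mentioned above can be sketched as a single crontab entry; the log directory and the 14-day retention are assumptions.

    # Hypothetical crontab entry: purge rotated Hadoop daemon logs
    # older than 14 days, every night at 01:30.
    30 1 * * * find /var/log/hadoop -name '*.log.*' -mtime +14 -delete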

Confidential, IA

Hadoop Administrator

Responsibilities:

  • Provided and administered, from the ground up, Apache Hadoop & Cloudera clusters for BI and other product/system development.
  • Performed administration activities including installation, upgrades, patching and configuration for all Hadoop hardware and software components.
  • Worked closely with other administrators (database, storage, Windows) and developers to keep the Hadoop environment running efficiently and effectively.
  • Configured and implemented Apache Hadoop technologies, i.e., the Hadoop Distributed File System (HDFS), the MapReduce framework, Pig, Hive, HCatalog, Sqoop and Flume.
  • Helped in setting up Rack topology in the cluster.
  • Helped with day-to-day operational support.
  • Performed a minor upgrade from CDH3u4 to CDH3u6.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed a Network File System (NFS) mount for NameNode metadata backup.
  • Moved data from a MySQL database to HDFS, and vice versa, using Sqoop.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload, job performance and capacity planning on Apache Hadoop.
  • Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
  • Created a local YUM repository for installing and updating packages (a repository sketch follows this section).
  • Copied data from one cluster to another using DistCp and automated the dumping procedure with shell scripts.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
  • Worked with various tools and technologies to improve cluster performance, ETL & BI environments.
  • Designed and allocated HDFS quotas for multiple groups (see the quota sketch following this list).
  • Performed various configurations, including networking and iptables rules, hostname resolution, user accounts and file permissions, HTTP, FTP and SSH key-based login.
  • Designed and implemented solutions with NoSQL databases such as MongoDB and Cassandra.
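
The group quotas mentioned in the list above combine a name quota and a space quota; a minimal sketch follows, with the directory and limits as placeholders.

    #!/usr/bin/env bash
    # Cap a (hypothetical) team directory at 1M filesystem names and 10 TB of raw space.
    hadoop dfsadmin -setQuota 1000000 /user/analytics
    hadoop dfsadmin -setSpaceQuota 10t /user/analytics

    # Verify: -count -q prints the quotas alongside current usage.
    hadoop fs -count -q /user/analytics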

Environment: Linux, HDFS, Pig, Hive, Sqoop, MapReduce, KDC, Nagios, Ganglia, Oozie, Apache Hadoop, Cloudera
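
A sketch of the local YUM repository mentioned above, serving pre-staged RPMs over HTTP; the paths, staging directory and repo id are illustrative assumptions.

    #!/usr/bin/env bash
    yum install -y createrepo httpd
    mkdir -p /var/www/html/cdh-local
    cp /tmp/staged-rpms/*.rpm /var/www/html/cdh-local/   # assumed staging directory
    createrepo /var/www/html/cdh-local
    service httpd start

    # Point cluster nodes at the mirror (repo-host is a placeholder).
    {
      echo '[cdh-local]'
      echo 'name=Local CDH mirror'
      echo 'baseurl=http://repo-host/cdh-local'
      echo 'enabled=1'
      echo 'gpgcheck=0'
    } > /etc/yum.repos.d/cdh-local.repo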

Confidential

Hadoop Administrator

Responsibilities:

  • Worked on administration, monitoring and fine-tuning of an existing Cloudera Hadoop cluster used by internal and external users as a Data and Analytics as a Service platform.
  • Worked on Cloudera cluster backup and recovery, performance monitoring, load balancing, rebalancing and tuning, capacity planning, and disk space management.
  • Assisted in the design, development and architecture of Hadoop and HBase systems.
  • Coordinated with technical teams for installation of Hadoop and related third-party applications on systems.
  • Formulated procedures for planning and execution of system upgrades for all existing Hadoop clusters.
  • Supported technical team members for automation, installation and configuration tasks.
  • Suggested improvement processes for all process automation scripts and tasks.
  • Provided technical assistance for configuration, administration and monitoring of Hadoop clusters.
  • Conducted detailed analysis of system and application architecture components as per functional requirements.
  • Participated in evaluation and selection of new technologies to support system efficiency.
  • Assisted in creation of ETL processes for transformation of data sources from existing RDBMS systems.
  • Designed and developed scalable, custom Hadoop solutions as per dynamic data needs.
  • Coordinated with the technical team on production deployment of software applications for maintenance.
  • Provided operational support services relating to Hadoop infrastructure and application installation.
  • Supported technical team members in the management and review of Hadoop log files and data backups (see the sketch following this list).
  • Participated in development and execution of system and disaster recovery processes.
  • Formulated procedures for installation of Hadoop patches, updates and version upgrades.
  • Automated processes for troubleshooting, resolution and tuning of Hadoop clusters.
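
As a sketch of the log review mentioned in the list above: a small helper that surfaces recent ERROR/FATAL lines from the HDFS daemon logs. The log directory is an assumption and varies by distribution.

    #!/usr/bin/env bash
    # Show the 50 most recent ERROR/FATAL lines across NameNode/DataNode logs.
    LOG_DIR=/var/log/hadoop-hdfs   # assumed log location
    grep -hE 'ERROR|FATAL' "$LOG_DIR"/*.log | tail -n 50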

Environment: Cloudera 4.x, Java, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, HBase, ETL

Confidential 

UNIX Administrator

Responsibilities:

  • Installed and upgraded Red Hat Linux and Solaris 8 (SPARC) on servers such as HP DL380 G3, G4 and G5 and Dell PowerEdge servers.
  • Experience with LDOMs; created sparse-root and whole-root zones, administered the zones for web, application and database servers, and worked with SMF on Solaris 10.
  • Experience working in AWS cloud environments such as EC2 & EBS.
  • Implemented and administered VMware ESX 3.0 for running Windows, CentOS, SUSE and Red Hat Linux servers on development and test servers.
  • Installed and configured Apache on Linux and Solaris, configured virtual hosts and applied SSL certificates.
  • Implemented JumpStart for Solaris and Kickstart for Red Hat environments.
  • Experience working with HP LVM and Red Hat LVM.
  • Experience in implementing P2P and P2V migrations.
  • Installed and configured CentOS and SUSE 11 & 12 servers on HP x86 servers.
  • Implemented HA using Red Hat Cluster and VERITAS Cluster Server 4.0 for a WebLogic agent.
  • Managed DNS and NIS servers and troubleshot server issues.
  • Troubleshot application issues on Apache web servers as well as database servers running on Linux and Solaris.
  • Experience in migrating Oracle and MySQL data using Double-Take products.
  • Used Solaris Volume Manager and LVM on Linux & Solaris to create volumes with layouts such as RAID 1, 5 and 10.
  • Recompiled the Linux kernel to remove services and applications that were not required.
  • Performed performance analysis using tools such as prstat, mpstat, iostat, sar, vmstat, truss and DTrace (see the sketch following this list).
  • Experience working on LDAP user accounts and configuring LDAP on client machines.
  • Worked on patch management tools like Sun Update Manager.
  • Experience supporting middleware servers running Apache, Tomcat and Java applications.
  • Worked on day-to-day administration tasks and resolved tickets using Remedy.
  • Used HP Service Center and a change management system for ticketing.
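
A quick sketch of the kind of performance snapshot taken with the tools listed above; the 5-second interval and 3-sample count are arbitrary choices.

    #!/usr/bin/env bash
    vmstat 5 3      # run queue, memory and swap activity
    iostat -x 5 3   # extended per-device I/O statistics (Linux sysstat)
    sar -u 5 3      # CPU utilization samples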

Environment: Red Hat Linux/CentOS 4, 5, Logical Volume Manager, Hadoop, VMware ESX 3.0, Apache and Tomcat Web Server, Oracle 9i, Oracle RAC, HPSM, HPSA.
