
Sr. Hadoop Administrator Resume


Hillsboro, OR

SUMMARY

  • 8+ years of professional Information Technology experience in Hadoop and Linux administration activities such as installation, configuration, and maintenance of systems/clusters.
  • 3 years of experience in Linux administration and Java technologies and 5+ years of experience in Hadoop administration.
  • Hands-on experience with Hadoop clusters on the Hortonworks (HDP), Cloudera (CDH3, CDH4), and Oracle Big Data distribution platforms, including YARN.
  • Skilled in Apache Hadoop, MapReduce, Pig, Impala, Hive, Platfora, HBase, ZooKeeper, Sqoop, Flume, Oozie, Kafka, Storm, Spark, Datameer, JavaScript, and J2EE.
  • Experience in deploying and managing multi-node development and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Hortonworks Ambari.
  • Good experience in creating database objects such as tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
  • Experience in configuring NameNode high availability and NameNode federation, with in-depth knowledge of ZooKeeper for cluster coordination services.
  • Experience designing, configuring, and managing backup and disaster recovery for Hadoop data.
  • Experience in administering Tableau and Greenplum database instances in various environments.
  • Hands-on experience analyzing log files for Hadoop and ecosystem services and finding root causes.
  • Extensive knowledge of Tableau in enterprise environments and Tableau administration experience including technical support, troubleshooting, reporting, and monitoring of system usage.
  • Experience in commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal cluster performance.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes.
  • Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
  • Designed and implemented security for Hadoop clusters with Kerberos authentication.
  • Hands-on experience with Nagios and Ganglia for cluster monitoring.
  • Experience in scheduling Hadoop/Hive/Sqoop/HBase jobs using Oozie.
  • Knowledge of data warehousing concepts, the Cognos 8 BI suite, and Business Objects.
  • Experience in HDFS data storage and support for running MapReduce jobs.
  • Experience installing firmware upgrades and kernel patches and performing system configuration and performance tuning on Unix/Linux systems.
  • Expert in Linux performance monitoring, kernel tuning, load balancing, health checks, and maintaining compliance with specifications.
  • Hands-on experience with ZooKeeper and ZKFC for managing and handling NameNode failover scenarios.
  • Team player with good communication and interpersonal skills and a goal-oriented approach to problem solving.

TECHNICAL SKILLS

Big Data Technologies: Hadoop, HDFS, MapReduce, Yarn, Hive, Pig, Sqoop, HBase, Flume, Oozie, Spark, Zookeeper.

Hadoop Platforms: Hortonworks, Cloudera, Apache Hadoop

Networking Concepts: OSI Model, TCP/IP, UDP, IPV4, Subnetting, DHCP & DNS

Programming Languages: Pig Latin, UNIX shell scripting, and Bash.

Operating Systems: Linux (CentOS, Ubuntu, Red Hat), Windows, UNIX and Mac OS-X

Database/ETL: Oracle, Cassandra, DB2, MS-SQL Server, MySQL, MS-Access, HBase, MongoDB, Informatica, Teradata.

XML Languages: XML, DTD, XML Schema, XPath.

Monitoring and Alerting: Nagios, Ganglia, Cloudera Manager, Ambari.

PROFESSIONAL EXPERIENCE

Confidential - Atlanta, GA

Sr. Hadoop Administrator

Responsibilities:

  • Currently working as administrator on the Hortonworks (HDP 2.2.4.2) distribution for 4 clusters ranging from POC to PROD.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Experienced in configuring Ambari alerts for various components and managing the alerts.
  • Experienced in adding/installing new components and removing them through Ambari.
  • Monitored systems and services through the Ambari dashboard to keep the clusters available for the business.
  • Experienced in setting up projects and volumes for new projects.
  • Experienced in managing and reviewing log files.
  • Hands-on experience with cluster upgrades and patch upgrades without data loss, backed by proper backup plans.
  • Working experience creating and maintaining MySQL databases, setting up users, and backing up cluster metadata databases with cron jobs (see the backup sketch after this list).
  • Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.
  • Changed configurations based on user requirements for better job performance.
  • Provided security and authorization with Apache Ranger, where Ranger Admin provides administration and Ranger Usersync adds new users to the cluster.
  • Good troubleshooting skills with Hue, which provides a GUI for developers and business users for day-to-day activities.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis.
  • Implemented complex MapReduce programs to perform joins on the map side using the distributed cache.
  • Set up Flume for different sources to bring log messages from external systems into HDFS.
  • Implemented NameNode HA in all environments to provide high availability of clusters.
  • Implemented the Capacity Scheduler in all environments to provide resources based on allocation.
  • Created queues and allocated cluster resources to prioritize jobs (see the queue configuration sketch after this list).
  • Used snapshots and mirroring to maintain backups of cluster data, including to remote clusters.
  • Implemented SFTP for projects to transfer data from external servers into the cluster.
  • Set up MySQL master-slave replication and helped business applications maintain their data in MySQL servers.
  • Helped users with production deployments throughout the process.
  • Experienced in production support, resolving user incidents ranging from sev1 to sev5.
  • Managed and reviewed log files as part of administration for troubleshooting purposes; communicated and escalated issues appropriately.
  • As an administrator, followed standard backup policies to ensure high availability of the cluster.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Documented system processes and procedures for future reference.
  • Worked with the systems engineering team to plan and deploy new environments and expand existing clusters.
  • Monitored multiple cluster environments using Ambari alerts, metrics, and Nagios.
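
A minimal sketch of the queue setup described above, assuming a YARN Capacity Scheduler on HDP; the queue names (dev, prod), the capacity percentages, and the example jar path are illustrative, not the actual project values:

    # Illustrative capacity-scheduler.xml properties for two hypothetical queues:
    #   yarn.scheduler.capacity.root.queues                = dev,prod
    #   yarn.scheduler.capacity.root.dev.capacity          = 30
    #   yarn.scheduler.capacity.root.prod.capacity         = 70
    #   yarn.scheduler.capacity.root.prod.maximum-capacity = 90

    # After updating the configuration (through Ambari or directly), refresh the
    # queues on the ResourceManager without a restart and verify the layout:
    yarn rmadmin -refreshQueues
    mapred queue -list

    # Submit a test job against a specific queue (jar path from a typical HDP install):
    yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar \
        pi -Dmapreduce.job.queuename=dev 5 10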

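A minimal sketch of the cron-driven metadata backup and a replication health check mentioned above, assuming MySQL hosts the cluster metadata databases; database names, paths, and the schedule are placeholders, and credentials are assumed to come from ~/.my.cnf:

    # Nightly crontab entry dumping the metadata databases (note the escaped % required by cron):
    # 30 1 * * * mysqldump --single-transaction --databases ambari hive oozie | gzip > /backup/mysql/meta_$(date +\%F).sql.gz

    # On the replication slave, confirm it is keeping up with the master:
    mysql -e "SHOW SLAVE STATUS\G" | egrep 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'
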
Environment: Hadoop HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Eclipse, Hortonworks, Ambari.

Confidential, Hillsboro, OR

Sr. Hadoop Administrator

Responsibilities:

  • Understood the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to the Hadoop ecosystem.
  • Set up automated 24x7x365 monitoring and escalation infrastructure for the Hadoop cluster using Nagios Core and Ambari.
  • Designed and implemented a disaster recovery plan for Hadoop clusters.
  • Implemented high availability and automatic failover infrastructure to overcome the NameNode single point of failure, utilizing ZooKeeper services.
  • Integrated the Hadoop cluster with Active Directory and enabled Kerberos for authentication.
  • Implemented the Capacity Scheduler on the YARN ResourceManager to share cluster resources among the users' MapReduce jobs.
  • Set up Linux users and tested HDFS, Hive, Pig, and MapReduce access for the new users.
  • Monitored Hadoop jobs and reviewed logs of failed jobs to debug issues based on the errors.
  • Optimized Hadoop cluster components (HDFS, YARN, Hive, Kafka) to achieve high performance.
  • Worked with the Linux server admin team in administering the server hardware and operating system.
  • Interacted with the networking team to improve bandwidth.
  • Provided user, platform, and application support on the Hadoop infrastructure.
  • Applied patches and bug fixes on the Hadoop cluster.
  • Proactively involved in ongoing maintenance, support, and improvements of the Hadoop clusters.
  • Conducted root cause analysis and resolved production problems and data issues.
  • Performed disk space management for the users and groups in the cluster.
  • Added nodes to the cluster and decommissioned nodes from the cluster whenever required.
  • Managed cluster operations remotely.
  • Performed backup and recovery processes in order to upgrade the Hadoop stack.
  • Used the Sqoop and DistCp utilities for data copying and data migration (see the copy/migration sketch after this list).
  • Installed a Kafka cluster with separate nodes for brokers.
  • Performed Kafka operations on a regular basis (see the Kafka sketch after this list).
  • Monitored cluster stability, used tools to gather statistics, and improved performance.
  • Used Apache Tez, an extensible framework for building high-performance batch and interactive data processing applications, for Pig and Hive jobs.
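
A minimal sketch of the copy and migration commands referenced above; the NameNode addresses, paths, and database connection details are placeholders:

    # Copy a dataset between clusters with DistCp, only transferring files that changed:
    hadoop distcp -update -skipcrccheck \
        hdfs://nn-old:8020/data/events hdfs://nn-new:8020/data/events

    # Pull a table from an Oracle source into HDFS with Sqoop:
    sqoop import \
        --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
        --username etl_user --password-file /user/etl/.db_pass \
        --table SALES --target-dir /data/staging/sales -m 4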

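A minimal sketch of routine Kafka operations, assuming the ZooKeeper-based admin tooling shipped with Kafka on HDP of this era; the install path, ZooKeeper host, topic name, and counts are illustrative:

    # Create, list, and describe a topic from a broker node
    cd /usr/hdp/current/kafka-broker/bin
    ./kafka-topics.sh --create --zookeeper zk1:2181 \
        --topic app-logs --partitions 6 --replication-factor 3
    ./kafka-topics.sh --list --zookeeper zk1:2181
    ./kafka-topics.sh --describe --zookeeper zk1:2181 --topic app-logs
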
Environment: Red Hat OS 6.7, HDP 2.3, HDFS, MapReduce, Tez, YARN, Hive, HBase, Sqoop, Oozie, ZooKeeper, Ambari, Nagios Core, Nagios Log Server, Kafka, Spark, Storm, Kerberos, Ranger, Teradata, Oracle, Tidal, Toad.

Confidential, Atlanta, GA

Hadoop Administrator

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development along with Hadoop tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Managed and scheduled jobs on Hadoop clusters using the Apache and Cloudera (CDH3, CDH4) distributions.
  • Worked on importing and exporting data between Oracle/DB2 and HDFS using Sqoop.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (see the health-check sketch after this list).
  • Deployed Hadoop clusters in pseudo-distributed and fully distributed modes.
  • Implemented NameNode metadata backup over NFS for high availability.
  • Created Hive external tables, loaded data into them, and queried the data using HQL.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among the users' MapReduce jobs.
  • Configured custom interceptors in Flume agents for replicating and multiplexing data into multiple sinks.
  • Worked on NoSQL databases including HBase and MongoDB.
  • Good experience in analysis using Pig and Hive and an understanding of Sqoop and Puppet.
  • Set up automated 24x7 monitoring and escalation infrastructure for the Hadoop cluster using Nagios and Ganglia.
  • Extensive experience in data analysis using tools like Syncsort and HZ along with shell scripting and UNIX.
  • Handled data exchange between HDFS, web applications, and databases using Flume and Sqoop.
  • Experienced in developing MapReduce programs using Apache Hadoop for working with big data.
  • Good understanding of XML methodologies (XML, XSL, XSD) including web services and SOAP.
  • Familiarity and experience with data warehousing and ETL tools.
  • Good understanding of Scrum methodologies, test-driven development, and continuous integration.
  • Involved in log file management, where logs older than 7 days were removed from the log folder, loaded into HDFS, and retained for 3 months (see the archiving sketch after this list).
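
A minimal sketch of the kind of daemon health-check script mentioned above; the daemon list and alert address are hypothetical:

    #!/bin/bash
    # Verify that expected Hadoop daemons are running on this node and alert if any are missing.
    ALERT_TO="hadoop-ops@example.com"    # hypothetical alert address
    for daemon in DataNode TaskTracker; do
        if ! jps | grep -qw "$daemon"; then
            echo "$(hostname): $daemon is not running at $(date)" \
                | mail -s "Hadoop daemon down: $daemon" "$ALERT_TO"
        fi
    done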

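A minimal sketch of the log retention flow in the last bullet, assuming local daemon logs under /var/log/hadoop and a date-stamped archive area in HDFS; the paths are examples and the 3-month cleanup is assumed to run as a separate scheduled job:

    #!/bin/bash
    # Move local Hadoop logs older than 7 days into HDFS for longer-term retention.
    SRC=/var/log/hadoop                          # local log directory (assumed)
    DEST=/archive/logs/$(hostname)/$(date +%Y-%m-%d)

    hadoop fs -test -d "$DEST" || hadoop fs -mkdir "$DEST"
    find "$SRC" -type f -mtime +7 -name '*.log*' | while read -r f; do
        hadoop fs -put "$f" "$DEST/" && rm -f "$f"
    done
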
Environment: Cloudera CDH 4.4 and CDH 3, Cloudera Manager, Sqoop, Flume, Hive, HQL, Pig, RHEL, CentOS, Oracle, MS SQL, ZooKeeper, Oozie, MapReduce, Apache Hadoop 1.x, PostgreSQL, Ganglia, and Nagios.

Confidential

Linux/ Hadoop Administrator

Responsibilities:

  • Installed and configured Linux for new build environments.
  • Handled day-to-day user access and permissions; installed and maintained Linux servers.
  • Created volume groups, logical volumes, and partitions on the Linux servers and mounted the file systems (see the LVM sketch after this list).
  • Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers, including remote installation of Linux via PXE.
  • Monitored system activity, performance, and resource utilization.
  • Developed and optimized the physical design of MySQL database systems.
  • Automated administration tasks through scripting and job scheduling using cron.
  • Tuned YARN components to achieve high performance for MapReduce jobs.
  • Implemented the HDFS snapshot feature (see the snapshot sketch after this list).
  • Deep understanding of monitoring and troubleshooting mission-critical Linux machines.
  • Created virtual servers on Citrix XenServer-based hosts and installed operating systems on the guest servers.
  • Responsible for maintaining RAID groups and LUN assignments as per the agreed design documents.
  • Performed system administration tasks such as cron jobs, package installation, and patching.
  • Made extensive use of LVM, creating volume groups and logical volumes.
  • Performed RPM and YUM package installations, patching, and other server management.
  • Performed scheduled backups and necessary restores.
  • Performed configuration and troubleshooting of services such as NFS, NIS, NIS+, DHCP, FTP, LDAP, and Apache web servers.
  • Managed critical bundles and patches on the production servers after successfully navigating the testing phase in the test environments.
  • Managed disk file systems, server performance, user creation, file access permissions, and RAID configurations.
  • Updated the YUM repository and Red Hat Package Manager (RPM) packages.
  • Configured Domain Name System (DNS) for hostname-to-IP resolution.
  • Prepared operational testing scripts for log checks, backup and recovery, and failover.
  • Troubleshot and fixed issues at the user, system, and network levels using various tools and utilities.
  • Scheduled backup jobs by implementing cron schedules during non-business hours.
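
A minimal sketch of the LVM workflow referenced above; device names, sizes, and mount points are examples only:

    # Create a physical volume, volume group, and logical volume, then build and mount a filesystem
    pvcreate /dev/sdb1
    vgcreate vg_data /dev/sdb1
    lvcreate -L 200G -n lv_data vg_data
    mkfs.ext3 /dev/vg_data/lv_data
    mkdir -p /data01
    mount /dev/vg_data/lv_data /data01
    # Persist the mount across reboots with an /etc/fstab entry, e.g.:
    # /dev/vg_data/lv_data  /data01  ext3  defaults,noatime  0 0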

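A minimal sketch of enabling and taking an HDFS snapshot, as in the snapshot bullet above, assuming a Hadoop release that supports the feature; the directory and snapshot names are placeholders:

    # Allow snapshots on a directory (as the HDFS superuser), then create and inspect them
    hdfs dfsadmin -allowSnapshot /data/warehouse
    hdfs dfs -createSnapshot /data/warehouse nightly-$(date +%F)
    hdfs lsSnapshottableDir
    hdfs dfs -ls /data/warehouse/.snapshot
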
Environment: YUM, RAID, MySQL 5.1.4, PHP, shell scripting, MySQL Workbench, Linux 5.0/5.1.

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in client requirement gathering, analysis, and application design.
  • Used UML to draw use case, class, and sequence diagrams.
  • Implemented client-side data validations using JavaScript.
  • Implemented server-side data validations using Java Beans.
  • Implemented views using JSP and JSTL 1.0.
  • Developed business logic using session beans.
  • Implemented entity beans for object-relational mapping.
  • Implemented the Service Locator pattern using local caching.
  • Worked with Java collections.
  • Implemented the Session Facade pattern using session and entity beans.
  • Developed message-driven beans to listen for JMS messages.
  • Performed application-level logging using log4j for debugging purposes.
  • Involved in fine-tuning the application.
  • Thoroughly involved in the testing phase and implemented test cases using JUnit.
  • Involved in the development of entity relationship diagrams using Rational Data Modeler.

Environment: Java SDK 1.4, Entity Beans, Session Beans, JSP, Servlets, JSTL 1.0, CVS, JavaScript, Oracle 9i, SQL, JBoss 3.0, Eclipse 2.1
