
Sr. Hadoop Administrator Resume


Battle Creek, Michigan

SUMMARY:

  • 7+ years of IT experience, including 6+ years working actively as a Hadoop Administrator.
  • Enthusiastic analyst with proven ability in building productive working relationships with clients.
  • Possess strong analytical & technical skills.
  • Experienced in various phases of Software Development Life Cycle (Analysis, Requirements gathering and Designing) with expertise in documenting various requirement specifications and functional specifications.
  • Experience in installation, configuration, supporting and managing Hadoop clusters.
  • Well versed in configuring and administering the Hadoop Cluster using Hadoop Distributions like Hortonworks.
  • Good knowledge of creating Blueprints from an existing cluster and building a new cluster from them.
  • Experience in managing and reviewing Hadoop log files.
  • Hands-on experience in creating and maintaining HBase tables.
  • Experience working with Amazon EC2 instances and S3 storage.
  • Basic knowledge of Puppet management with Stash and SVN access.
  • Good knowledge of Kafka and Storm topologies.
  • Three-plus years of experience in the design, development, maintenance, and support of Big Data analytics using Hadoop components such as HDFS, Hive, HBase, Sqoop, Flume, ZooKeeper, MapReduce, Kafka, Storm, and Knox.
  • Hands on experience in setting up Hadoop clusters with multiple nodes.
  • Utilized frameworks and extensions to Hadoop such as Cascading and Hive.
  • Extensive experience working with the Hadoop architecture and its components: MapReduce, HDFS, ResourceManager, NodeManager, NameNode, and DataNode.
  • Basic knowledge in writing Hadoop Jobs for analyzing data using Hive and Pig.
  • Loaded streaming log data from various web servers into HDFS using Flume.
  • Successfully loaded files into Hive and HDFS from Oracle and SQL Server using Sqoop (see the import sketch after this list).
  • Experience with SQL, PL/SQL and Relational database concepts.
  • Involved in query optimization and performance tuning; tuned SQL queries using showplans and execution plans.
  • Experienced in working with version control tools such as SVN and GitHub.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop ecosystem components.
  • Experience in working with different data sources like Flat files, XML files and Databases.
  • Experience in developing solutions to analyze large data sets efficiently.
  • Strong knowledge of Object Oriented Programming (OOP) concepts.
  • Hands on Experience as Configuration Manager for SVN Server.
  • Familiar with the Java virtual machine (JVM) and multi-threaded processing.
  • Good experience in Struts 1.0 and 2.0, Spring, and the Hibernate framework, along with Agile application tool knowledge.
  • Familiarity with Linux, J2EE, HTML, Hibernate, and Spring MVC.
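
The Sqoop loads noted above followed a standard import pattern. A minimal sketch, with a hypothetical Oracle host, credentials, and Hive target (none of these names are actual project values):

    # Import an Oracle table into a Hive table in parallel; connection string,
    # credentials, and table names are placeholders.
    sqoop import \
      --connect jdbc:oracle:thin:@//oracle-host:1521/ORCL \
      --username etl_user \
      --password-file /user/etl_user/.oracle.pwd \
      --table SALES.ORDERS \
      --hive-import \
      --hive-table staging.orders \
      --num-mappers 4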

KEY SKILLS:

Databases: Oracle, MySQL, SQL Server, SAP S/4, NoSQL

Programming Languages: Java, Python, C++, SQL, PHP, Linux shell scripts, C

Cloud-Based Services: AWS, Azure

Project Management Processes: Waterfall, Agile (Scrum, Kanban), Six Sigma

Testing Tools & Web Technologies: Quality Center, LoadRunner, QuickTest Pro, Bugzilla, HTML, XML, JavaScript

SQL Tools: SQL Server, Oracle, MS Access

Operating Systems: UNIX, Linux, Windows 2000/XP/Vista/7/8, CentOS, Solaris

Languages: C, C++, Java/J2EE, Linux shell scripts

Big Data Ecosystem: HDFS, Pig, MapReduce, Hive, Sqoop, Flume, ZooKeeper, Kafka, Storm, Knox, HBase

WORK EXPERIENCE:

Sr. Hadoop Administrator

Confidential, Battle Creek, Michigan

Responsibilities:

  • Administration and monitoring of the Hadoop cluster.
  • Worked on the Hadoop upgrade from 4.5 to 5.2.
  • Monitored Hadoop cluster job performance and performed capacity planning.
  • Removed retired nodes in particular security groups from Nagios monitoring.
  • Responsible for managing and scheduling jobs on the Hadoop cluster.
  • Replaced retired Hadoop slave nodes through the AWS console and Nagios repositories.
  • Performed dynamic updates of Hadoop YARN and MapReduce memory settings.
  • Worked with the DBA team to migrate the Hive and Oozie metastore databases from MySQL to RDS.
  • Worked with the Fair and Capacity schedulers: created new queues, added users to queues, increased mapper and reducer capacity, and administered permissions to view and submit MapReduce jobs (see the queue sketch after this list).
  • Experience in administration and maintenance of source control management systems such as Git and GitHub.
  • Hands-on experience installing and administering CI tools such as Jenkins.
  • Experience integrating shell scripts into Jenkins jobs.
  • Installed and configured the automation tool Puppet, including the Puppet master, agent nodes, and an admin control workstation.
  • Worked with modules, classes, and manifests in Puppet.
  • Experience in creating Docker images
  • Used containerization technologies such as Docker to build clusters and orchestrate container deployments.
  • Operations: custom shell scripts, VM and environment management.
  • Experience working with Amazon EC2, S3, and Glacier.
  • Created multiple groups and set permission policies for various groups in AWS.
  • Experience creating S3 lifecycle policies to transition backups to Glacier (see the lifecycle sketch after this list).
  • Experience in maintaining, executing, and scheduling build scripts to automate DEV/PROD builds.
  • Configured Elastic Load Balancers with EC2 Auto Scaling groups.
  • Created monitors, alarms, and notifications for EC2 hosts using CloudWatch (see the alarm sketch after this list).
  • Launched Amazon EC2 instances from Amazon Machine Images (Linux/Ubuntu) and configured the launched instances for specific applications.
  • Worked with the IAM service: created new IAM users and groups, and defined roles, policies, and identity providers.
  • Experience assigning MFA to IAM users and requiring it for access to S3 buckets in AWS.
  • Defined AWS Security Groups, which acted as virtual firewalls controlling the traffic allowed to reach one or more EC2 instances.
  • Used Amazon Route 53 to manage DNS zones and provide public DNS names for Elastic Load Balancer IPs.
  • Used default and custom VPCs to create private cloud environments with public and private subnets.
  • Loaded data from Oracle, MS SQL Server, MySQL, and flat files into HDFS and Hive.
  • Fixed NameNode partition failures, fsimage rotation issues, and MR jobs failing with too many fetch failures, and troubleshot other common Hadoop cluster issues.
  • Implemented Puppet manifest files for automated orchestration of Hadoop and Cassandra clusters.
  • Maintained GitHub repositories for configuration management.
  • Configured the Ganglia distributed monitoring system for Hadoop clusters.
  • Managed cluster coordination services through ZooKeeper.
  • Configured and deployed a NameNode High Availability Hadoop cluster with SSL and Kerberos enabled.
  • Handled restarts of several services and killed stuck processes by PID to clear alerts.
  • Monitored log files of several services and cleared files in case of disk space issues on the nodes.
  • Provided 24x7 production support on a weekly rotation with the Ops team.
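
For the scheduler work above, a sketch of how a Capacity Scheduler queue is typically added on a Hadoop cluster; the queue names, capacities, and ACL values below are illustrative assumptions, not the production settings:

    # Queue definitions live inside <configuration> in capacity-scheduler.xml
    # (managed through Ambari on HDP); names and percentages are placeholders:
    #   yarn.scheduler.capacity.root.queues                      = default,etl
    #   yarn.scheduler.capacity.root.default.capacity            = 60
    #   yarn.scheduler.capacity.root.etl.capacity                = 40
    #   yarn.scheduler.capacity.root.etl.acl_submit_applications = etl_user hadoop   (users, then groups)
    # After saving the file on the ResourceManager host, pick up the new queues
    # without a restart:
    yarn rmadmin -refreshQueues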
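
For the S3 lifecycle bullet, a sketch of one way to transition backups to Glacier with the AWS CLI; the bucket name, prefix, and day counts are placeholders:

    # Move objects under backups/ to Glacier after 30 days and expire them
    # after 365 days; bucket, prefix, and day counts are examples only.
    aws s3api put-bucket-lifecycle-configuration \
      --bucket my-backup-bucket \
      --lifecycle-configuration '{
        "Rules": [{
          "ID": "backups-to-glacier",
          "Filter": { "Prefix": "backups/" },
          "Status": "Enabled",
          "Transitions": [{ "Days": 30, "StorageClass": "GLACIER" }],
          "Expiration": { "Days": 365 }
        }]
      }'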
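
For the CloudWatch bullet, a sketch of creating a CPU alarm on a single EC2 host; the instance ID, threshold, and SNS topic are placeholders:

    # Alert when average CPU on one instance stays above 80% for two 5-minute periods.
    aws cloudwatch put-metric-alarm \
      --alarm-name "high-cpu-i-0abc1234example" \
      --namespace AWS/EC2 \
      --metric-name CPUUtilization \
      --dimensions Name=InstanceId,Value=i-0abc1234example \
      --statistic Average \
      --period 300 \
      --evaluation-periods 2 \
      --threshold 80 \
      --comparison-operator GreaterThanThreshold \
      --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts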

Hadoop Administrator

Confidential, Chicago, IL

Responsibilities:

  • Installed Hadoop ecosystem components on a 100-node production cluster.
  • Deployed lower-environment Dev and Test boxes using Blueprints.
  • Installed single-node machines for stakeholders with the Hortonworks HDP distribution.
  • Expert in troubleshooting Hadoop services and component-related issues.
  • Duties included troubleshooting and fixing problems in production and lower-environment (LE) clusters.
  • Managed all user tickets related to cluster issues and resolved them within SLA.
  • Configured Flume to pull syslog logs from individual component services into HDFS (see the Flume agent sketch after this list).
  • Handled issues during Blueprint implementation and provided solutions in very little time.
  • Basic knowledge of Ranger for Hortonworks.
  • Very good knowledge of deploying Blueprints on multi-node and single-node clusters.
  • Deployed the Knox gateway and Storm topologies on production and LE boxes.
  • Handled NameNode related issues and helped prepare a process for moving the NameNode in a Kerberized cluster.
  • Handled NameNode inconsistencies and fixed a cluster ID mismatch in the cluster.
  • Involved in handling a Kafka index corruption issue in production.
  • Helped re-image nodes from the Safenet file system to Lux to avoid Kafka log file corruption.
  • Defined a retention period for the REST logs that had piled up and were causing tc Server breakdowns.
  • Helped fix a load-balancing issue between Knox and the REST layer.
  • Integrated Hadoop with user authentication using Kerberos authentication protocol.
  • Created HBase table schemas as required by end users on LE boxes.
  • Provided users folder access under their user directories in HDFS so they could run MapReduce jobs (see the provisioning sketch after this list).
  • Handled all the ownership and permission issues created by end users.
  • Managed and reviewed the Hadoop log files.
  • Hands-on experience with the JIRA issue tracker, used for tracking various production issues and for incident management.
  • Performed POCs to install new services on lower-environment machines using EMR.
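
The Flume setup referenced above followed the usual source/channel/sink pattern. A sketch assuming a syslog TCP source and an HDFS sink; the agent name, port, and HDFS path are illustrative:

    # Key lines from a syslog-to-HDFS agent config (illustrative values),
    # saved for example as /etc/flume/conf/syslog-agent.conf:
    #   a1.sources = r1
    #   a1.channels = c1
    #   a1.sinks = k1
    #   a1.sources.r1.type = syslogtcp
    #   a1.sources.r1.host = 0.0.0.0
    #   a1.sources.r1.port = 5140
    #   a1.sources.r1.channels = c1
    #   a1.channels.c1.type = memory
    #   a1.sinks.k1.type = hdfs
    #   a1.sinks.k1.channel = c1
    #   a1.sinks.k1.hdfs.path = /data/logs/syslog/%Y-%m-%d
    #   a1.sinks.k1.hdfs.fileType = DataStream
    #   a1.sinks.k1.hdfs.useLocalTimeStamp = true
    # Start the agent with that configuration:
    flume-ng agent --conf /etc/flume/conf \
      --conf-file /etc/flume/conf/syslog-agent.conf --name a1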
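
For the user-access bullet, a sketch of provisioning a per-user HDFS directory; the username and group are placeholders:

    # Provision a home directory for a new cluster user so the user can stage
    # data and run MapReduce jobs (run as the HDFS superuser).
    hdfs dfs -mkdir -p /user/jdoe
    hdfs dfs -chown jdoe:hdfs /user/jdoe
    hdfs dfs -chmod 750 /user/jdoe
    hdfs dfs -ls /user | grep jdoe   # verify ownership and permissions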

Hadoop Administrator

Confidential, Cedar Rapids, Iowa

Responsibilities:

  • Installed and configured a CDH 5.0.0 cluster using Cloudera Manager.
  • Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Monitored workload, job performance and capacity planning.
  • Managed and reviewed Hadoop log files and debugged failed jobs.
  • Tuned clusters by commissioning and decommissioning DataNodes.
  • Supported cluster maintenance, backup and recovery for production cluster.
  • Supported data analysis projects using Elastic Map Reduce on the Amazon Web Services (AWS) cloud.
  • Fine-tuned Hive jobs for better performance.
  • Automated the jobs that pulled data from the FTP server and loaded it into Hive tables, using Oozie workflows.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Implemented the Fair Scheduler and Capacity Scheduler to allocate a fair share of resources to small jobs.
  • Installed, Configured and Maintained the Hadoop cluster for application development and Hadoop ecosystem components like Hive, Pig, HBase, Zookeeper and Sqoop.
  • In depth understanding of Hadoop Architecture and various components such as HDFS, Name Node, Data Node, Resource Manager, Node Manager and YARN / Map Reduce programming paradigm.
  • Monitored the Hadoop cluster through Cloudera Manager and implemented alerts based on error messages; provided management with reports on cluster usage metrics and charged customers back based on their usage.
  • Extensively worked on commissioning and decommissioning of cluster nodes, file system integrity checks and maintaining cluster data replication.
  • Very good understanding and knowledge of assigning the numbers of mappers and reducers for MapReduce jobs on the cluster.
  • Set up HDFS quotas to enforce a fair share of storage resources (see the quota sketch after this list).
  • Strong knowledge of configuring and maintaining the YARN schedulers (Fair and Capacity).
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (see the health-check sketch after this list).
  • Configured Kafka to partition messages across brokers and distribute consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
  • Supported parallel data loads into Hadoop.
  • Involved in setting up HBase which includes master and region server configuration, High availability configuration, performance tuning and administration.
  • Created user accounts and provided access to the Hadoop cluster.
  • Upgraded the cluster from CDH 5.3 to CDH 5.7 and Cloudera Manager from 5.3 to 5.7.
  • Involved in loading data from UNIX file system to HDFS.
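
For the quota bullet above, a sketch of the standard HDFS quota commands; the path and limits are examples only:

    # Cap the number of names (files + directories) and the raw space
    # (including replication) under a project directory, then report usage.
    hdfs dfsadmin -setQuota 1000000 /user/analytics
    hdfs dfsadmin -setSpaceQuota 10t /user/analytics
    hdfs dfs -count -q -h /user/analytics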
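
For the health-check bullet, a minimal sketch in the spirit of such a monitoring script; the daemon list and alert address are placeholders:

    # Check that core Hadoop daemons are running on this node and mail an alert
    # for any that are missing.
    DAEMONS="NameNode DataNode ResourceManager NodeManager"
    for d in $DAEMONS; do
      if ! jps | grep -qw "$d"; then
        echo "$(date '+%F %T') WARNING: $d not running on $(hostname)" \
          | mail -s "Hadoop daemon down: $d" ops-team@example.com
      fi
    done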

Jr. Hadoop Administrator

Confidential

Responsibilities:

  • Monitored, debugged, and troubleshot Hadoop jobs and applications running in production.
  • Installed and configured Hive, Pig, Sqoop, Flume, Cloudera manager and Oozie on the Hadoop cluster.
  • Provided user support and application support on the Hadoop infrastructure.
  • Evaluated and compared different tools for test data management with Hadoop.
  • Developed a MapReduce InputFormat to read a specific data format.
  • Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
  • Extensive knowledge of writing MapReduce programs in Java and Cascading.
  • Developed MapReduce programs to clean and aggregate the data; the Cascading framework was used for the latest MapReduce jobs.
  • Performance-tuned Hive queries written by data analysts.
  • Developed Hive queries and UDFs as per requirements.
  • Used Sqoop to efficiently transfer data from DB2 and Oracle Exadata to HDFS.
  • Implemented Hadoop Streaming with Python MapReduce for visa analytics (see the streaming sketch after this list).
  • Designed and developed Oozie workflows, with Pig integration.
  • Worked on configuring the new Hadoop cluster.
  • Responsible for managing data coming from different sources into HDFS through Sqoop and Flume.
  • Troubleshot and monitored Hadoop services using Cloudera Manager.
  • Monitored and tuned MapReduce programs running on the cluster.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Involved in migrating data from one Hadoop cluster to another.
  • Developed several MapReduce programs for data preprocessing.
  • Prepared System Design document with all functional implementations.
  • Studied the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
  • Extensive hands-on experience with Hadoop file system commands for file-handling operations.
  • Worked on sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement (see the table-design sketch after this list).
  • Parsed XML files using MapReduce to extract sales-related attributes and store them in HDFS.
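
For the Hadoop Streaming bullet above, a sketch of launching a Python MapReduce job; the streaming jar path and HDFS paths are placeholders that vary by distribution:

    # Run a Python mapper/reducer pair through Hadoop Streaming.
    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
      -files mapper.py,reducer.py \
      -mapper "python mapper.py" \
      -reducer "python reducer.py" \
      -input /data/visa/raw \
      -output /data/visa/analytics_out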
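
For the table-design bullet, a sketch of a partitioned, bucketed RCFile table of the kind used for Hive performance work; the database, table, and column names are illustrative:

    # Create a partitioned, bucketed RCFile table and load it with dynamic partitioning.
    hive -e "
      SET hive.exec.dynamic.partition.mode=nonstrict;
      SET hive.enforce.bucketing=true;

      CREATE TABLE IF NOT EXISTS sales.orders_part (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
      )
      PARTITIONED BY (order_date STRING)
      CLUSTERED BY (customer_id) INTO 32 BUCKETS
      STORED AS RCFILE;

      INSERT OVERWRITE TABLE sales.orders_part PARTITION (order_date)
      SELECT order_id, customer_id, amount, order_date FROM sales.orders_raw;
    "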

Technical Analyst

Confidential

Responsibilities:

  • Performed daily reporting of software deployments, network connectivity of POS devices, sales and refund data, and other required POS-specific data.
  • Created and managed reporting tools for sales data, data analysis, and system configurations using SQL reporting and Access database.
  • Managed and deployed software to endpoints.
  • Managed parameter data such as tax rates, departments, SKUs, bottle deposits, store credit card application software and other POS specific data.
  • Developed workarounds allowing POS systems to perform nonstandard operations in support of sales tax holidays and other locale-specific requirements that the POS application could not natively support.
  • Served as point of escalation to company service desk for complex issues.
  • Reported daily activities to senior management and project stakeholders.
  • Worked with project teams to establish requirements, time estimates, and develop solutions.
  • Documented new software packages and operating standards.
