
Sr. Hadoop Administrator Resume


Iowa

SUMMARY

  • Hadoop Consultant with 7+ years of overall IT experience across a variety of industries, including hands-on experience in Big Data technologies and data processing with Hadoop and its ecosystem (MapReduce, Pig, Hive, Spark, Kafka, and HBase).
  • Responsible for day-to-day activities including HDFS support and maintenance, cluster maintenance, creation/removal of nodes, cluster monitoring and troubleshooting, managing and reviewing Hadoop log files, backup and restore, and capacity planning.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, MapReduce, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Impala, and Flume.
  • Well versed in installing, configuring, managing, and supporting Hadoop clusters using various distributions such as Apache Hadoop, Cloudera CDH, and Hortonworks HDP.
  • Experience performing minor and major upgrades of Hadoop clusters (Hortonworks Data Platform 1.7 to 2.1, CDH 5.5.5).
  • Worked with Docker containerization technology.
  • Implemented AWS solutions using EC2, S3, RDS, EBS, Elastic Load Balancing, Auto Scaling groups, and the AWS CLI.
  • Experienced in installing, configuring, supporting, and monitoring 100+ node Hadoop clusters using Cloudera Manager and Hortonworks distributions.
  • Worked in the production phase on migrating the application from Puppet to Chef.
  • Hands-on experience in setting up the continuous integration tools Jenkins and Bamboo.
  • Experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components using Cloudera Manager and Hortonworks.
  • Experience in managing and reviewing Hadoop log files using Splunk, and in performing Hive tuning activities.
  • Experience in benchmarking, backup, and disaster recovery of NameNode metadata.
  • Experience working with Flume to load log data from multiple sources directly into HDFS.
  • Experience in administering Linux systems to deploy and monitor Hadoop clusters.
  • Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance (see the decommissioning sketch after this list).
  • Experience with Chef, Puppet, and related configuration management tools.
  • Experience in HDFS data storage and in supporting MapReduce jobs.
  • Experience optimizing the performance of HBase, Hive, and Pig jobs.
  • Experience in designing, planning, installing, configuring, administering, troubleshooting, performance monitoring, and fine-tuning of Cassandra clusters.
  • Experience in configuring Ranger and Knox to secure Hadoop services (Hive, HBase, HDFS, etc.).
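
For illustration, a minimal sketch of decommissioning a DataNode and rebalancing the cluster, assuming a typical CDH/HDP layout where dfs.hosts.exclude points at /etc/hadoop/conf/dfs.exclude (the host name and paths below are hypothetical):

    # Add the host to the HDFS exclude file referenced by dfs.hosts.exclude
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Ask the NameNode to re-read the include/exclude lists and begin decommissioning
    hdfs dfsadmin -refreshNodes

    # Watch decommissioning progress for that node
    hdfs dfsadmin -report | grep -A 5 "datanode07.example.com"

    # After the node is decommissioned, rebalance the remaining DataNodes
    hdfs balancer -threshold 10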

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Hive, HCatalog, Pig, HBase, Cassandra, Phoenix, Falcon, Sqoop, Flume, ZooKeeper, Mahout, Oozie, Avro, Storm, CDH 5.3, CDH 5.5

Monitoring Tools: Cloudera Manager, Hortonworks, Ambari, Nagios, Ganglia

AWS Cloud Technologies: EC2, Elastic Beanstalk, SaaS, IaaS, CloudWatch, Pivotal Cloud Foundry, S3, Route 53, Redshift, DynamoDB, SQS, SES

Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH.

Programming Languages: C, Java, SQL, and PL/SQL.

Front End Technologies: HTML, XHTML, XML.

Application Servers: Apache Tomcat, WebLogic Server, WebSphere

Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2.

NoSQL Databases: HBase, Cassandra, MongoDB

Other Tools: Tableau, SAS

Mail Servers and Clients: Microsoft Exchange, Lotus Domino, Sendmail, Postfix.

Operating Systems: Linux/UNIX, macOS, Windows

Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP.

Security: Kerberos, Ranger.

PROFESSIONAL EXPERIENCE

Sr. Hadoop Administrator

Confidential, IOWA

Responsibilities:

  • Worked as Hadoop Admin responsible for all aspects of clusters totaling 100 nodes, ranging from POC (proof-of-concept) to PROD clusters.
  • Worked as admin on the Cloudera (CDH 5.5.2) distribution for clusters ranging from POC to PROD.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
  • Day-to-day responsibilities included resolving developer issues, deploying code from one environment to another, providing access to new users, providing immediate solutions to reduce impact, and documenting issues to prevent recurrence.
  • Added, installed, and removed components through Cloudera Manager.
  • Collaborated with application teams to install operating system and Hadoop updates, patches, and version upgrades.
  • Served as Level 2/3 SME for the client's Big Data clusters and established standard troubleshooting techniques.
  • Monitored workload and job performance and performed capacity planning using Cloudera Manager.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them per the recommendations.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Used Flume with a spooling directory source to load data from the local file system into HDFS.
  • Used Chef, Puppet, and related tools for configuration management.
  • Exported data from HDFS into relational databases with Sqoop (see the Sqoop export sketch after this list).
  • Parsed, cleansed, and mined useful data in HDFS using MapReduce for further analysis; fine-tuned Hive jobs for optimized performance.
  • Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex the data into different sinks.
  • Partitioned and queried the data in Hive for further analysis by the BI team (see the Hive partitioning sketch after this list).
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Involved in extracting data from various sources into HDFS for processing.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase, and Sqoop.
  • Created collections and configurations and registered a Lily HBase Indexer configuration with the Lily HBase Indexer Service.
  • Created and truncated HBase tables in Hue and took backups of submitter ID(s).
  • Configured and managed user permissions in Hue.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Troubleshot, debugged, and fixed Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Created, modified, and executed DDL and ETL scripts for denormalized tables to load data into Hive and AWS Redshift tables.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported setting up the QA environment and updated configurations for implementing Pig and Sqoop scripts.
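
As referenced above, a minimal sketch of exporting a Hive warehouse directory from HDFS into a relational table with Sqoop, assuming a delimited-text table; the JDBC URL, credentials, and table names are hypothetical placeholders:

    sqoop export \
      --connect jdbc:mysql://reportdb.example.com:3306/reporting \
      --username etl_user -P \
      --table daily_metrics \
      --export-dir /user/hive/warehouse/daily_metrics \
      --input-fields-terminated-by '\001' \
      --num-mappers 4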
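
Likewise, a small sketch of the kind of date-partitioned Hive table and query used for BI analysis; the web_logs table and its columns are assumptions for illustration:

    hive -e "
    CREATE TABLE IF NOT EXISTS web_logs (
      user_id     STRING,
      url         STRING,
      response_ms INT)
    PARTITIONED BY (log_date STRING)
    STORED AS ORC;

    -- Partition pruning limits the scan to one month of data
    SELECT log_date, COUNT(*) AS hits
    FROM web_logs
    WHERE log_date BETWEEN '2016-01-01' AND '2016-01-31'
    GROUP BY log_date;"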

Environment: HDFS, Docker, Puppet, MapReduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Oozie, Sqoop, CDH 5, Apache Hadoop 2.6, Spark, AWS, Solr, Storm, Knox, Cloudera Manager, Red Hat, MySQL and Oracle.

Hadoop Administrator

Confidential, NYC

Responsibilities:

  • Captured data from existing databases that provide SQL interfaces using Sqoop.
  • Implemented the Hadoop stack and different big data analytic tools, and migrated data from different databases to Hadoop.
  • Processed information from HDFS to produce useful insights for the decision-making process; these insights were presented to users in the form of charts.
  • Worked on different Big Data technologies, with good knowledge of Hadoop, MapReduce, and Hive.
  • Developed various POCs on Hadoop and Big Data.
  • Worked on deployment and automation tasks.
  • Migrated Cassandra and Hadoop clusters to AWS and defined different read/write strategies for geographies.
  • Installed and configured Hadoop clusters in pseudo-distributed and fully distributed modes.
  • Involved in developing the data loading and extraction processes for big data analysis.
  • Worked on professional services engagements to help customers design and build clusters and applications, and troubleshoot network, disk, and operating system related issues.
  • Worked with Puppet for automated deployments.
  • Installed and configured a local 3-node Hadoop cluster and set up a 4-node cluster on EC2.
  • Wrote MapReduce code to process and parse data from various sources and store the parsed data into HBase and Hive using HBase-Hive integration (see the integration sketch after this list).
  • Worked with HBase and Hive scripts to extract, transform, and load data into HBase and Hive.
  • Continuously monitored and managed the Hadoop cluster.
  • Analyzed data by running Hive queries and Pig scripts to understand user behavior.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Developed scripts and batch jobs to schedule an Oozie bundle (a group of coordinators) consisting of various Hadoop programs (see the Oozie CLI sketch after this list).
  • Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
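
As referenced above, a minimal sketch of the HBase-Hive integration pattern: an external Hive table mapped onto an HBase table through the HBase storage handler (the table, column family, and column names are hypothetical):

    hive -e "
    CREATE EXTERNAL TABLE hbase_events (
      rowkey  STRING,
      payload STRING)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:payload')
    TBLPROPERTIES ('hbase.table.name' = 'events');"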
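
And a short sketch of submitting and checking an Oozie bundle from the command line; the Oozie server URL, properties file, and job ID are placeholder assumptions:

    # Submit and start the bundle (bundle.properties points at the bundle.xml in HDFS)
    oozie job -oozie http://oozie-host.example.com:11000/oozie \
              -config bundle.properties -run

    # Check the status of the submitted bundle (job ID shown is a placeholder)
    oozie job -oozie http://oozie-host.example.com:11000/oozie \
              -info 0000012-170101000000000-oozie-oozi-B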

Environment: Hadoop 2.5.2, HDFS, MapReduce, AWS, Hive, Docker, Puppet, Flume, Sqoop, Cloudera CDH4, HBase, Oozie, Pig, AWS EC2.

Linux and Hadoop Administrator

Confidential

Responsibilities:

  • The client wanted to migrate from an on-premises cluster to the Confidential Web Services cloud (AWS), so built production and QA clusters on AWS with the latest Hortonworks HDP stack managed by Ambari.
  • Installed and configured Hortonworks Data Platform (HDP 2.3) on Confidential EC2 instances.
  • Managed and reviewed Hadoop and HBase log files.
  • Proven results-oriented professional with a focus on delivery.
  • Built and configured log data loading into HDFS using Flume (see the Flume agent sketch after this list).
  • Installed, configured, administered, debugged, and troubleshot Apache and DataStax Cassandra clusters.
  • Maintained and monitored all system frameworks, provided on-call support for all systems, and kept Linux knowledge current.
  • Assisted developers with troubleshooting custom software and services such as ActiveSync, CalDAV, CardDAV, and PHP.
  • Provided top-level customer service and implementation for DKIM, SPF, and custom SSL/TLS security.
  • Maintained Solaris server hardware, performed basic troubleshooting of database problems, and initiated the steps needed to fix errors using shell scripts.
  • Served as project lead for updating hardware and software for the backup scheme on both Windows and UNIX/Linux based development networks.
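
As referenced above, a minimal sketch of a Flume agent that drains a spooling directory of web-server logs into HDFS; the agent name, local path, and HDFS location are hypothetical. The agent definition (a Flume properties file, for example /etc/flume-ng/conf/weblog-agent.conf):

    # Single agent: spooling-directory source -> memory channel -> HDFS sink
    a1.sources  = src1
    a1.channels = ch1
    a1.sinks    = sink1

    a1.sources.src1.type     = spooldir
    a1.sources.src1.spoolDir = /var/log/incoming
    a1.sources.src1.channels = ch1

    a1.channels.ch1.type     = memory
    a1.channels.ch1.capacity = 10000

    a1.sinks.sink1.type                   = hdfs
    a1.sinks.sink1.hdfs.path              = hdfs:///data/weblogs/%Y-%m-%d
    a1.sinks.sink1.hdfs.fileType          = DataStream
    a1.sinks.sink1.hdfs.useLocalTimeStamp = true
    a1.sinks.sink1.channel                = ch1

The agent is then started against that configuration:

    flume-ng agent --conf /etc/flume-ng/conf \
                   --conf-file /etc/flume-ng/conf/weblog-agent.conf \
                   --name a1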

Environment: Hadoop, HDP 2.3, HDFS, Hortonworks, MapReduce, Impala, Sqoop, HBase, Hive, Flume, Oozie, ZooKeeper, Solr, performance tuning, AWS, cluster health monitoring, security, Shell Scripting, NoSQL (HBase/Cassandra).
