
Sr. Hadoop Administrator Resume


Oak Brook, IL

SUMMARY

  • Around 8 years of professional experience including 3+ years of Hadoop Administration and 4+ years as Linux Admin.
  • Responsible for day-to-day activities including HDFS support and maintenance, cluster maintenance, creation/removal of nodes, cluster monitoring/troubleshooting, managing and reviewing Hadoop log files, backup and restore, and capacity planning.
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like HDFS, MapReduce, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Impala, and Flume.
  • Worked with Hadoop developers and operating system admins to design scalable, supportable infrastructure for Hadoop.
  • Well versed in installing, configuring, managing, and supporting Hadoop clusters using various distributions such as Apache Hadoop, Cloudera CDH, and Hortonworks HDP.
  • Experience in performing minor and major upgrades of Hadoop clusters (Hortonworks Data Platform 1.7 to 2.1, CDH 5.5.5).
  • Very strong experience working with Ansible playbooks and Ansible Tower to automate tasks executed across the cluster.
  • Experienced in installing, configuring, supporting, and monitoring 100+ node Hadoop clusters using Cloudera Manager and Hortonworks distributions.
  • Experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components using Cloudera Manager and Hortonworks.
  • Experience in managing and reviewing Hadoop log files using Splunk and performing Hive tuning activities.
  • Experience in securing Hadoop clusters with Kerberos.
  • Experience in benchmarking, backup, and disaster recovery of NameNode metadata.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Experience in administering Linux systems to deploy and monitor Hadoop clusters.
  • Experience in commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal cluster performance.
  • Experience with Chef, Puppet, and related tools for configuration management.
  • Experience in HDFS data storage and support for running MapReduce jobs.
  • Optimized performance of HBase/Hive/Pig jobs.
  • Experience in designing, planning, administration, installation, configuration, troubleshooting, performance monitoring, and fine-tuning of Cassandra clusters.
  • Good knowledge of Cassandra cluster topology and Virtual Nodes.
  • Involved in infrastructure setup and installation of the Cloudera stack on the Amazon cloud.
  • Experience with ingesting data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
  • Experience in benchmarking and in backup and disaster recovery of NameNode metadata and important sensitive data residing on the cluster.
  • Experience in designing and implementing HDFS access controls, directory and file permissions, and user authorization that facilitate stable, secure access for multiple users in a large multi-tenant cluster (a short access-control sketch follows this summary).
  • Familiar with importing and exporting data between RDBMSs such as MySQL, Oracle, and Teradata and HDFS using Sqoop, including fast loaders and connectors; a Sqoop sketch also follows this summary.
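
A minimal sketch of the Sqoop import/export pattern mentioned above; the hostnames, credentials, table names, and directories are placeholders rather than details from any engagement.

# Import an Oracle table into HDFS as tab-delimited text.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMERS \
  --target-dir /data/raw/customers \
  --num-mappers 4 \
  --fields-terminated-by '\t'

# Export processed results from HDFS back to the RDBMS.
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_SUMMARY \
  --export-dir /data/processed/customer_summary \
  --input-fields-terminated-by '\t'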

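A minimal sketch of the multi-tenant HDFS access controls mentioned above, assuming hypothetical paths and group names; extended ACLs require dfs.namenode.acls.enabled=true in hdfs-site.xml.

# Per-team data directory owned by the team's group, closed to others.
hdfs dfs -mkdir -p /data/finance
hdfs dfs -chown hdfs:finance /data/finance
hdfs dfs -chmod 750 /data/finance

# Grant a second team read access via an extended ACL instead of
# loosening the POSIX permission bits.
hdfs dfs -setfacl -m group:analysts:r-x /data/finance
hdfs dfs -getfacl /data/finance
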
TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Hive, Cassandra, Pig, HCatalog, Phoenix, Falcon, Sqoop, Flume, ZooKeeper, Mahout, Oozie, Avro, HBase, and Storm.

Hadoop Distribution: Hortonworks, Cloudera.

Monitoring Tools: Cloudera Manager, Ambari, Nagios, Ganglia

Scripting Languages: Shell Scripting, Puppet, Python, Bash, CSH.

Programming Languages: C, Java, SQL, and PL/SQL.

Front End Technologies: HTML, XHTML, XML.

Application Servers: Apache Tomcat, WebLogic Server, WebSphere

Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2.

NoSQL Databases: HBase, Cassandra, MongoDB

Operating Systems: Linux, UNIX, Mac OS, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8.

Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP.

Security: Kerberos, Ranger.

PROFESSIONAL EXPERIENCE

Sr. Hadoop Administrator

Confidential, Oak Brook, IL

Responsibilities:

  • Sustained, supported, enhanced, and configured the company's Big Data ecosystem.
  • Configured various Hadoop stack components and tools for seamless integration.
  • Monitored component health and checked functionality on a regular basis to weed out potential issues.
  • Upgraded and reconfigured all Big Data cluster components as requirements changed.
  • Configured, installed, and suggested new services to enhance the cluster and user experience.
  • Resolved cluster-related issues.
  • Orchestrated data movement (ingestion, analysis, storage, and retrieval) between the services and tools listed above.
  • Kept services available and efficient for data-intensive analytics and warehousing.
  • Created access policies to be enforced for all system interactions.
  • Managed user access to each component of the environment as required.
  • Audited both user and data-related activity across the cluster.
  • Managed, monitored, maintained, and backed up cluster data for disaster recovery.
  • Worked with project development teams throughout the development lifecycle.
  • Participated in planning meetings for the implementation path and production rollouts of the solution.
  • Wrote scripts to automate interactions between services, components, and tools as required.
  • Made secondary environments available for project development and testing.
  • Created proofs of concept (POCs) to test and implement new features, functionality, or tool integrations.
  • Wrote technical documents for all cluster related activities and tasks performed.
  • Benchmarked system performance (a sketch follows this list).
  • Implemented new techniques and methodologies to improve system performance and response times.
  • Provided immediate support for project-related incidents and issues.
  • Supported projects deployed on the platform.
  • Made changes to existing projects as environment configuration or code logic changed.
  • Provided information and documentation about system capabilities and facilities so the platform can be leveraged in upcoming projects.
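
A minimal sketch of a TeraSort-style benchmark of the kind referenced above; the examples-jar path and data sizes are illustrative and vary by distribution and version.

EXAMPLES_JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar

# Generate roughly 100 GB of input (1 billion 100-byte rows).
hadoop jar "$EXAMPLES_JAR" teragen 1000000000 /benchmarks/teragen

# Sort the generated data, then verify the output is correctly ordered.
hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/teragen /benchmarks/terasort
hadoop jar "$EXAMPLES_JAR" teravalidate /benchmarks/terasort /benchmarks/teravalidate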

Environment: Hortonworks, HDFS, YARN, MapReduce, Hive, Tez, HBase, Pig, Sqoop, Oozie, ZooKeeper, Falcon, Storm, Flume, Ambari, Ambari Infra, Ambari Metrics Collector, Atlas, Kafka, Ranger, KMS, Spark, Zeppelin Notebook, Kerberos.

Hadoop Administrator

Confidential, Sunnyvale, CA

Responsibilities:

  • Worked as Hadoop admin responsible for everything related to clusters totaling 100 nodes, ranging from POC (proof-of-concept) to PROD clusters.
  • Worked as admin on the Cloudera (CDH 5.5.2) distribution for clusters ranging from POC to PROD.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Day-to-day responsibilities included solving developer issues, deploying and moving code from one environment to another, providing access to new users, and providing quick solutions to reduce impact, documenting them to prevent future issues.
  • Added/installed new components and removed them through Cloudera Manager.
  • Collaborated with application teams to install operating system and Hadoop updates, patches, and version upgrades.
  • Served as Level 2/3 SME for current Big Data clusters at the client site and established standard troubleshooting techniques.
  • Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them per the recommendations.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Used Flume with a spooling-directory source to load data from the local filesystem into HDFS (a sketch follows this list).
  • Used Chef, Puppet, and related tools for configuration management.
  • Exported data from HDFS into relational databases with Sqoop.
  • Parsed, cleansed, and mined useful, meaningful data in HDFS using MapReduce for further analysis; fine-tuned Hive jobs for optimized performance.
  • Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex the data into different sinks.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase database, and Sqoop.
  • Created collections and configurations and registered a Lily HBase Indexer configuration with the Lily HBase Indexer Service.
  • Created and truncated HBase tables in Hue and took backups of submitter ID(s).
  • Configured and managed permissions for users in Hue.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Troubleshot, debugged, and fixed Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
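
A minimal sketch of the spooling-directory ingest described above, assuming hypothetical agent names, paths, and channel sizing; the agent configuration is written out and then started with flume-ng.

cat > /etc/flume-ng/conf/spool-agent.conf <<'EOF'
agent1.sources  = spool-src
agent1.channels = mem-ch
agent1.sinks    = hdfs-sink

# Watch a local directory for completed log files.
agent1.sources.spool-src.type     = spooldir
agent1.sources.spool-src.spoolDir = /var/log/app/incoming
agent1.sources.spool-src.channels = mem-ch

agent1.channels.mem-ch.type     = memory
agent1.channels.mem-ch.capacity = 10000

# Land events in date-partitioned HDFS directories as plain text.
agent1.sinks.hdfs-sink.type                   = hdfs
agent1.sinks.hdfs-sink.channel                = mem-ch
agent1.sinks.hdfs-sink.hdfs.path              = /data/logs/%Y-%m-%d
agent1.sinks.hdfs-sink.hdfs.fileType          = DataStream
agent1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
EOF

flume-ng agent --name agent1 \
  --conf /etc/flume-ng/conf \
  --conf-file /etc/flume-ng/conf/spool-agent.conf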

Environment: HDFS, MapReduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Oozie, Sqoop, CDH5, Apache Hadoop 2.6, Spark, Solr, Storm, Knox, Cloudera Manager, Red Hat, MySQL, and Oracle.

Hadoop Administrator

Confidential, Dallas, TX

Responsibilities:

  • Worked on implementing Hadoop on AWS EC2, using a few instances to gather and analyze data log files. Developed use cases and technical prototypes for implementing Pig, HDP, Hive, and HBase.
  • Worked as a lead on Big Data integration and analytics based on Hadoop, Solr, and webMethods technologies. Set up and supported Cassandra (1.2)/DataStax (3.2) for POC and prod environments using industry best practices.
  • Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Tuned the Hadoop clusters and monitored memory management and MapReduce jobs to keep MapReduce jobs healthy while pushing data from SQL to the NoSQL store.
  • Hands-on experience installing, configuring, administering, debugging, and troubleshooting Apache and DataStax Cassandra clusters.
  • Built, stood up, and delivered a Hadoop cluster in pseudo-distributed mode with the NameNode, Secondary NameNode, JobTracker, and TaskTracker running successfully, ZooKeeper installed and configured, and Apache Accumulo (a NoSQL store modeled on Google's Bigtable) stood up in a single-VM environment.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra.
  • Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs (a sketch follows this list).
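
A minimal sketch of submitting such a workflow from the command line; hostnames, ports, and the HDFS application path are placeholders, and the workflow.xml defining the Hive and Pig actions is assumed to already exist at that path.

# Point the submission at the cluster and the deployed workflow definition.
cat > job.properties <<'EOF'
nameNode=hdfs://nn.example.com:8020
jobTracker=rm.example.com:8050
queueName=default
oozie.wf.application.path=${nameNode}/user/oozie/apps/daily-etl
EOF

# Submit and start the workflow, then check its status by job ID.
oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
oozie job -oozie http://oozie.example.com:11000/oozie -info <job-id>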

Environment: Hadoop, HDFS, MapReduce, Impala, Splunk, Sqoop, HBase, Hive, Flume, Oozie, ZooKeeper, Solr, performance tuning, cluster health monitoring, security, Shell Scripting, NoSQL/HBase/Cassandra, Cloudera Manager.

Linux/Unix Administrator

Confidential

Responsibilities:

  • Responsible for monitoring the overall project and reporting status to stakeholders.
  • Developed project user guide documents that help transfer knowledge to new testers, and a solution repository document that gives quick resolutions for issues that occurred in the past, thereby reducing the number of invalid defects.
  • Identified repeated issues in production by analyzing production tickets after each release and strengthened the system testing process to prevent those issues from reaching production, enhancing customer satisfaction.
  • Designed and coordinated the creation of manual test cases according to requirements and executed them to verify the functionality of the application.
  • Responsible for preventive maintenance of the servers on a monthly basis; configured RAID for the servers; managed resources using disk quotas.
  • Responsible for change-management releases scheduled by service providers.
  • Generated weekly and monthly reports for the tickets worked on and sent them to management.
  • Established and implemented firewall rules and validated the rules with vulnerability scanning tools.
  • Proactively detected computer security violations, collected evidence, and presented results to management.
  • Implemented system and e-mail authentication using an enterprise LDAP database.
  • Implemented a database-enabled intranet web site using Linux, Apache, and a MySQL database backend.
  • Installed CentOS using PXE (Preboot Execution Environment) boot and the Kickstart method on multiple servers; monitored system metrics and logs for problems.
  • Ran cron jobs to back up data (a sample crontab follows this list); applied operating system updates, patches, and configuration changes.
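
A minimal sketch of the crontab-driven backup housekeeping described above; the schedules, paths, and retention period are placeholders.

# m h dom mon dow  command
# Nightly rsync of home directories to the backup volume.
30 1 * * * rsync -a --delete /home/ /backup/home/ >> /var/log/backup_home.log 2>&1
# Weekly archive of /etc; note that % must be escaped inside crontab entries.
0 2 * * 0 tar czf /backup/etc-$(date +\%F).tar.gz /etc
# Prune archives older than 30 days.
15 3 * * * find /backup -name '*.tar.gz' -mtime +30 -delete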

Environment: Windows 2008/2007 server, Unix Shell Scripting, SQL Manager Studio, Red Hat Linux, Microsoft SQL Server 2000/2005/2008, MS Access, NoSQL, Linux/Unix, Putty Connection Manager, Putty, SSH.
