Sr. Hadoop Administrator Resume

San Francisco, CA

PROFESSIONAL SUMMARY:

  • Around 8 years of experience in IT, including over 4 years of hands-on experience as a Hadoop Administrator.
  • Hands-on experience deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper, HBase) using Cloudera Manager and Hortonworks Ambari.
  • Hands-on experience with Big Data technologies/frameworks such as Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, NoSQL, Flume, and Oozie.
  • Proficiency with application servers such as WebSphere, WebLogic, JBoss, and Tomcat.
  • Performed administrative tasks on Hadoop clusters using Cloudera and Hortonworks distributions.
  • Hands-on experience with Hadoop clusters on Hortonworks (HDP), Cloudera (CDH3, CDH4), and Oracle Big Data distribution platforms running YARN.
  • Experience designing, configuring, and managing backup and disaster recovery for Hadoop data.
  • Experience administering Tableau and Greenplum database instances in various environments.
  • Experience administering Kafka and Flume streaming using the Cloudera distribution.
  • Good experience creating database objects such as tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
  • Hands-on experience configuring Hadoop clusters in a professional environment and on Amazon Web Services (AWS) using EC2 instances.
  • Good understanding of deploying Hadoop clusters using automated Puppet scripts.
  • Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
  • Designed and implemented security for Hadoop clusters with Kerberos authentication.
  • Hands-on experience with Nagios and Ganglia for cluster monitoring.
  • Strong experience in system administration, installation, upgrades, patches, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring, and fine-tuning on Linux (RHEL) systems.
  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.

TECHNICAL SKILLS:

Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka, Hortonworks.

Hadoop Distribution: Cloudera Distribution of Hadoop (CDH).

Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server

Servers: WebLogic Server, WebSphere, and JBoss.

Programming Languages: Java, PL/SQL, Shell Script, Perl, Python.

Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, TestNG, JUnit.

Databases: MySQL, NoSQL, Couchbase, InfluxDB, Teradata, HBase, MongoDB, Cassandra, Oracle.

Processes: Incident Management, Release Management, Change Management

Office Tools: MS Outlook, MS Word, MS Excel, MS PowerPoint.

WORK EXPERIENCE:

Sr. Hadoop Administrator

Confidential - San Francisco, CA

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Managed and scheduled jobs on Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions.
  • Expertise with NoSQL databases such as HBase, Cassandra, DynamoDB (AWS), and MongoDB.
  • Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
  • Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
  • Responsible for developing a data pipeline using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Created a POC to store server log data in Cassandra to identify system alert metrics, and implemented the Cassandra connector for Spark in Java.
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
  • Involved in migrating ETL processes from Oracle to Hive to simplify data manipulation.
  • Extracted data from Teradata into HDFS using Sqoop (a hedged command sketch follows this list).
  • Secured Hadoop clusters and CDH applications for user authentication and authorization by deploying Kerberos.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
  • Installed and configured Hadoop, MapReduce, and HDFS; developed multiple MapReduce jobs in Java for data cleaning; and upgraded Cloudera from version 5.5 to 6.0.
  • Automated repetitive tasks, deployed critical applications, and managed change on several servers using Puppet.
  • Troubleshot and rectified platform and network issues using Splunk/Wireshark.
  • Extracted files from the Cassandra database through Sqoop, placed them in HDFS, and processed them.
  • Used Cloudera Navigator for Hadoop audit files and data lineage.
  • Created tables, secondary indexes, join indexes, and views in the Teradata development environment for testing.
  • Experience with Agile, Scrum, and test-driven development methodologies, using TestNG and JUnit.
  • Worked as a Hadoop Administrator on clusters running the Hortonworks distribution.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and Cassandra and slots configuration.
  • Created an HDInsight cluster in Azure as part of the deployment and performed component unit testing using the Azure Emulator.
  • Led Big Data Hadoop/YARN operations and managed an offshore team.
  • Provided support on Kerberos-related issues and coordinated Hadoop installations, upgrades, and patch installations in the environment.
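The following is a minimal sketch of the kind of Sqoop import from Teradata referenced above. The JDBC URL, database, table, credentials, and target directory are placeholders, and it assumes the Teradata JDBC driver is already on the Sqoop classpath.

    # Hypothetical Sqoop import from Teradata into HDFS (all names are placeholders)
    sqoop import \
      --connect jdbc:teradata://td-host.example.com/DATABASE=sales \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl_user -P \
      --table DAILY_TRANSACTIONS \
      --target-dir /data/raw/teradata/daily_transactions \
      --num-mappers 4 \
      --fields-terminated-by '\t'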

Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, ETL, Azure, Ambari 2.0, Linux CentOS, Splunk, MongoDB, Teradata, Puppet, Kafka, Cassandra, Ganglia, Cloudera Manager, Agile/Scrum.

Hadoop Admin / Kafka

Confidential - Richardson, TX

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, HBase, NoSQL databases, Flume, Oozie, and Sqoop.
  • Managed mission-critical Hadoop clusters and Kafka at production scale, primarily on the Cloudera distribution.
  • Worked with Kafka on a proof of concept for carrying out log processing on a distributed system (see the topic-setup sketch after this list).
  • Involved in capacity planning, with reference to the growing data size and the existing cluster size.
  • Experience designing, implementing, and maintaining high-performing Big Data/Hadoop clusters and integrating them with existing infrastructure.
  • Used NoSQL databases including Cassandra and MongoDB; designed the table architecture and developed the DAO layer.
  • Deployed and tested the application on WebSphere Application Server.
  • Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
  • Experience in methodologies such as Agile, Scrum, and test-driven development.
  • Created principals for new users in Kerberos, implemented and maintained the Kerberos setup, and integrated it with Active Directory (AD); a principal-creation sketch follows this list.
  • Developed a data pipeline using Kafka and Storm to store data in HDFS.
  • Created event-processing data pipelines and handled messaging services using Apache Kafka.
  • Involved in migrating a Java test framework to Python Flask.
  • Performed shell scripting for Linux/Unix systems administration and related tasks; served as the point of contact for vendor escalation.
  • Monitored and analyzed MapReduce jobs, looking out for potential issues and addressing them.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Moved data from Oracle, Teradata, and MySQL into HDFS using Sqoop and imported various formats of flat files into HDFS.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Commissioned and decommissioned Hadoop cluster nodes, including load balancing of HDFS block data.
  • Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
  • Good knowledge of implementing NameNode federation and NameNode high availability using ZooKeeper and the Quorum Journal Manager.
  • Good knowledge of adding security to the cluster using Kerberos and Sentry.
  • Experience with Cloudera Hadoop upgrades and patches and installation of ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload, job performance, and capacity planning using Cloudera Manager.
  • Exported the analyzed patterns back to Teradata using Sqoop.
  • Hands-on experience setting up ACLs (Access Control Lists) to secure access to the HDFS file system.
  • Analyzed escalated incidents within the Azure SQL database.
  • Captured data logs from web servers into HDFS using Flume and Splunk for analysis.
  • Experience managing users and permissions on the cluster using different authentication methods.
  • Involved in regular Hadoop cluster maintenance such as updating system packages.
  • Experience managing and analyzing Hadoop log files to troubleshoot issues.
  • Good knowledge of NoSQL databases such as HBase and MongoDB.
  • Worked on the Hortonworks Hadoop distribution, which managed services such as HDFS and MapReduce2.
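A rough illustration of the principal-creation task mentioned above, assuming an MIT Kerberos KDC managed through kadmin; the realm, admin principal, username, and keytab path are all placeholders.

    # Create a principal and export a keytab for a new user (hypothetical realm)
    kadmin -p admin/admin@EXAMPLE.COM -q "addprinc jdoe@EXAMPLE.COM"
    kadmin -p admin/admin@EXAMPLE.COM -q "xst -k /etc/security/keytabs/jdoe.keytab jdoe@EXAMPLE.COM"

    # Verify the keytab, obtain a ticket, and confirm HDFS access
    klist -kt /etc/security/keytabs/jdoe.keytab
    kinit -kt /etc/security/keytabs/jdoe.keytab jdoe@EXAMPLE.COM
    hdfs dfs -ls /user/jdoe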
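And a sketch of the Kafka side of the log-processing proof of concept; the hosts, topic name, log path, and sizing are placeholders, and the exact flags vary by Kafka version (older releases take --zookeeper where newer ones take --bootstrap-server).

    # Create a topic for web-server logs (placeholder hosts and sizing)
    kafka-topics --create --zookeeper zk1.example.com:2181 \
      --replication-factor 3 --partitions 6 --topic weblogs

    # In one session, tail an access log into the topic...
    tail -F /var/log/httpd/access_log | kafka-console-producer \
      --broker-list kafka1.example.com:9092 --topic weblogs

    # ...and in another, read a few messages back as a smoke test
    kafka-console-consumer --bootstrap-server kafka1.example.com:9092 \
      --topic weblogs --from-beginning --max-messages 10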

Environment: Hadoop, YARN, Hive, HBase, Flume, Kafka, Oozie, Sqoop, Linux, MapReduce, HDFS, Teradata, Splunk, Java, Jenkins, GitHub, MySQL, Hortonworks, NoSQL, MongoDB, Shell Script, Python.

Hadoop Administrator

Confidential - Houston, TX

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (a minimal sketch follows this list).
  • Managed and scheduled jobs on a Hadoop cluster.
  • Knowledge of supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud.
  • Provided user and application support for the Hadoop infrastructure through the Remedy ticket management system.
  • Installed and configured Hadoop cluster in Development, Testing and Production environments.
  • Performed both major and minor upgrades to the existing CDH cluster.
  • Installed various Hadoop ecosystem components and Hadoop daemons.
  • Installed and configured Flume agents with well-defined sources, channels and sinks.
  • Configured a safety valve to create Active Directory filters to sync the LDAP directory for Hue.
  • Developed scripts to delete the empty Hive tables existing in the Hadoop file system.
  • Studied the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
  • Implemented NameNode backup using NFS for high availability.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
  • Wrote shell scripts for rolling day-to-day processes and automated them using crontab.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Implemented FIFO scheduling on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Involved in data modeling sessions to develop models for Hive tables.
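A minimal sketch of the daemon health-check idea described above; the daemon list, log path, alert address, and script location are placeholders.

    #!/bin/bash
    # Hypothetical health check: verify core Hadoop daemons are running and alert if any are down.
    DAEMONS="NameNode DataNode ResourceManager NodeManager"
    for d in $DAEMONS; do
        if ! jps | grep -qw "$d"; then
            echo "$(date '+%F %T') $d is DOWN on $(hostname)" >> /var/log/hadoop_healthcheck.log
            echo "$d down on $(hostname)" | mail -s "Hadoop daemon alert" hadoop-admins@example.com
        fi
    done

A matching crontab entry could run the check every 10 minutes:

    */10 * * * * /opt/scripts/hadoop_healthcheck.sh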

Environment: Apache Hadoop, CDH4, Hive, Hue, Pig, HBase, MapReduce, Sqoop, RedHat, CentOS, Flume, MySQL, NoSQL, MongoDB, Java.

Hadoop Administrator

Confidential - Los Angeles, CA

Responsibilities:

  • Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Installed and configured Hadoop and ecosystem components in Cloudera and Hortonworks environments; configured Hadoop, Hive, and Pig on Amazon EC2 servers.
  • Analyzed system failures, identified root causes, and recommended courses of action; documented system processes and procedures for future reference.
  • Installed and configured Hive, Pig, Sqoop, and Oozie on the HDP 2.2 cluster and implemented Sentry for the dev cluster.
  • Configured a MySQL database to store Hive metadata (a hedged setup sketch follows this list).
  • Involved in managing and reviewing Hadoop log files.
  • Involved in running Hadoop streaming jobs to process terabytes of text data.
  • Worked with Linux systems and MySQL database on a regular basis.
  • Supported MapReduce programs that ran on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Monitored cluster job performance and was involved in capacity planning.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
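A rough sketch of the metastore setup mentioned above, assuming MySQL and the Hive schematool are available on the cluster; the database name, user, and password are placeholders, and hive-site.xml must point at this database through the javax.jdo.option connection properties.

    # Create the metastore database and a Hive user (placeholder credentials)
    mysql -u root -p -e "
      CREATE DATABASE metastore;
      CREATE USER 'hive'@'%' IDENTIFIED BY 'hive_password';
      GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
      FLUSH PRIVILEGES;"

    # Initialize the Hive metastore schema against MySQL
    schematool -dbType mysql -initSchema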

Environment: HDFS, Hive, Pig, Sentry, Kerberos, LDAP, YARN, Cloudera Manager, and Ambari.

Linux/Systems Administrator

Confidential

Responsibilities:

  • Experience with Linux internals, virtual machines, and open source tools/platforms.
  • Installation, configuration, upgrading, and administration of Windows, Sun Solaris, and RedHat Linux.
  • Linux and Solaris installation, administration and maintenance.
  • Ensured data recoverability by implementing system and application level backups.
  • Performed various configurations including networking and iptables, hostname resolution, and passwordless SSH login.
  • Managed CRONTAB jobs, batch processing and job scheduling.
  • Security, users and groups administration.
  • Worked on Linux Kick-start OS integration, DDNS, DHCP, SMTP, Samba, NFS, FTP, SSH, and LDAP integration.
  • Managed disk file systems, server performance, user creation, file access permissions, and RAID configurations.
  • Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
  • Automated administration tasks through scripting and job scheduling using cron.
  • Performed performance tuning for high-transaction, high-volume data in a mission-critical environment.
  • Set up alerts and thresholds for MySQL (uptime, users, replication information, and alerts based on different queries); a monitoring sketch follows this list.
  • Estimated MySQL database capacities and developed methods for monitoring database capacity and usage.
  • Developed and optimized the physical design of MySQL database systems.
  • Supported the development and testing environments to measure performance before deploying to production.
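A minimal sketch of the uptime and replication checks described above; the monitoring user, password, and alert address are placeholders.

    #!/bin/bash
    # Hypothetical MySQL monitoring snippet: report uptime and connections,
    # then alert if replication has stopped on this host.
    mysqladmin -u monitor -p'monitor_pw' status
    mysql -u monitor -p'monitor_pw' -e "SHOW PROCESSLIST;"
    SLAVE_OK=$(mysql -u monitor -p'monitor_pw' -e "SHOW SLAVE STATUS\G" \
               | grep -c "Slave_IO_Running: Yes")
    if [ "$SLAVE_OK" -eq 0 ]; then
        echo "Replication stopped on $(hostname)" | mail -s "MySQL replication alert" dba-team@example.com
    fi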

Environment: MySQL 5.1.4, TCP/IP, LVM, PHP, Shell Script, Networking, Apache, MySQL Workbench, Toad, Linux 5.0/5.1.

Linux Administrator

Confidential

Responsibilities:

  • Installed and deployed RPM Packages.
  • Storage management using JBOD, RAID Levels 0, 1, Logical Volumes, Volume Groups and Partitioning.
  • Analyzed the performance of Linux systems to identify memory, disk I/O, and network problems.
  • Performed reorganization of disk partitions, file systems, hard disk addition, and memory upgrades.
  • Administration of RedHat 4.x and 5.x, including installation, testing, tuning, upgrading, and loading patches, and troubleshooting both physical and virtual server issues.
  • Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
  • Monitored logs and resources via scripts on Linux servers.
  • Maintained and monitored the local area network, hardware support, and virtualization on RHEL servers (through Xen and KVM).
  • Administration of VMware virtual Linux servers and resizing of LVM disk volumes as required (see the resize sketch after this list).
  • Responded to Linux system problems 24x7 as part of an on-call rotation and resolved them on a timely basis.
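A sketch of the LVM resize task referenced above, assuming a new virtual disk has been presented to the VM and the filesystem is ext3/ext4; the device, volume group, and logical volume names are placeholders.

    # Grow a logical volume on a VMware Linux guest after adding a virtual disk
    pvcreate /dev/sdb                      # initialize the new disk for LVM
    vgextend vg_data /dev/sdb              # add it to the existing volume group
    lvextend -L +20G /dev/vg_data/lv_data  # grow the logical volume by 20 GB
    resize2fs /dev/vg_data/lv_data         # grow the ext3/ext4 filesystem online
    df -h /data                            # confirm the new size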

Environment: Linux, TCP/IP, LVM, RAID, Networking, Security, user management.
