Sr. Hadoop Administrator Resume
San Francisco, CA
PROFESSIONAL SUMMARY:
- Around 8 years of experience in IT, with over 4 years of hands-on experience as a Hadoop Administrator.
- Hands-on experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper, HBase) using Cloudera Manager and Hortonworks Ambari.
- Hands-on experience in Big Data technologies/frameworks such as Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, NoSQL, Flume, and Oozie.
- Proficiency with application servers such as WebSphere, WebLogic, JBoss, and Tomcat.
- Performed administrative tasks on Hadoop Clusters using Cloudera/Hortonworks.
- Hands-on experience with Hadoop clusters on Hortonworks (HDP), Cloudera (CDH3, CDH4), and Oracle Big Data distribution platforms running YARN.
- Experience in designing, configuring, and managing backup and disaster recovery for Hadoop data (a sketch of a typical routine follows this list).
- Experience in administering Tableau and Greenplum database instances in various environments.
- Experience in administration of Kafka and Flume streaming using the Cloudera distribution.
- Good experience in creating various database objects like tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
- Hands-on experience configuring Hadoop clusters in professional environments and on Amazon Web Services (AWS) using EC2 instances.
- Good understanding of deploying Hadoop clusters using automated Puppet scripts.
- Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
- Designed and implemented security for Hadoop clusters with Kerberos authentication.
- Hands-on experience with Nagios and Ganglia for cluster monitoring.
- Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring and Fine-tuning on Linux (RHEL) systems.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
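The backup and disaster-recovery routine referenced above can be illustrated with a minimal shell sketch, assuming snapshots are enabled on a /data directory and a second cluster is reachable as backup-nn (both names are hypothetical placeholders):

    #!/bin/bash
    # Enable snapshots on the source directory (one-time setup).
    hdfs dfsadmin -allowSnapshot /data
    # Take a dated snapshot so the copy comes from a consistent view of the data.
    SNAP="backup-$(date +%Y%m%d)"
    hdfs dfs -createSnapshot /data "$SNAP"
    # Copy the snapshot to the DR cluster; -update only transfers changed files.
    hadoop distcp -update "/data/.snapshot/$SNAP" hdfs://backup-nn:8020/data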
TECHNICAL SKILLS:
Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka.
Hadoop Distributions: Cloudera Distribution of Hadoop (CDH), Hortonworks Data Platform (HDP).
Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server
Servers: WebLogic, WebSphere, and JBoss.
Programming Languages: Java, PL/SQL, Shell Script, Perl, Python.
Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, TestNG, JUnit.
Databases: MySQL, NoSQL, Couchbase, InfluxDB, Teradata, HBase, MongoDB, Cassandra, Oracle.
Processes: Incident Management, Release Management, Change Management
Office Tools: MS Outlook, MS Word, MS Excel, MS PowerPoint.
WORK EXPERIENCE:
Sr. Hadoop Administrator
Confidential - San Francisco, CA
Responsibilities:
- Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Managed and scheduled jobs on Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions.
- Expertise with NoSQL databases such as HBase, Cassandra, DynamoDB (AWS), and MongoDB.
- Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
- Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
- Responsible for developing data pipelines using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Created a POC to store server log data in Cassandra to identify system alert metrics and implemented a Cassandra connector for Spark in Java.
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Involved in migrating ETL processes from Oracle to Hive to enable easier data manipulation.
- Extracted data from Teradata into HDFS using Sqoop (a sample import command follows this list).
- Secured Hadoop clusters and CDH applications for user authentication and authorization by deploying Kerberos.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Installed and configured Hadoop, MapReduce, and HDFS; developed multiple MapReduce jobs in Java for data cleaning; and upgraded Cloudera from version 5.5 to 6.0.
- Automated repetitive tasks, deployed critical applications, and managed change on several servers using Puppet.
- Troubleshot and rectified platform and network issues using Splunk / Wireshark.
- Extracted files from the Cassandra database through Sqoop, placed them in HDFS, and processed them.
- Used Cloudera Navigator for Hadoop audit files and data lineage.
- Created tables, secondary indexes, join indexes and views in Teradata development Environment for testing.
- Experience in methodologies such as Agile, Scrum, and test-driven development with TestNG and JUnit.
- Worked as a Hadoop Administrator on clusters running the Hortonworks distribution.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and Cassandra and slot configuration.
- Created an HDInsight cluster in Azure (a Microsoft-specific tool) as part of the deployment and performed component unit testing using the Azure emulator.
- Led Big Data Hadoop/YARN operations and managed an offshore team.
- Provided support on Kerberos-related issues and coordinated Hadoop installations, upgrades, and patch installations in the environment.
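A minimal sketch of the Teradata-to-HDFS Sqoop import referenced above, assuming the Teradata JDBC driver is on the Sqoop classpath; the host, database, table, and user names are hypothetical placeholders:

    sqoop import \
      --connect jdbc:teradata://td-host/DATABASE=sales_db \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl_user -P \
      --table daily_orders \
      --target-dir /data/raw/daily_orders \
      --num-mappers 4 \
      --fields-terminated-by '\t'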
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, ETL, Azure, Ambari 2.0, Linux CentOS, Splunk, MongoDB, Teradata, Puppet, Kafka, Cassandra, Ganglia, Cloudera Manager, Agile/Scrum.
Hadoop Admin / Kafka
Confidential - Richardson, TX
Responsibilities:
- Worked on analysing the Hadoop cluster and different big data analytic tools, including Pig, HBase, NoSQL databases, Flume, Oozie, and Sqoop.
- Managed mission-critical Hadoop clusters and Kafka at production scale, primarily on the Cloudera distribution.
- Worked with Kafka for the proof of concept for carrying out log processing on a distributed system.
- Involved in capacity planning, with reference to the growing data size and the existing cluster size.
- Experience in designing, implementing, and maintaining high-performing Big Data Hadoop clusters and integrating them with existing infrastructure.
- Used NoSQL databases such as Cassandra and MongoDB; designed the table architecture and developed the DAO layer.
- Deployed the application and tested it on WebSphere Application Server.
- Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
- Experience in methodologies such as Agile, Scrum, and test-driven development.
- Created principals for new users in Kerberos; implemented and maintained the Kerberos cluster and integrated it with Active Directory (AD).
- Developed a data pipeline using Kafka and Storm to store data in HDFS.
- Created event-processing data pipelines and handled messaging services using Apache Kafka (a topic-administration sketch follows this list).
- Involved in migrating a Java test framework to Python Flask.
- Performed shell scripting for Linux/UNIX systems administration and related tasks; served as point of contact for vendor escalation.
- Monitored and analysed MapReduce jobs, looking out for potential issues and addressing them.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Moved data from Oracle, Teradata, and MySQL into HDFS using Sqoop and imported various formats of flat files into HDFS.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Commissioned and decommissioned Hadoop cluster nodes, including load balancing of HDFS block data.
- Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
- Good knowledge of implementing NameNode federation and high availability of the NameNode and Hadoop cluster using ZooKeeper and the Quorum Journal Manager.
- Good knowledge of adding security to the cluster using Kerberos and Sentry.
- Experience in Cloudera Hadoop upgrades and patches and installation of ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
- Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Cloudera Manager.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Hands-on experience in setting up ACLs (Access Control Lists) to secure access to the HDFS file system.
- Analyzed escalated incidents within the Azure SQL database.
- Captured data logs from web servers into HDFS using Flume and Splunk for analysis.
- Experience managing users and permissions on the cluster, using different authentication methods.
- Involved in regular Hadoop Cluster maintenance such as updating system packages.
- Experience in managing and analysing Hadoop log files to troubleshoot issues.
- Good knowledge of NoSQL databases such as HBase and MongoDB.
- Worked on the Hortonworks Hadoop distribution, which managed services such as HDFS and MapReduce2.
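A minimal sketch of the kind of Kafka topic administration behind the event-processing pipelines above, assuming a broker reachable at broker1:9092; the topic name, partition count, and replication factor are illustrative, and older Kafka releases use a --zookeeper flag instead of --bootstrap-server:

    # Create a topic for the event-processing pipeline.
    kafka-topics.sh --bootstrap-server broker1:9092 \
      --create --topic weblogs --partitions 6 --replication-factor 3
    # Confirm partition layout and in-sync replicas.
    kafka-topics.sh --bootstrap-server broker1:9092 --describe --topic weblogs
    # Spot-check that events are flowing.
    kafka-console-consumer.sh --bootstrap-server broker1:9092 \
      --topic weblogs --from-beginning --max-messages 10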
Environment: Hadoop, YARN, Hive, HBase, Flume, Kafka, Oozie, Sqoop, Linux, MapReduce, HDFS, Teradata, Splunk, Java, Jenkins, GitHub, MySQL, Hortonworks, NoSQL, MongoDB, Shell Script, Python.
Hadoop Administrator
Confidential - Houston, TX
Responsibilities:
- Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (a sketch follows this list).
- Managed and scheduled jobs on the Hadoop cluster.
- Knowledge of supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud.
- Worked on providing user support and application support through the Remedy ticket management system on the Hadoop infrastructure.
- Installed and configured Hadoop cluster in Development, Testing and Production environments.
- Performed both major and minor upgrades to the existing CDH cluster.
- Installed various Hadoop ecosystem components and Hadoop daemons.
- Installed and configured Flume agents with well-defined sources, channels and sinks.
- Configured safety valve to create active directory filters to sync the LDAP directory for Hue.
- Developed scripts to delete empty Hive tables existing in the Hadoop file system.
- Reviewed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
- Implemented NameNode backup using NFS for high availability.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Wrote shell scripts for rolling day-to-day processes and automated them using crontab.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Implemented FIFO scheduling on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Involved in data modelling sessions to develop models for Hive tables.
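A minimal sketch of the daemon health-check script and crontab entry described above; the daemon list, script path, and alert address are hypothetical placeholders:

    #!/bin/bash
    # check_hadoop_daemons.sh - warn if an expected Hadoop daemon is not running.
    ALERT_TO="hadoop-ops@example.com"    # placeholder address
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        if ! jps | grep -qw "$daemon"; then
            echo "$(hostname): $daemon is not running at $(date)" \
                | mail -s "Hadoop daemon alert: $daemon" "$ALERT_TO"
        fi
    done
    # Crontab entry to run the check every 10 minutes:
    # */10 * * * * /opt/scripts/check_hadoop_daemons.sh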
Environment: Apache Hadoop, CDH4, Hive, Hue, Pig, HBase, MapReduce, Sqoop, RedHat, CentOS, Flume, MySQL, NoSQL, MongoDB, Java.
Hadoop Administrator
Confidential - Los Angeles, CA
Responsibilities:
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Installed and configured Hadoop and ecosystem components in Cloudera and Hortonworks environments; configured Hadoop, Hive, and Pig on Amazon EC2 servers.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action; documented system processes and procedures for future reference.
- Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.2 cluster and Implemented Sentry for the Dev Cluster.
- Configured MySQL Database to store Hive metadata.
- Involved in managing and reviewing Hadoop log files.
- Involved in running Hadoop streaming jobs to process terabytes of text data (a sample job submission follows this list).
- Worked with Linux systems and MySQL database on a regular basis.
- Supported MapReduce programs that ran on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Monitored cluster job performance and was involved in capacity planning.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
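A minimal sketch of the kind of Hadoop streaming job submission referenced above; the streaming jar path varies by distribution, and the mapper/reducer scripts and HDFS paths are hypothetical placeholders:

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
      -files mapper.py,reducer.py \
      -mapper mapper.py \
      -reducer reducer.py \
      -input /data/raw/weblogs \
      -output /data/processed/weblogs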
Environment: HDFS, Hive, Pig, Sentry, Kerberos, LDAP, YARN, Cloudera Manager, and Ambari.
Linux/Systems Administrator
Confidential
Responsibilities:
- Experience with Linux internals, virtual machines, and open source tools/platforms.
- Installation, configuration, upgrade, and administration of Windows, Sun Solaris, and RedHat Linux.
- Linux and Solaris installation, administration and maintenance.
- Ensured data recoverability by implementing system and application level backups.
- Performed various configurations including networking and iptables, resolving hostnames, and SSH key-based (passwordless) login.
- Managed crontab jobs, batch processing, and job scheduling.
- Performed security, user, and group administration.
- Worked on Linux Kick-start OS integration, DDNS, DHCP, SMTP, Samba, NFS, FTP, SSH, and LDAP integration.
- Managed disk file systems, server performance, user creation, granting of file access permissions, and RAID configurations.
- Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
- Automated administration tasks through scripting and job scheduling using cron.
- Performed performance tuning for high-transaction, high-volume data in a mission-critical environment.
- Set up alerts and thresholds for MySQL (uptime, users, replication information, and alerts based on different queries); a monitoring sketch follows this list.
- Estimated MySQL database capacities and developed methods for monitoring database capacity and usage.
- Developed and optimized the physical design of MySQL database systems.
- Provided support in development and testing environments to measure performance before deploying to production.
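A minimal sketch of the kind of MySQL replication/uptime check referenced above, assuming credentials are supplied via ~/.my.cnf; the alert address, lag threshold, and script path are hypothetical placeholders:

    #!/bin/bash
    # mysql_replication_check.sh - alert if the replica is stopped or lagging.
    ALERT_TO="dba-team@example.com"    # placeholder address
    LAG_LIMIT=300                      # seconds; illustrative threshold
    STATUS=$(mysql -e "SHOW SLAVE STATUS\G")
    RUNNING=$(echo "$STATUS" | awk '$1 == "Slave_SQL_Running:" {print $2}')
    LAG=$(echo "$STATUS" | awk '$1 == "Seconds_Behind_Master:" {print $2}')
    if [ "$RUNNING" != "Yes" ] || [ "$LAG" = "NULL" ] || [ "$LAG" -gt "$LAG_LIMIT" ]; then
        echo "Replication problem on $(hostname): running=$RUNNING lag=$LAG" \
            | mail -s "MySQL replication alert" "$ALERT_TO"
    fi
    # Cron entry (every 5 minutes):
    # */5 * * * * /opt/scripts/mysql_replication_check.sh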
Environment: MySQL 5.1.4, TCP/IP, LVM, PHP, Shell Script, Networking, Apache, MySQL Workbench, Toad, Linux 5.0/5.1.
Linux Administrator
Confidential
Responsibilities:
- Installed and deployed RPM Packages.
- Storage management using JBOD, RAID Levels 0, 1, Logical Volumes, Volume Groups and Partitioning.
- Analyzed the performance of the Linux systems to identify memory, disk I/O, and network problems.
- Performed reorganization of disk partitions, file systems, hard disk additions, and memory upgrades.
- Administration of RedHat 4.x and 5.x, including installation, testing, tuning, upgrading, loading patches, and troubleshooting both physical and virtual server issues.
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
- Monitored logs and resources via scripts on Linux servers.
- Maintained and monitored the local area network and provided hardware support and virtualization on RHEL servers (through Xen and KVM).
- Administration of VMware virtual Linux servers and resizing of LVM disk volumes as required (an LVM sketch follows this list).
- Responded to all Linux systems problems 24x7 as part of an on-call rotation and resolved them in a timely manner.
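A minimal sketch of the kind of online LVM resize referenced above, assuming the volume group has free extents and an ext3/ext4 filesystem; the volume group, logical volume, and mount point names are hypothetical placeholders:

    # Check free extents in the volume group.
    vgdisplay vg_data | grep -i free
    # Grow the logical volume by 10 GB.
    lvextend -L +10G /dev/vg_data/lv_app
    # Grow the filesystem to fill the new space (ext3/ext4 supports online resize).
    resize2fs /dev/vg_data/lv_app
    # Confirm the new size at the mount point.
    df -h /app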
Environment: Linux, TCP/IP, LVM, RAID, Networking, Security, user management.