Sr. Hadoop Administrator Resume
San Francisco, CA
PROFESSIONAL SUMMARY:
- Around 8 years of experience in IT, with over 4 years of hands-on experience as a Hadoop Administrator.
- Hands-on experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper, HBase) using Cloudera Manager and Hortonworks Ambari.
- Hands-on experience in Big Data technologies/frameworks such as Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, NoSQL, Flume, and Oozie.
- Proficiency with application servers such as WebSphere, WebLogic, JBoss, and Tomcat.
- Performed administrative tasks on Hadoop Clusters using Cloudera/Hortonworks.
- Hands-on experience with Hadoop clusters on Hortonworks (HDP), Cloudera (CDH3, CDH4), and Oracle Big Data distribution platforms running YARN.
- Experience in designing, configuring, and managing backup and disaster recovery for Hadoop data (a sketch of a typical routine follows this list).
- Experience in administering Tableau and Greenplum database instances in various environments.
- Experience in administration of Kafka and Flume streaming using the Cloudera distribution.
- Good experience in creating various database objects like tables, stored procedures, functions, and triggers using SQL, PL/SQL, and DB2.
- Hands-on experience configuring Hadoop clusters in professional environments and on Amazon Web Services (AWS) using EC2 instances.
- Good understanding of deploying Hadoop clusters using automated Puppet scripts.
- Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
- Designed and implemented security for Hadoop clusters with Kerberos authentication.
- Hands-on experience with Nagios and Ganglia for cluster monitoring.
- Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring and Fine-tuning on Linux (RHEL) systems.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
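The backup and disaster-recovery routine referenced above can be illustrated with a minimal shell sketch, assuming snapshots are enabled on a /data directory and a second cluster is reachable as backup-nn (both names are hypothetical placeholders):

    #!/bin/bash
    # Enable snapshots on the source directory (one-time setup).
    hdfs dfsadmin -allowSnapshot /data
    # Take a dated snapshot so the copy comes from a consistent view of the data.
    SNAP="backup-$(date +%Y%m%d)"
    hdfs dfs -createSnapshot /data "$SNAP"
    # Copy the snapshot to the DR cluster; -update only transfers changed files.
    hadoop distcp -update "/data/.snapshot/$SNAP" hdfs://backup-nn:8020/data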
TECHNICAL SKILLS:
Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie, Kafka.
Hadoop Distributions: Cloudera Distribution of Hadoop (CDH), Hortonworks Data Platform (HDP).
Operating Systems: UNIX, Linux, Windows XP, Windows Vista, Windows 2003 Server
Servers: WebLogic, WebSphere, and JBoss.
Programming Languages: Java, PL/SQL, Shell Script, Perl, Python.
Tools: Interwoven TeamSite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio, Jenkins, GitHub, TestNG, JUnit.
Databases: MySQL, NoSQL, Couchbase, InfluxDB, Teradata, HBase, MongoDB, Cassandra, Oracle.
Processes: Incident Management, Release Management, Change Management
Office Tools: MS Outlook, MS Word, MS Excel, MS PowerPoint.
WORK EXPERIENCE:
Sr. Hadoop Administrator
Confidential - San Francisco, CA
Responsibilities:
- Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Managed and scheduled jobs on Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions.
- Expertise with NoSQL databases such as HBase, Cassandra, DynamoDB (AWS), and MongoDB.
- Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
- Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
- Responsible for developing data pipelines using HDInsight, Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Created a POC to store server log data in Cassandra to identify system alert metrics and implemented a Cassandra connector for Spark in Java.
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Involved in migrating ETL processes from Oracle to Hive to enable easier data manipulation.
- Extracted data from Teradata into HDFS using Sqoop (a sample import command follows this list).
- Secured Hadoop clusters and CDH applications for user authentication and authorization by deploying Kerberos.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Installed and configured Hadoop, MapReduce, and HDFS; developed multiple MapReduce jobs in Java for data cleaning; and upgraded Cloudera from version 5.5 to 6.0.
- Automated repetitive tasks, deployed critical applications, and managed change on several servers using Puppet.
- Troubleshot and rectified platform and network issues using Splunk / Wireshark.
- Extracted files from the Cassandra database through Sqoop, placed them in HDFS, and processed them.
- Used Cloudera Navigator for Hadoop audit files and data lineage.
- Created tables, secondary indexes, join indexes and views in Teradata development Environment for testing.
- Experience in methodologies such as Agile, Scrum, and test-driven development with TestNG and JUnit.
- Worked as a Hadoop Administrator on clusters running the Hortonworks distribution.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and Cassandra and slot configuration.
- Created an HDInsight cluster in Azure (a Microsoft-specific tool) as part of the deployment and performed component unit testing using the Azure emulator.
- Led Big Data Hadoop/YARN operations and managed an offshore team.
- Provided support on Kerberos-related issues and coordinated Hadoop installations, upgrades, and patch installations in the environment.
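A minimal sketch of the Teradata-to-HDFS Sqoop import referenced above, assuming the Teradata JDBC driver is on the Sqoop classpath; the host, database, table, and user names are hypothetical placeholders:

    sqoop import \
      --connect jdbc:teradata://td-host/DATABASE=sales_db \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl_user -P \
      --table daily_orders \
      --target-dir /data/raw/daily_orders \
      --num-mappers 4 \
      --fields-terminated-by '\t'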
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, ETL, Azure, Ambari 2.0, Linux CentOS, Splunk, MongoDB, Teradata, Puppet, Kafka, Cassandra, Ganglia, Cloudera Manager, Agile/Scrum.
Hadoop Admin / Kafka
Confidential - Richardson, TX
Responsibilities:
- Worked on analysing the Hadoop cluster and different big data analytic tools, including Pig, HBase, NoSQL databases, Flume, Oozie, and Sqoop.
- Managed mission-critical Hadoop clusters and Kafka at production scale, primarily on the Cloudera distribution.
- Worked with Kafka for the proof of concept for carrying out log processing on a distributed system.
- Involved in capacity planning, with reference to the growing data size and the existing cluster size.
- Experience in designing, implementing, and maintaining high-performing Big Data Hadoop clusters and integrating them with existing infrastructure.
- Used NoSQL databases such as Cassandra and MongoDB; designed the table architecture and developed the DAO layer.
- Deployed the application and tested it on WebSphere Application Server.
- Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
- Experience in methodologies such as Agile, Scrum, and test-driven development.
- Created principals for new users in Kerberos; implemented and maintained the Kerberos cluster and integrated it with Active Directory (AD).
- Developed a data pipeline using Kafka and Storm to store data in HDFS.
- Created event-processing data pipelines and handled messaging services using Apache Kafka (a topic-administration sketch follows this list).
- Involved in migrating a Java test framework to Python Flask.
- Performed shell scripting for Linux/UNIX systems administration and related tasks; served as point of contact for vendor escalation.
- Monitored and analysed MapReduce jobs, looking out for potential issues and addressing them.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Moved data from Oracle, Teradata, and MySQL into HDFS using Sqoop and imported various formats of flat files into HDFS.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Commissioned and decommissioned Hadoop cluster nodes, including load balancing of HDFS block data.
- Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
- Good knowledge of implementing NameNode federation and high availability of the NameNode and Hadoop cluster using ZooKeeper and the Quorum Journal Manager.
- Good knowledge of adding security to the cluster using Kerberos and Sentry.
- Experience in Cloudera Hadoop upgrades and patches and installation of ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
- Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Cloudera Manager.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Hands-on experience in setting up ACLs (Access Control Lists) to secure access to the HDFS file system.
- Analyzed escalated incidents within the Azure SQL database.
- Captured data logs from web servers into HDFS using Flume and Splunk for analysis.
- Experience managing users and permissions on the cluster, using different authentication methods.
- Involved in regular Hadoop Cluster maintenance such as updating system packages.
- Experience in managing and analysing Hadoop log files to troubleshoot issues.
- Good knowledge of NoSQL databases such as HBase and MongoDB.
- Worked on the Hortonworks Hadoop distribution, which managed services such as HDFS and MapReduce2.
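A minimal sketch of the kind of Kafka topic administration behind the event-processing pipelines above, assuming a broker reachable at broker1:9092; the topic name, partition count, and replication factor are illustrative, and older Kafka releases use a --zookeeper flag instead of --bootstrap-server:

    # Create a topic for the event-processing pipeline.
    kafka-topics.sh --bootstrap-server broker1:9092 \
      --create --topic weblogs --partitions 6 --replication-factor 3
    # Confirm partition layout and in-sync replicas.
    kafka-topics.sh --bootstrap-server broker1:9092 --describe --topic weblogs
    # Spot-check that events are flowing.
    kafka-console-consumer.sh --bootstrap-server broker1:9092 \
      --topic weblogs --from-beginning --max-messages 10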
Environment: Hadoop, YARN, Hive, HBase, Flume, Kafka, Oozie, Sqoop, Linux, MapReduce, HDFS, Teradata, Splunk, Java, Jenkins, GitHub, MySQL, Hortonworks, NoSQL, MongoDB, Shell Script, Python.
Hadoop Administrator
Confidential - Houston, TX
Responsibilities:
- Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (a sketch follows this list).
- Managed and scheduled jobs on the Hadoop cluster.
- Knowledge of supporting data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud.
- Worked on providing user support and application support through the Remedy ticket management system on the Hadoop infrastructure.
- Installed and configured Hadoop cluster in Development, Testing and Production environments.
- Performed both major and minor upgrades to the existing CDH cluster.
- Installed various Hadoop ecosystem components and Hadoop daemons.
- Installed and configured Flume agents with well-defined sources, channels and sinks.
- Configured safety valve to create active directory filters to sync the LDAP directory for Hue.
- Developed scripts to delete empty Hive tables existing in the Hadoop file system.
- Reviewed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop, and Pig Latin.
- Implemented NameNode backup using NFS for high availability.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Wrote shell scripts for rolling day-to-day processes and automated them using crontab.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Implemented FIFO scheduling on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Involved in data modelling sessions to develop models for Hive tables.
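A minimal sketch of the daemon health-check script and crontab entry described above; the daemon list, script path, and alert address are hypothetical placeholders:

    #!/bin/bash
    # check_hadoop_daemons.sh - warn if an expected Hadoop daemon is not running.
    ALERT_TO="hadoop-ops@example.com"    # placeholder address
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        if ! jps | grep -qw "$daemon"; then
            echo "$(hostname): $daemon is not running at $(date)" \
                | mail -s "Hadoop daemon alert: $daemon" "$ALERT_TO"
        fi
    done
    # Crontab entry to run the check every 10 minutes:
    # */10 * * * * /opt/scripts/check_hadoop_daemons.sh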
Environment: Apache Hadoop, CDH4, Hive, Hue, Pig, HBase, MapReduce, Sqoop, RedHat, CentOS, Flume, MySQL, NoSQL, MongoDB, Java.
Hadoop Administrator
Confidential - Los Angeles, CA
Responsibilities:
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Installed and configured Hadoop and ecosystem components in Cloudera and Hortonworks environments; configured Hadoop, Hive, and Pig on Amazon EC2 servers.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action; documented system processes and procedures for future reference.
- Installed and configured Hive, Pig, Sqoop and Oozie on the HDP 2.2 cluster and Implemented Sentry for the Dev Cluster.
- Configured MySQL Database to store Hive metadata.
- Involved in managing and reviewing Hadoop log files.
- Involved in running Hadoop streaming jobs to process terabytes of text data (a sample job submission follows this list).
- Worked with Linux systems and MySQL database on a regular basis.
- Supported MapReduce programs that ran on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Monitored cluster job performance and was involved in capacity planning.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
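A minimal sketch of the kind of Hadoop streaming job submission referenced above; the streaming jar path varies by distribution, and the mapper/reducer scripts and HDFS paths are hypothetical placeholders:

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
      -files mapper.py,reducer.py \
      -mapper mapper.py \
      -reducer reducer.py \
      -input /data/raw/weblogs \
      -output /data/processed/weblogs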
Environment: HDFS, Hive, Pig, Sentry, Kerberos, LDAP, YARN, Cloudera Manager, and Ambari.
Linux/Systems Administrator
Confidential
Responsibilities:
- Experience with Linux internals, virtual machines, and open source tools/platforms.
- Installation, configuration, upgrade, and administration of Windows, Sun Solaris, and RedHat Linux.
- Linux and Solaris installation, administration and maintenance.
- Ensured data recoverability by implementing system and application level backups.
- Performed various configurations including networking and iptables, resolving hostnames, and SSH key-based (passwordless) login.
- Managed crontab jobs, batch processing, and job scheduling.
- Performed security, user, and group administration.
- Worked on Linux Kick-start OS integration, DDNS, DHCP, SMTP, Samba, NFS, FTP, SSH, and LDAP integration.
- Managed disk file systems, server performance, user creation, granting of file access permissions, and RAID configurations.
- Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
- Automated administration tasks through scripting and job scheduling using cron.
- Performed performance tuning for high-transaction, high-volume data in a mission-critical environment.
- Set up alerts and thresholds for MySQL (uptime, users, replication information, and alerts based on different queries); a monitoring sketch follows this list.
- Estimated MySQL database capacities and developed methods for monitoring database capacity and usage.
- Developed and optimized the physical design of MySQL database systems.
- Provided support in development and testing environments to measure performance before deploying to production.
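A minimal sketch of the kind of MySQL replication/uptime check referenced above, assuming credentials are supplied via ~/.my.cnf; the alert address, lag threshold, and script path are hypothetical placeholders:

    #!/bin/bash
    # mysql_replication_check.sh - alert if the replica is stopped or lagging.
    ALERT_TO="dba-team@example.com"    # placeholder address
    LAG_LIMIT=300                      # seconds; illustrative threshold
    STATUS=$(mysql -e "SHOW SLAVE STATUS\G")
    RUNNING=$(echo "$STATUS" | awk '$1 == "Slave_SQL_Running:" {print $2}')
    LAG=$(echo "$STATUS" | awk '$1 == "Seconds_Behind_Master:" {print $2}')
    if [ "$RUNNING" != "Yes" ] || [ "$LAG" = "NULL" ] || [ "$LAG" -gt "$LAG_LIMIT" ]; then
        echo "Replication problem on $(hostname): running=$RUNNING lag=$LAG" \
            | mail -s "MySQL replication alert" "$ALERT_TO"
    fi
    # Cron entry (every 5 minutes):
    # */5 * * * * /opt/scripts/mysql_replication_check.sh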
Environment: MySQL 5.1.4, TCP/IP, LVM, PHP, Shell Script, Networking, Apache, MySQL Workbench, Toad, Linux 5.0/5.1.
Linux Administrator
Confidential
Responsibilities:
- Installed and deployed RPM Packages.
- Storage management using JBOD, RAID Levels 0, 1, Logical Volumes, Volume Groups and Partitioning.
- Analyzed the performance of the Linux systems to identify memory, disk I/O, and network problems.
- Performed reorganization of disk partitions, file systems, hard disk additions, and memory upgrades.
- Administration of RedHat 4.x and 5.x, including installation, testing, tuning, upgrading, loading patches, and troubleshooting both physical and virtual server issues.
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
- Monitored logs and resources via scripts on Linux servers.
- Maintained and monitored the local area network and provided hardware support and virtualization on RHEL servers (through Xen and KVM).
- Administration of VMware virtual Linux servers and resizing of LVM disk volumes as required (an LVM sketch follows this list).
- Responded to all Linux systems problems 24x7 as part of an on-call rotation and resolved them in a timely manner.
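A minimal sketch of the kind of online LVM resize referenced above, assuming the volume group has free extents and an ext3/ext4 filesystem; the volume group, logical volume, and mount point names are hypothetical placeholders:

    # Check free extents in the volume group.
    vgdisplay vg_data | grep -i free
    # Grow the logical volume by 10 GB.
    lvextend -L +10G /dev/vg_data/lv_app
    # Grow the filesystem to fill the new space (ext3/ext4 supports online resize).
    resize2fs /dev/vg_data/lv_app
    # Confirm the new size at the mount point.
    df -h /app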
Environment: Linux, TCP/IP, LVM, RAID, Networking, Security, user management.