Hadoop Administrator Resume
Mountain View, CA
PROFESSIONAL SUMMARY:
- Around 11 years of experience in the IT industry, including 4+ years of experience in all phases of Hadoop ecosystem components and Big Data technologies.
- Experienced in configuring, installing, upgrading and managing Cloudera, Hortonworks and MapR Hadoop distributions.
- Hands-on experience with Big Data and Hadoop ecosystem components (HDFS, MapReduce, YARN, Hive, Hue, Sqoop, Flume, Spark, Oozie, HBase and Pig).
- Experienced in implementing Big Data projects using Cloudera Distribution.
- Experienced in managing and reviewing Hadoop Log Files.
- Experience handling Cloudera's Hadoop platform (CDH 4.x and CDH 5.x) for Development, Support and Maintenance.
- Hands-on experience configuring Hadoop clusters in a professional environment and on Amazon Web Services (AWS) using EC2 instances.
- Hands-on experience with monitoring tools such as Nagios and Ganglia.
- Hands-on experience in installing, configuring and managing Hue and HCatalog.
- Experienced in working with MapReduce programs and Hive commands to deliver the best results.
- Experienced in implementing High Availability using QJM and NFS to avoid a single point of failure.
- Performed decommissioning, commissioning, balancing and managing of nodes, and tuned servers for optimal performance on a running cluster.
- Experienced in writing Oozie workflows and job controllers for job automation.
- Working knowledge of Sqoop and Flume for data processing.
- Strong monitoring, troubleshooting and performance tuning skills.
- Excellent knowledge of NoSQL databases like HBase and Cassandra.
- Experienced in loading data from different data sources (Teradata and DB2) into HDFS using Sqoop and loading it into partitioned Hive tables, as sketched after this summary.
- Strong overall experience in system administration: installation, upgrades, patches, migration, configuration, troubleshooting, security, backup, disaster recovery, performance monitoring and fine-tuning on Linux (RHEL) systems.
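A minimal sketch of the Sqoop load described above; the Teradata host, credentials, table and partition value are hypothetical placeholders, not values from any actual engagement:

```bash
# Illustrative only: host, credentials, table and partition value are hypothetical.
sqoop import \
  --connect jdbc:teradata://td.example.com/DATABASE=sales \
  --driver com.teradata.jdbc.TeraDriver \
  --username etl_user -P \
  --table ORDERS \
  --hive-import \
  --hive-table analytics.orders \
  --hive-partition-key load_date \
  --hive-partition-value 2016-01-31 \
  --num-mappers 4 \
  --target-dir /staging/orders
```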
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: HDFS, MapReduce, YARN, Pig, Hive, HBase, Oozie, Sqoop, Spark, Cassandra, Solr, Hue, Kafka, HCatalog, AWS, Data Modeling, MongoDB, Flume and ZooKeeper.
Languages and technologies: Java, SQL, NoSQL, Phoenix
Operating Systems: Linux, UNIX, Windows, macOS.
Databases: MySQL, Oracle, Teradata, Greenplum, PostgreSQL, DB2.
Scripting: Shell Scripting, Perl Scripting, Python
NOSQL Databases: HBase, Cassandra, MongoDB
Office Tools: MS Word, MS Excel, MS PowerPoint, MS Project
Web/Application Server: Apache 2.4, Tomcat, WebSphere, WebLogic.
PROFESSIONAL EXPERIENCE:
Confidential, Mountain View, CA
Hadoop Administrator
Responsibilities:
- Commissioning and decommissioning Hadoop nodes and data re-balancing
- Installed and configured Hadoop, Hive, ZooKeeper, Spark and Oozie for a POC
- Upgraded Cloudera Manager and CDH from 5.7.1 to CDH 5.11.2 and worked with Cloudera to set up SSO login.
- Worked on data migration in Hive.
- Worked on importing and exporting data into HDFS and Hive using Sqoop.
- Involved in onboarding Hadoop users and helping them resolve access issues; automated user access and Hive database creation (see the sketch after this list).
- Integrated Splunk with Cloudera to forward cluster logs to Splunk.
- Involved in deploying datasets from different pipelines to Hadoop.
- Worked on performance tuning of the Hadoop cluster.
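A minimal sketch of the onboarding automation mentioned above; the user name argument, HDFS layout and HiveServer2 URL are hypothetical:

```bash
#!/usr/bin/env bash
# Onboard a new Hadoop user: HDFS home directory plus a per-user Hive database.
USER_NAME="$1"
BEELINE_URL="jdbc:hive2://hs2.example.com:10000/default"

# Create and own the user's HDFS home directory.
hdfs dfs -mkdir -p "/user/${USER_NAME}"
hdfs dfs -chown "${USER_NAME}:${USER_NAME}" "/user/${USER_NAME}"

# Create the user's Hive database if it does not already exist.
beeline -u "${BEELINE_URL}" -e "CREATE DATABASE IF NOT EXISTS ${USER_NAME}_db;"
```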
Confidential, Wilmington, DE
Hadoop Administrator
Responsibilities:
- Installed and configured Hadoop, Hive, ZooKeeper, Spark and Oozie
- Maintaining NameNode metadata backups
- Troubleshooting, Administering and Optimizing Hadoop performance
- Commissioning and decommissioning Hadoop nodes and data re-balancing (see the sketch after this list)
- Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
- Importing and exporting data between HDFS, Hive and external sources using Sqoop
- Management of Hadoop log files
- Responsible for managing data coming from different sources.
- Migrated ZooKeeper from one node to another.
- Maintaining cluster health and HDFS space for better performance
- Upgraded to HDFS High Availability.
- Upgraded from MRv1 to YARN and enabled high availability.
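A sketch of the decommissioning and re-balancing flow referenced above, assuming the standard HDFS exclude-file mechanism; the exclude-file path and host name are hypothetical (the real path is whatever dfs.hosts.exclude points to):

```bash
# Mark the node for decommissioning and tell the NameNode to re-read the list.
echo "dn42.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes

# Watch progress until the node reports "Decommissioned".
hdfs dfsadmin -report | grep -A 2 "dn42.example.com"

# Re-balance HDFS afterwards; -threshold is the allowed percentage deviation
# of any DataNode's utilization from the cluster mean.
hdfs balancer -threshold 10
```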
Confidential, St. Louis, MO
Hadoop Administrator
Responsibilities:
- Performed Cloudera CDH 5.7.1 installation with Hadoop ecosystem components for POC, Dev and UAT environments.
- Installed and configured MySQL as the external database for the services, with a MySQL master-slave setup as backup.
- Handling the data movement between HDFS and different web sources using Flume and Sqoop.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Commissioned DataNodes as data grew and decommissioned DataNodes from the cluster when hardware degraded.
- Setting up and managing an HA NameNode to avoid a single point of failure in large clusters.
- Worked with different applications teams to integrate with Hadoop.
- Implemented Kerberos with AD for authentication and configured Sentry for authorization of services in the Hadoop cluster.
- Configured a five-node Kafka cluster in production using Cloudera Manager (see the sketch after this list).
- Validated Kafka integration with Spark Streaming.
- Upgraded from CDH 5.7.1 to CDH 5.7.2
- Involved in cluster capacity planning, Hardware planning, Installation, Performance tuning of the Hadoop cluster.
- Hands on experience in provisioning and managing multi-node Hadoop Clusters on public cloud environment Amazon Web Services (AWS) - EC2 and on private cloud infrastructure.
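A minimal smoke test of the kind used to validate a new Kafka cluster; broker and ZooKeeper host names are hypothetical, and the command names assume the Cloudera parcel wrappers:

```bash
# Create a test topic replicated across three of the five brokers.
kafka-topics --create --zookeeper zk1.example.com:2181 \
  --topic smoke-test --partitions 3 --replication-factor 3

# Produce one message and read it back.
echo "hello kafka" | kafka-console-producer \
  --broker-list kafka1.example.com:9092 --topic smoke-test
kafka-console-consumer --bootstrap-server kafka1.example.com:9092 \
  --topic smoke-test --from-beginning --max-messages 1
```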
Confidential, Orlando, FL
Hadoop Administrator
Responsibilities:
- Responsible for cluster maintenance, monitoring, managing, commissioning and decommissioning DataNodes, troubleshooting, reviewing data backups, and managing and reviewing log files for Hortonworks.
- Adding/installing new components and removing them through Cloudera Manager.
- Monitoring workload, job performance, capacity planning using Cloudera.
- Major and Minor upgrades and patch updates.
- Creating and managing cron jobs.
- Installed Hadoop ecosystem components like Pig, Hive, HBase and Sqoop in a cluster.
- Experience in setting up tools like Ganglia for monitoring the Hadoop cluster.
- Handling the data movement between HDFS and different web sources using Flume and Sqoop.
- Extracted files from a NoSQL database (HBase) through Sqoop and placed them in HDFS for processing.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Building and maintaining scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Installed and configured Hue HA pointing to the Hadoop cluster in Cloudera Manager.
- Deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment, supporting and managing Hadoop clusters.
- Installed and configured MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Working with applications teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Extensively worked on Informatica tool to extract data from flat files, Oracle and Teradata and to load the data into the target database.
- Experienced in deploying Hadoop Cluster using automation tools like Puppet.
- Responsible for developing data pipeline using HDInsight, Flume, Sqoop and Pig to extract the data from weblogs and store in HDFS.
- Performed transformations, cleaning and filtering on imported data using Hive, Map Reduce, and loaded final data into HDFS.
- Commissioned DataNodes as data grew and decommissioned DataNodes from the cluster when hardware degraded.
- Set up and managed an HA NameNode to avoid a single point of failure in large clusters.
- Working with data delivery teams to set up new Hadoop users and Linux users, set up Kerberos principals, and test HDFS and Hive (see the sketch below).
- Discussions with other technical teams on a regular basis regarding upgrades, process changes, any special processing and feedback.
Environment: Linux, Shell Scripting, Java (JDK 1.7), Tableau, MapReduce, Teradata, SQL Server, NoSQL, Cloudera, Flume, Sqoop, Chef, Puppet, Pig, Hive, ZooKeeper and HBase.
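A sketch of the Kerberos user-setup step referenced in the list above; the realm, principal and keytab path are hypothetical:

```bash
# Create a principal and export its keytab (names are placeholders).
kadmin -p admin/admin@EXAMPLE.COM -q "addprinc -randkey jdoe@EXAMPLE.COM"
kadmin -p admin/admin@EXAMPLE.COM -q "xst -k /etc/security/keytabs/jdoe.keytab jdoe@EXAMPLE.COM"

# Verify the keytab, obtain a ticket, and test HDFS access.
klist -kt /etc/security/keytabs/jdoe.keytab
kinit -kt /etc/security/keytabs/jdoe.keytab jdoe@EXAMPLE.COM
hdfs dfs -ls /user/jdoe
```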
Confidential, Wilmington, DE
Hadoop Administrator
Responsibilities:
- Installed, configured and maintained clusters for application development, along with Hadoop tools like Hive, Pig, HBase, ZooKeeper and Sqoop, using Cloudera.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java.
- Deployed a Hadoop cluster and integrated with Nagios and Ganglia.
- Extensively involved in cluster capacity planning, Hardware planning, Installation, Performance tuning of the Hadoop cluster.
- Worked on installing the cluster, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and Cassandra and slots configuration.
- Hands on experience in provisioning and managing multi-node Hadoop Clusters on public cloud environment Amazon Web Services (AWS) - EC2 and on private cloud infrastructure.
- Monitored multiple clusters environments using Metrics and Nagios.
- Experienced in providing security for Hadoop Cluster with Kerberos.
- Dumped data from a MySQL database to HDFS and vice versa using Sqoop.
- Used Ganglia and Nagios to monitor the cluster around the clock.
- Dumped data from one cluster to another using DistCp, and automated the dumping procedure using shell scripts (see the sketch after this list).
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
- Worked on analyzing data with Hive and Pig.
- Configured ZooKeeper to implement node coordination in clustering support.
- Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources to HDFS.
Environment: HDFS, MapReduce, Hive, Sqoop, Pig, Cloudera, Flume, SQL Server, UNIX, Red Hat and CentOS.
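A sketch of the automated DistCp dump referenced above; the NameNode URIs, data path and log file are hypothetical:

```bash
#!/usr/bin/env bash
# Nightly cross-cluster copy with DistCp.
SRC="hdfs://prod-nn.example.com:8020/data/events"
DST="hdfs://dr-nn.example.com:8020/data/events"
LOG="/var/log/distcp_nightly.log"

# -update copies only changed files; -p preserves ownership and permissions.
if hadoop distcp -update -p "${SRC}" "${DST}"; then
  echo "$(date) distcp OK" >> "${LOG}"
else
  echo "$(date) distcp FAILED" >> "${LOG}"
  exit 1
fi
```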
Confidential, Denver, CO
Hadoop Dev/Admin
Responsibilities:
- Involved in the development and design of a 3-node Hadoop cluster using Apache Hadoop
- Successfully implemented Cloudera on a 30-node cluster for P&G consumption forecasting.
- Involved in planning and implementation of an additional 10-node Hadoop cluster for data warehousing, historical data storage in HBase and sampling reports.
- Used Sqoop extensively to import data from RDBMS sources into HDFS.
- Performed transformations, cleaning and filtering on imported data using Hive, Map Reduce, and loaded final data into HDFS.
- Developed Pig UDFs to pre-process data for analysis
- Worked with business teams and created Hive queries for ad hoc access.
- Responsible for creating Hive tables and partitions, loading data and writing Hive queries (see the sketch after this list).
- Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
- Worked on Oozie to automate job flows.
- Maintained cluster co-ordination services through ZooKeeper.
- Monitored Hadoop scripts that take input from HDFS and load the data into Hive
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Java Eclipse, SQL Server, Shell Scripting.
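A minimal sketch of the Hive table and partition work referenced above, run through the hive CLI; the database, columns and staging path are hypothetical:

```bash
# Create a partitioned table and load one day's data into it.
hive -e "
CREATE DATABASE IF NOT EXISTS sales;

CREATE TABLE IF NOT EXISTS sales.orders (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (order_date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/staging/orders/2014-06-01'
INTO TABLE sales.orders PARTITION (order_date='2014-06-01');
"
```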
Confidential, Omaha, NE
Linux Administrator
Roles and Responsibilities:
- Installed RHEL on Dell hardware; performance tuning and maintenance activities on database production systems
- Configured file systems, users and groups for Oracle and WebSphere.
- Resolved problems and performance issues; deployed the latest patches for SUN Solaris and Linux application servers; performed RHEL kernel tuning for the TCP stack.
- Installation, configuration and remediation of various Security software
- Installation and configuration of Netscape and Apache web server and Samba Server.
- Interaction with vendors for Hardware and software supports.
- Configuration and administration of Veritas Cluster and MC/ServiceGuard.
- Disk and file system management through LVM and SVM (see the sketch after this list).
- Installation and configuration of Apache and WebSphere servers on the Linux platform.
- Database Backup and Recovery. Performance Monitoring and Tuning
- Planning and setup of Disaster Recovery Servers.
- Managed routine system backups, scheduled jobs (such as disabling and enabling cron jobs), and enabled system logging and network logging of servers for maintenance, performance tuning and testing.
- Troubleshooting of day-to-day system and user problems; installation of packages and patches.
- Supported 24/7 high availability production servers.
Environment: RHEL, Solaris, VMware, Apache, NFS, DNS, Samba, Red Hat Linux servers, Oracle RAC, DHCP
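A sketch of the LVM file system work referenced above; the device, volume names and sizes are hypothetical:

```bash
# Carve an Oracle data file system out of a new disk with LVM.
pvcreate /dev/sdb1                         # initialize the physical volume
vgcreate vg_oracle /dev/sdb1               # create a volume group on it
lvcreate -L 50G -n lv_oradata vg_oracle    # create a 50 GB logical volume
mkfs.ext4 /dev/vg_oracle/lv_oradata        # build the file system
mkdir -p /u01/oradata
mount /dev/vg_oracle/lv_oradata /u01/oradata
```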
Confidential, Houston, TX
Linux Administrator
Roles and Responsibilities:
- Installation, configuration and building of RHEL servers; troubleshooting and maintenance
- Software installation of Enterprise Security Manager on RHEL servers
- Configuring Sendmail on RHEL servers
- Troubleshooting applications and performing system admin tasks.
- Performing system health checks using Net-SNMP software on RHEL.
- Managing user accounts and groups on RHEL servers
- Patching RHEL servers for security, OS patches and upgrades
- Network troubleshooting of RHEL servers
- Load Balancing of RHEL servers
- Experience in system authentication on RHEL servers using Kerberos and LDAP
- Configuring and troubleshooting DHCP on RHEL servers
- Optimization of the performance of Linux systems
- Installing and Configuring DNS on RHEL servers
- Installation and configuration of NFS on RHEL servers (see the sketch after this list)
- Installing and configuring SAMBA on RHEL servers
- Scheduling cron jobs
- Providing 24/7 On call Support
- Shell and Perl scripting to automate crucial tasks
- Troubleshooting and performing day-to-day activities on the RHEL servers
Environment: RHEL, Solaris, VMware, Apache, JBoss, WebLogic, system authentication, WebSphere, NFS, DNS, Samba, Red Hat Linux servers, Oracle RAC, DHCP
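A sketch of a basic NFS export of the kind referenced above; the export path and client subnet are hypothetical, and the service command assumes RHEL 5/6-style init scripts:

```bash
# Export a directory read-write to one subnet and verify it.
mkdir -p /export/shared
echo "/export/shared 192.168.1.0/24(rw,sync,no_root_squash)" >> /etc/exports
exportfs -ra              # re-export everything listed in /etc/exports
service nfs restart       # restart the NFS server (init-script style)
showmount -e localhost    # confirm the export is visible
```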
Confidential, Colorado Springs, CO
Linux/Unix Administrator
Roles and Responsibilities:
- Managing Disk File Systems, Server Performance, Users Creation and Granting File Access Permissions
- Troubleshooting the backup issues by analyzing the NetBackup logs
- Configuring and Troubleshooting of various services like NFS, SSH, Telnet, FTP on UNIX platform.
- Monitoring disk, CPU, memory and overall performance of servers.
- Managing File systems and troubleshooting file systems issue.
- Configuring LVMs.
- Worked with DBA and application teams to clear logs and resolve file system space issues
- Managing permissions and moving files with SCP and FTP.
- User administration tasks and permission issues.
- Monitoring the logs for Issues.
- Working on daily reporting to management about file system usage, space, swap and CPU usage (see the sketch after this list).
- Working on daily backup status and rescheduling failed backups.
- Working with senior team members to fix the issues.
- Rebooting Unix/Linux boxes after patching.
Environment: Red Hat Linux, Solaris, Windows 2003, EMC SAN, WebLogic, Windows NT/2000, Apache, WebSphere, JBoss, system authentication, NFS, DNS, Samba
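A sketch of the daily usage report referenced above; the recipient address and report path are hypothetical:

```bash
#!/usr/bin/env bash
# Collect file system, swap/memory and CPU figures into one daily report.
REPORT="/tmp/daily_report_$(date +%F).txt"
{
  echo "==== File system usage ===="; df -h
  echo "==== Memory and swap ====";   free -m
  echo "==== CPU load ====";          uptime
} > "${REPORT}"

# Mail the report to the operations list.
mail -s "Daily system report: $(hostname)" ops-team@example.com < "${REPORT}"
```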