
Lead Hadoop System Administrator Resume


CAREER OBJECTIVE:

Over 8 years of information technology experience, including 6 years as an Oracle database administrator and 2 years in Hadoop administration with Cloudera Manager and Ambari.

PROFESSIONAL SUMMARY:

  • 2 years' experience in installation and configuration of Hadoop ecosystem components such as HDFS, Hive, Impala, YARN, HBase, Sqoop, Flume, Oozie, Spark, and Kafka
  • Well-versed with the Cloudera CDH and Hortonworks HDP Hadoop distributions
  • Knowledge of NameNode High Availability, cluster planning, the MapReduce framework, configuring schedulers as needed, and setting up local CDH and HDP repositories
  • Experience in benchmarking, commissioning, and decommissioning of data nodes on Hadoop clusters
  • Hands-on experience importing and exporting data between RDBMSs such as Oracle and MySQL and the HDFS/Hive warehouse using Sqoop (a minimal sketch follows this list), as well as with other ingestion tools such as Flume and Kafka
  • Hands-on backup and recovery using HDFS snapshots, distcp, and Cloudera BDR
  • Hands-on experience configuring and managing Hadoop cluster security using Kerberos, Sentry (Cloudera), Ranger (Hortonworks), HDFS ACLs, and data encryption as recommended by both Cloudera and Hortonworks
  • Knowledge of shell scripting in Bash
  • Experience supporting L1 and L2 issues on Hadoop production clusters
  • Experience in supporting systems with 24X7 availability and monitoring.
  • Experience working in cross-functional, multi-location teams.
  • 5+ years of Oracle DBA experience providing daily support of corporate management systems, including database management and administration, analysis, design, and architecture in client/server and web-based Oracle environments
  • Experience with Oracle 11g/10g/9i DBA duties in high-availability production environments
  • Strong experience with installing, configuring, creating, supporting, and managing Oracle database systems
  • Knowledge of 11gR2 Real Application Clusters (RAC)
  • Experience with installing and managing RAC configurations
  • Experience with installing and configuring Oracle Grid Infrastructure on Linux
  • Experience in managing RAC databases using CRSCTL, SRVCTL, and Grid Control, and in using Data Guard Broker to manage primary and standby databases
  • 4+ years' experience writing SQL, batch, and shell scripts for backups, SQL*Loader, and database export/import
  • Adept in providing custom Oracle server and client installation
  • Knowledge of PL/SQL
  • Strong experience with performance monitoring, proactive and reactive tuning, and troubleshooting of Oracle systems running on UNIX, Linux, and Windows NT/2000/2003 servers
  • System resource management, including bottleneck detection and contention tuning of SGA, CPU, memory, and I/O
  • Ability to prioritize and meet operational deadlines in a fast-paced environment, with good stress management skills
  • Perform daily monitoring of Oracle instances, running Statspack and AWR reports to monitor tablespaces, memory structures, undo segments, logs, and alerts
  • Expert in systems analysis and architecture, capacity planning, backup/recovery, installation, configuration, patching, troubleshooting, and performance monitoring and tuning
  • Strong experience with design, development, and testing of backup and recovery to ensure complete recoverability; experience with Data Guard
  • Experience in RMAN Backup and Recovery
  • Solid experience in disaster recovery, routine maintenance for Oracle databases
  • Security management for database, network, and operating systems
  • 2 solid years of managing databases with RAC
  • Maintain and administer Greenplum databases and servers in an AWS environment
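As a concrete illustration of the Sqoop ingestion described above, here is a minimal sketch; the host, service name, credentials file, and table names (dbhost, ORCL, SALES, analytics.sales) are placeholders rather than details from any actual engagement.

#!/usr/bin/env bash
# Pull an Oracle table into HDFS as delimited files (hypothetical host/service/table).
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user \
  --password-file hdfs:///user/etl/.ora_pass \
  --table SALES \
  --target-dir /data/raw/sales \
  --num-mappers 4

# Load the same table directly into a Hive warehouse table instead of a raw directory.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user \
  --password-file hdfs:///user/etl/.ora_pass \
  --table SALES \
  --hive-import \
  --hive-table analytics.sales \
  --num-mappers 4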

TECHNICAL SKILLS:

Cloudera Hadoop Ecosystem: HDFS, Hive, Impala, YARN, Spark, HBase, Sqoop, Flume, Oozie, Pig, Hue, Sentry

Hortonworks Hadoop Ecosystem: HDFS, Hive, YARN, HBase, Sqoop, Ranger, Kafka

Distribution platforms: Cloudera Manager (5.9, 5.12, 5.14), Hortonworks Ambari (2.5, 2.6), Oracle BDA (4.7, 4.9)

Databases: Oracle, PostgreSQL, MySQL

Network: TCP/IP, HTTP/HTTPS, SSH, FTP

OS: Linux (RHEL, CentOS 6 & 7)

Monitoring tools: Cloudera Manager, Ambari, Ganglia, Oracle ILOM & OEM Cloud Control

Security: Knox, Ranger, Sentry, HDFS ACLs & Kerberos

Cloud System: Google Cloud & AWS Cloud

Automation & continuous integration tools: Jenkins, Ansible & GitHub

WORK EXPERIENCE:

Confidential

Lead Hadoop System Administrator

Responsibilities:

  • Deployed Hadoop version 2 using the Cloudera distribution (CDH) on pre-production and QA clusters, along with Hadoop services such as HDFS, YARN, MapReduce, Sqoop, Flume, Pig, Hive, ZooKeeper, Oozie, Kafka, Storm, HBase, and Spark.
  • Enabled NameNode High Availability on the pre-production cluster.
  • Benchmarked the cluster using TestDFSIO, TeraSort, and TeraGen to measure HDFS read/write performance and MapReduce sort times (see the benchmarking sketch after this list).
  • Set up production Hadoop clusters with optimum configurations.
  • Installed and set up Kerberos.
  • Worked with data delivery teams to set up new Hadoop users, create Kerberos principals, and test HDFS, Hive, Pig, and MapReduce access for the new users.
  • Provided support for data analysts, Pig and Hive developers.
  • Troubleshot issues by analyzing log files and raised tickets with Cloudera support when needed.
  • Automated various tasks using shell scripts.
  • Installed and configured a CDH 5.13.0 cluster using Cloudera Manager.
  • Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Developed scripts for benchmarking with TeraGen, TeraSort, and TeraValidate.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios.
  • Monitored workload and job performance, and performed capacity planning.
  • Managed and reviewed Hadoop log files and debugged failed jobs.
  • Supported cluster maintenance, backup, and recovery for the production cluster.
  • Backed up data on a regular basis to a remote cluster using distcp.
  • Fine-tuned Hive jobs for better performance.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Implemented the Fair Scheduler and Capacity Scheduler to allocate a fair share of resources to small jobs.
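A sketch of the benchmarking runs referenced above; the jar paths are the usual CDH parcel locations and may differ per install, and the data sizes are deliberately small for illustration.

#!/usr/bin/env bash
# TeraGen -> TeraSort -> TeraValidate measures end-to-end MapReduce sort time;
# TestDFSIO measures raw HDFS read/write throughput. Paths assume CDH parcels.
EXAMPLES=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
TESTS=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar

hadoop jar "$EXAMPLES" teragen 10000000 /bench/tera-in        # 10M rows (~1 GB)
hadoop jar "$EXAMPLES" terasort /bench/tera-in /bench/tera-out
hadoop jar "$EXAMPLES" teravalidate /bench/tera-out /bench/tera-report

# The size flag is -size on Hadoop 2.x; some older releases use -fileSize <MB>.
hadoop jar "$TESTS" TestDFSIO -write -nrFiles 10 -size 1GB
hadoop jar "$TESTS" TestDFSIO -read -nrFiles 10 -size 1GB
hadoop jar "$TESTS" TestDFSIO -clean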

Environment: HDFS 2.6.0, MapReduce 2.6.0, Hive 1.1.0, YARN 2.6.0, HBase 1.2.0, Sqoop 1.4.6, Oozie 4.1.0, ZooKeeper 3.4.5, Kerberos 0.8

Confidential, Dallas, TX

Lead Hadoop Administrator

Responsibilities:

  • Deployed Hadoop version 2 using the Cloudera distribution (CDH) on pre-production and QA clusters, along with Hadoop services such as HDFS, YARN, MapReduce, Sqoop, Flume, Pig, Hive, ZooKeeper, Oozie, Kafka, Storm, HBase, and Spark.
  • Enabled NameNode High Availability on the pre-production cluster.
  • Benchmarked the cluster using TestDFSIO, TeraSort, and TeraGen to measure HDFS read/write performance and MapReduce sort times.
  • Set up production Hadoop clusters with optimum configurations.
  • Installed and set up Kerberos.
  • Worked with data delivery teams to set up new Hadoop users, create Kerberos principals, and test HDFS, Hive, Pig, and MapReduce access for the new users.
  • Provided support for data analysts, Pig and Hive developers.
  • Troubleshot issues by analyzing log files and raised tickets with Cloudera support when needed.
  • Automated various tasks using shell scripts.
  • Installed and configured a CDH 5.13.0 cluster using Cloudera Manager.
  • Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Developed scripts for benchmarking with TeraGen, TeraSort, and TeraValidate.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios.
  • Monitored workload and job performance, and performed capacity planning.
  • Managed and reviewed Hadoop log files and debugged failed jobs.
  • Supported cluster maintenance, backup, and recovery for the production cluster.
  • Backed up data on a regular basis to a remote cluster using distcp (see the backup sketch after this list).
  • Fine-tuned Hive jobs for better performance.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Implemented the Fair Scheduler and Capacity Scheduler to allocate a fair share of resources to small jobs.
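A sketch of the snapshot-plus-distcp backup flow mentioned above; the NameNode hosts and paths (nn-prod, nn-dr, /data) are placeholders.

#!/usr/bin/env bash
# Nightly mirror of /data to a remote DR cluster. A snapshot gives distcp a
# consistent, read-only source while writers keep working on /data.
SRC=hdfs://nn-prod:8020
DST=hdfs://nn-dr:8020
SNAP="backup-$(date +%F)"

hdfs dfsadmin -allowSnapshot /data          # one-time: mark the dir snapshottable
hdfs dfs -createSnapshot /data "$SNAP"

# -update copies only changed files; -delete mirrors removals on the target.
hadoop distcp -update -delete \
  "$SRC/data/.snapshot/$SNAP" "$DST/backups/data"

hdfs dfs -deleteSnapshot /data "$SNAP"      # clean up once the copy lands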

Environment: HDFS 2.6.0, MapReduce 2.6.0, Hive 1.1.0, YARN 2.6.0, HBase 1.2.0, Sqoop 1.4.6, Oozie 4.1.0, ZooKeeper 3.4.5, Kerberos 0.8, Oracle BDA 4.7/4.9

Confidential

Hadoop Administrator

Responsibilities:

  • Deployed Hadoop version 2 using the Cloudera distribution (CDH) on pre-production and QA clusters, along with Hadoop services such as HDFS, YARN, MapReduce, Sqoop, Flume, Pig, Hive, ZooKeeper, Oozie, Kafka, Storm, HBase, and Spark.
  • Enabled NameNode High Availability on the pre-production cluster.
  • Benchmarked the cluster using TestDFSIO, TeraSort, and TeraGen to measure HDFS read/write performance and MapReduce sort times.
  • Set up production Hadoop clusters with optimum configurations.
  • Installed and set up Kerberos.
  • Worked with data delivery teams to set up new Hadoop users, create Kerberos principals, and test HDFS, Hive, Pig, and MapReduce access for the new users (see the onboarding sketch after this list).
  • Provided support for data analysts, Pig and Hive developers.
  • Troubleshot issues by analyzing log files and raised tickets with Cloudera support when needed.
  • Automated various tasks using shell scripts.
  • Installed and configured a CDH 5.13.0 cluster using Cloudera Manager.
  • Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Developed scripts for benchmarking with TeraGen, TeraSort, and TeraValidate.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios.
  • Monitored workload and job performance, and performed capacity planning.
  • Managed and reviewed Hadoop log files and debugged failed jobs.
  • Supported cluster maintenance, backup, and recovery for the production cluster.
  • Backed up data on a regular basis to a remote cluster using distcp.
  • Fine-tuned Hive jobs for better performance.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks.
  • Implemented the Fair Scheduler and Capacity Scheduler to allocate a fair share of resources to small jobs.
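A sketch of the user-onboarding steps described above; the realm, admin principal, user name, and paths are placeholders, and the kadmin steps assume access to the KDC.

#!/usr/bin/env bash
# Create a principal and keytab for a new user, then smoke-test cluster access.
USER_NAME=jdoe
REALM=EXAMPLE.COM
KEYTAB=/etc/security/keytabs/${USER_NAME}.keytab
EXAMPLES=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar

kadmin -p admin/admin -q "addprinc -randkey ${USER_NAME}@${REALM}"
kadmin -p admin/admin -q "xst -k ${KEYTAB} ${USER_NAME}@${REALM}"

# The home directory is created by the HDFS superuser (here via sudo, if permitted).
sudo -u hdfs hdfs dfs -mkdir -p /user/${USER_NAME}
sudo -u hdfs hdfs dfs -chown ${USER_NAME} /user/${USER_NAME}

# Authenticate as the new principal and verify HDFS and MapReduce access.
kinit -kt "${KEYTAB}" "${USER_NAME}@${REALM}"
hdfs dfs -ls /user/${USER_NAME}
hadoop jar "$EXAMPLES" pi 2 10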

Environment: HDFS 2.6.0, MapReduce 2.6.0, Hive 1.1.0, YARN 2.6.0, HBase 1.2.0, Sqoop 1.4.6, Oozie 4.1.0, ZooKeeper 3.4.5, Kerberos 0.8

Confidential

Oracle DBA

Responsibilities:

  • Deployed Hadoop version 2 using the Cloudera distribution (CDH) on pre-production and QA clusters, along with Hadoop services such as HDFS, YARN, MapReduce, Sqoop, Flume, Pig, Hive, ZooKeeper, Oozie, Kafka, Storm, HBase, and Spark.
  • Enabled NameNode High Availability on the pre-production cluster.
  • Benchmarked the cluster using TestDFSIO, TeraSort, and TeraGen to measure HDFS read/write performance and MapReduce sort times.
  • Set up production Hadoop clusters with optimum configurations.
  • Installed and set up Kerberos.
  • Worked with data delivery teams to set up new Hadoop users, create Kerberos principals, and test HDFS, Hive, Pig, and MapReduce access for the new users.
  • Provided support for data analysts, Pig and Hive developers.
  • Troubleshot issues by analyzing log files and raised tickets with Cloudera support when needed.
  • Automated various tasks using shell scripts.
  • Installed and configured a CDH 5.10.0 cluster using Cloudera Manager.
  • Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Developed scripts for benchmarking with TeraSort and TeraGen.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios.
  • Monitored workload and job performance, and performed capacity planning.
  • Managed and reviewed Hadoop log files and debugged failed jobs.
  • Supported cluster maintenance, backup, and recovery for the production cluster.
  • Backed up data on a regular basis to a remote cluster using distcp.
  • Fine-tuned Hive jobs for better performance.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume, defining channel selectors to multiplex data into different sinks (see the Flume sketch after this list).
  • Implemented the Fair Scheduler and Capacity Scheduler to allocate a fair share of resources to small jobs.
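A sketch of the Flume multiplexing setup mentioned above; the agent name, header, port, and HDFS paths are placeholders, and upstream senders are assumed to set a "region" header on each event.

#!/usr/bin/env bash
# Write a Flume agent config whose multiplexing selector routes events to
# different channels/sinks based on the "region" header, then start the agent.
cat > /etc/flume-ng/conf/multiplex.conf <<'EOF'
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

# Avro source; upstream agents are assumed to set a "region" header.
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = region
a1.sources.r1.selector.mapping.us = c1
a1.sources.r1.selector.default = c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# Each channel drains to its own HDFS directory.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/us
a1.sinks.k2.type = hdfs
a1.sinks.k2.channel = c2
a1.sinks.k2.hdfs.path = /flume/events/other
EOF

flume-ng agent --conf /etc/flume-ng/conf \
  --conf-file /etc/flume-ng/conf/multiplex.conf --name a1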

Environment: HDFS 2.6.0, MapReduce 2.6.0, Hive 1.1.0, YARN 2.6.0, HBase 1.2.0, Sqoop 1.4.6, Oozie 4.1.0, ZooKeeper 3.4.5, Kerberos 0.8

Confidential

Production Oracle DBA

Responsibilities:

  • Migration of databases from 11.2.0.2/10.2.0.2 to 11.2.0.3
  • Migration of databases from standalone server to Exadata database Machine
  • Refresh databases using RMAN and Data Pump export/import (see the sketch after this list)
  • Work with developers to configure and install Informatica hubs
  • Generate scripts to automate processes
  • Create RAC database using DBCA
  • Manage database Using TOAD, OEM 12c and SQL*Developer
  • Configure and set up email notifications in OEM 12c
  • Create metric templates and assign them to targets in OEM 12c
  • Discover and promote data target in OEM 12c
  • Create blackout for database using OEM 12c to cover maintenance period
  • Monitor database performance and resolve performance issues
  • Run AWR/ADDM reports to find the cause of issues and recommendations
  • Troubleshoot and resolve user errors and handle requests from developers
  • Convert single-instance databases to RAC with ASM
  • Clone database and upgrade to 11.2.0.3
  • Monitor database activity and file usage, and ensure necessary resources are available so databases function properly by removing old or obsolete files
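A sketch of the RMAN backup and Data Pump refresh flow above; the SIDs, schema, and directory object (PROD, QA, APPUSER, DP_DIR) are placeholders, DP_DIR must already exist as a directory object on both databases, and expdp/impdp prompt for credentials rather than taking them on the command line.

#!/usr/bin/env bash
# Full RMAN backup on the source, then a Data Pump schema export/import to
# refresh a lower environment.
export ORACLE_SID=PROD
rman target / <<< 'BACKUP DATABASE PLUS ARCHIVELOG; DELETE NOPROMPT OBSOLETE;'

# Export the application schema from the source database.
expdp system schemas=APPUSER directory=DP_DIR \
      dumpfile=appuser_%U.dmp logfile=appuser_exp.log

# Import into the target, replacing any tables that already exist.
export ORACLE_SID=QA
impdp system directory=DP_DIR dumpfile=appuser_%U.dmp \
      logfile=appuser_imp.log table_exists_action=replace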
