Sr. Big Data Administrator Resume
Franklin, TN
SUMMARY:
- Around 8 years of IT experience, including 3+ years in Big Data technologies.
- Extensive experience in designing, installing, configuring, and tuning Hadoop core and ecosystem components.
- Well versed in Hadoop MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Flume, YARN, ZooKeeper, Spark, and Oozie.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, NameNode, DataNode, JobTracker, TaskTracker, and MapReduce concepts.
- Experience in installation, configuration, support, and management of Hadoop clusters.
- Experience in task automation using Oozie, cluster coordination through Pentaho, and MapReduce job scheduling using the Fair Scheduler.
- Worked on both major Hadoop distributions: Cloudera and Hortonworks.
- Experience in performing minor and major upgrades and applying patches to Ambari and Cloudera clusters.
- Strong knowledge of Hadoop cluster capacity planning, performance tuning, and cluster monitoring.
- Able to configure scalable infrastructures for high availability (HA) and disaster recovery.
- Experience in analyzing data using HiveQL and Pig Latin.
- Experience in developing and scheduling ETL workflows, data scrubbing, and data processing in Hadoop using Oozie.
- Experience in balancing the cluster after adding/removing nodes or after major data cleanup.
- Experience in setting up data ingestion tools such as Flume, Sqoop, and NDM.
- General Linux system administration, including design, configuration, installation, and automation.
- Experience with Oracle, Hadoop, MongoDB, AWS Cloud, and Greenplum.
- Experience in configuring ZooKeeper to coordinate servers in clusters.
- Strong knowledge of using NFS (Network File System) for backing up NameNode metadata.
- Experience in setting up NameNode high availability for major production clusters.
- Experience in designing automatic failover using ZooKeeper and Quorum Journal Nodes.
- Experience in creating, building, and managing public and private cloud infrastructure.
- Experience in working with different file formats and compression techniques in Hadoop.
- Experience in analyzing existing Hadoop clusters, understanding performance bottlenecks, and providing performance tuning solutions accordingly.
- Extensive experience in installation, configuration, maintenance, design, development, implementation, and support on Linux.
- Experience with Ansible and related configuration management tools.
- Experience in working in large environments and leading infrastructure support and operations.
- Migrating applications from existing systems such as MySQL, Oracle, DB2, and Teradata to Hadoop.
- Benchmarking Hadoop clusters to validate hardware before and after installation, and tweaking configurations to obtain better performance.
- Experience in administering the Linux systems underlying Hadoop clusters and monitoring the clusters.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance, as sketched below.
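A minimal sketch of the decommissioning and rebalancing workflow referenced above, assuming an HDFS exclude file at /etc/hadoop/conf/dfs.exclude (hostname, paths, and threshold are illustrative):

    # Add the node being removed to the HDFS exclude file
    echo "worker-node-07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read its include/exclude lists; the node enters
    # "Decommission in progress" while its blocks are re-replicated elsewhere
    hdfs dfsadmin -refreshNodes

    # Watch until the node reports "Decommissioned"
    hdfs dfsadmin -report | grep -A 2 "worker-node-07"

    # After adding or removing nodes, rebalance so that no DataNode's utilization
    # deviates from the cluster average by more than 10%
    hdfs balancer -threshold 10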
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, Hive, Pig, HBase, Cassandra, HCatalog, Phoenix, Falcon, Sqoop, Flume, ZooKeeper, Mahout, Oozie, Avro, Storm, HDP 2.4/2.5.
Monitoring Tools: Ambari, Cloudera Manager, Ganglia, Nagios, CloudWatch.
Scripting Languages: Shell scripting, Puppet, Python, Bash, CSH, Ruby, PHP.
Programming Languages: C, Java, SQL, and PL/SQL.
Front End Technologies: HTML, XHTML, XML.
Application Servers: Apache Tomcat, WebLogic Server, WebSphere.
Databases: Oracle 11g, MySQL, MS SQL Server, IBM DB2.
NoSQL Databases: HBase, Cassandra, MongoDB.
Operating Systems: Linux, UNIX, Mac OS, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8.
Networks: HTTP, HTTPS, FTP, UDP, TCP/IP, SNMP, SMTP.
Security: Kerberos, Ranger, Ranger KMS, Knox.
WORK EXPERIENCE:
Sr. Big Data Administrator
Confidential
Responsibilities:
- Installed, configured, upgraded, and applied patches and bug fixes for Prod, Test and Dev Servers.
- Installed/Configured/Maintained Hadoop clusters in dev/test/UAT/Prod environments.
- Installed, configured, and administered HDFS, Hive, Ranger, Pig, HBase, Oozie, Sqoop, Spark, and YARN.
- Worked on the installation and configuration of Hadoop HA Cluster.
- Involved in capacity planning and design of Hadoop clusters.
- Set up alerts in Ambari for monitoring the Hadoop clusters.
- Set up security authentication using Kerberos.
- Created and dropped users, and granted and revoked permissions and Ranger policies as required.
- Commissioned and decommissioned data nodes from the cluster.
- Wrote and modified UNIX shell scripts to manage HDP environments.
- Installed and configured Flume, Hive, Sqoop and Oozie on the Hadoop cluster.
- Administered, configured, and performance-tuned Spark applications.
- Created directories and set up appropriate permissions for different applications.
- Backed up HBase tables to HDFS directories using the HBase Export utility, as sketched after this list.
- Involved in planning and implementing Hadoop cluster upgrades.
- Installation, configuration, and administration of HDP on Red Hat Enterprise Linux 6.6.
- Used Sqoop to import data from an Oracle database into HDFS, as sketched after this list.
- Detailed analysis of system and application architecture components per functional requirements.
- Reviewed and monitored system and instance resources to ensure continuous operations (i.e., database storage, memory, CPU, network usage, and I/O contention).
- Provided 24x7 on-call support for production job failures and resolved issues in a timely manner.
- Developed UNIX scripts for scheduling delta loads and master loads using the AutoSys scheduler.
- Deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
- Troubleshot problems with databases, applications, and development tools.
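A minimal sketch of the HBase table backup via the built-in Export utility mentioned above (table name and HDFS directory are illustrative):

    # Export an HBase table to an HDFS directory (runs as a MapReduce job)
    hbase org.apache.hadoop.hbase.mapreduce.Export customer_profile /backups/hbase/customer_profile

    # Restore later with the matching Import utility into an existing table
    hbase org.apache.hadoop.hbase.mapreduce.Import customer_profile /backups/hbase/customer_profile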
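And a sketch of the Sqoop import from Oracle, with hypothetical connection details (host, service name, table, and target directory are placeholders):

    sqoop import \
      --connect jdbc:oracle:thin:@//oradb.example.com:1521/ORCL \
      --username etl_user -P \
      --table SALES.TRANSACTIONS \
      --target-dir /data/raw/transactions \
      --num-mappers 4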
Technical Environment: Hortonworks 2.4/2.5, Ambari, HDP, Ranger, HUE, Sqoop, Kerberos, Hive, Informatica 9.6.1, Oracle 11g/10g, DB2, LINUX, AWS, UNIX - AIX, Autosys.
Big Data Administrator
Confidential, Franklin, TN
Responsibilities:
- Responsible for cluster maintenance: adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Played a key role, along with other teams in the company, in deciding hardware configurations for the cluster.
- Resolved tickets and P1 issues submitted by users, troubleshooting, documenting, and resolving errors.
- Adding new Data Nodes when needed and running balancer.
- Responsible for building scalable distributed data solutions using Hadoop.
- Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios.
- Performed major and minor upgrades to the Hadoop cluster.
- Performed stress testing, performance testing, and benchmarking for the cluster, as sketched after this list.
- Working closely with both internal and external cyber security customers.
- Research effort to tightly integrate Hadoop and HPC systems.
- Compared Hadoop to commercial big-data appliances from Netezza, XtremeData, and LexisNexis; published and presented the results.
- Deployed and administered a 300+ node Hadoop cluster, and administered two larger clusters.
- Worked on developing Linux scripts for Job Automation.
- Developed machine-learning capabilities using Apache Mahout.
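A minimal sketch of the TeraGen/TeraSort benchmarking flow used for the stress and performance testing above (data size and paths are illustrative; the examples jar location varies by distribution):

    EXAMPLES_JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar

    # Generate ~100 GB of input (1,000,000,000 rows x 100 bytes)
    hadoop jar "$EXAMPLES_JAR" teragen 1000000000 /benchmarks/terasort-input

    # Sort it; elapsed time is the headline benchmark number
    hadoop jar "$EXAMPLES_JAR" terasort /benchmarks/terasort-input /benchmarks/terasort-output

    # Validate that the output is globally sorted
    hadoop jar "$EXAMPLES_JAR" teravalidate /benchmarks/terasort-output /benchmarks/terasort-validate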
Technical Environment: Hadoop, MapReduce, HDFS, Hive, Oracle 11g, Java, Struts, Servlets, HTML, XML, SQL, J2EE, Tomcat 6.
Hadoop Administrator
Confidential, Palo Alto, California
Responsibilities:
- Involved in a Hadoop implementation project covering (but not limited to) Hadoop cluster management, writing MapReduce programs and Hive queries (HQL), and using Flume to analyze log files.
- Installation, configuration, and management of Big Data clusters (Hortonworks Data Platform 1.3/2.2).
- Involved and played a key role in the development of an ingestion framework based on Oozie and another framework using Java.
- Developed Data Quality checks to match the ingested data with source in RDBMS using Hive.
- Did a POC for data ingestion using ETL tools including Talend, DataStage, and Toad Data Point.
- Created some custom component detection and monitoring using Zookeeper APIs.
- Supported a Hadoop 1.x cluster (HDP 1.3), resolving issues related to jobs and cluster utilization.
- Deep understanding and related experience with Hadoop, HBase, Hive, and YARN/MapReduce.
- Enabled High-Availability for NameNode and setup fencing mechanism for split-brain scenario.
- Enabled High-Availability for Resource Manager and several ecosystem components including Hiveserver2, Hive Metastore, and HBase.
- Configured YARN queues based on the Capacity Scheduler for resource management.
- Configured YARN Node Labels to isolate resources at the node level, separating nodes dedicated to YARN applications from those dedicated to HBase.
- Configured cgroups to collect CPU utilization stats.
- Set up BucketCache on HBase-specific slave nodes for improved performance.
- Set up rack awareness for the cluster, including a rack-topology script for improved fault tolerance, as sketched after this list.
- Set up HDFS ACLs to restrict or enable access to HDFS data, also sketched after this list.
- Performance evaluation for Hadoop/YARN using TestDFSIO and TeraSort.
- Performance evaluation of Hive 0.14 with Tez using Hive TestBench.
- Configured Talend, DataStage, and Toad Data Point for ETL activities on Hadoop/Hive databases.
- Backing up HBase data and HDFS using HDFS Snapshots and evaluated the performance overhead.
- Created several recipes for automation of configuration parameters/scripts using Chef.
- Managed and configured the retention period of log files for all services across the cluster.
- Involved in development of ETL processes with Hadoop, YARN and Hive.
- Developed Hadoop monitoring processes (capacity, performance, consistency) to assure processing issues are identified and resolved swiftly.
- Coordinate with Operation/L2 team for knowledge transfer.
- Set up HDFS quotas and replication factors for user/group directories to keep disk usage under control, as sketched after this list.
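A minimal sketch of the rack-topology script mentioned above, assuming a flat host-to-rack mapping file (paths and rack names are illustrative); Hadoop invokes the script via the net.topology.script.file.name property in core-site.xml:

    #!/bin/bash
    # rack-topology.sh: print a rack for each host/IP argument; default if unknown
    MAP_FILE=/etc/hadoop/conf/rack-map.txt    # lines like: 10.0.1.21 /dc1/rack1
    DEFAULT_RACK=/dc1/default-rack

    for host in "$@"; do
      rack=$(awk -v h="$host" '$1 == h { print $2 }' "$MAP_FILE")
      echo "${rack:-$DEFAULT_RACK}"
    done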
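And a sketch of the HDFS ACL and quota commands (users, groups, paths, and limits are illustrative; ACLs require dfs.namenode.acls.enabled=true):

    # Grant the analytics group read/execute on a data directory via an ACL
    hdfs dfs -setfacl -m group:analytics:r-x /data/warehouse
    hdfs dfs -getfacl /data/warehouse

    # Cap a user directory at 1,000,000 names and 2 TB of space
    hdfs dfsadmin -setQuota 1000000 /user/jdoe
    hdfs dfsadmin -setSpaceQuota 2t /user/jdoe

    # Lower the replication factor on scratch data to keep disk usage down
    hdfs dfs -setrep -R 2 /user/jdoe/tmp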
Technical Environment: Hadoop 1.2.1, MapReduce, Hive 0.10.0, Pig 0.11.1, Oozie 3.3.0, HBase 0.94.11, Sqoop 1.4.4, Flume 1.4.0, Java, SQL, PL/SQL, Oracle 10g, Eclipse, HTTP, Jama.
Hadoop Administrator
Confidential, Santa Clara, CA
Responsibilities:
- End-to-end migration of Hadoop 1.x to Hadoop 2.x.
- Performed the installation and configuration of a Hadoop cluster using Ambari 2.0 (Hortonworks HDP 2.2).
- Monitored the health of the Hadoop ecosystem (HDFS, YARN, Hive, HBase, Sqoop, Hue, and Slider).
- Monitored disk, memory, heap, and CPU utilization on all master and slave machines and took the necessary measures to keep the cluster up and running on a 24/7 basis.
- Configured the Capacity Scheduler to provide service-level agreements for multiple users of the cluster, as sketched after this list.
- Performed benchmark tests for Hive, HBase, and HDFS, including NameNode and MapReduce benchmarks.
- Created Hive internal and external tables with appropriate static and dynamic partitions, designed for efficiency.
- Worked on installing cluster, commissioning & decommissioning of data nodes.
- Involved in implementing High Availability and automatic failover infrastructure to overcome single point of failure for NameNode, Resource Manager, HiveServer2, and HBase Master.
- Implemented Kerberos for Hadoop Security.
- Created and deployed Kerberos keytab files, and created principals and realms, as sketched after this list.
- Imported logs from web servers with Flume to ingest the data into HDFS.
- Created and managed HBase clusters dynamically using Slider.
- Upgraded Apache Ambari from version 1.7 to 2.0.
- Involved in collecting metrics for Hadoop clusters using Ganglia.
- Interacted with developers on deploying new jobs, jobs throwing exceptions, and data-related issues.
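A minimal capacity-scheduler.xml sketch of the queue setup described above (queue names and percentages are illustrative):

    <!-- Two queues splitting cluster capacity 70/30 -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>etl,adhoc</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.etl.capacity</name>
      <value>70</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
      <value>30</value>
    </property>
    <property>
      <!-- Let adhoc burst above its guarantee when the cluster is idle -->
      <name>yarn.scheduler.capacity.root.adhoc.maximum-capacity</name>
      <value>60</value>
    </property>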
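And a sketch of the Kerberos principal/keytab workflow (realm, principal, and paths are illustrative):

    # On the KDC: create a service principal for a DataNode host
    kadmin.local -q "addprinc -randkey dn/worker01.example.com@EXAMPLE.COM"

    # Export its key to a keytab file
    kadmin.local -q "ktadd -k /etc/security/keytabs/dn.service.keytab dn/worker01.example.com@EXAMPLE.COM"

    # Lock down ownership on the target host and verify the keytab contents
    chown hdfs:hadoop /etc/security/keytabs/dn.service.keytab
    chmod 400 /etc/security/keytabs/dn.service.keytab
    klist -kt /etc/security/keytabs/dn.service.keytab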
Technical Environment: MapReduce, HDFS, Pig, Hive, HBase, Flume, Slider, Sqoop, Oozie, Nagios, ZooKeeper, Ganglia, and Hortonworks Ambari.
Linux system administrator
Confidential
Responsibilities:
- Designed and configured Red Hat Enterprise Linux 5/4/3; handled user administration, management, and archiving.
- Expertise in performance monitoring and performance tuning using top, prstat, sar, vmstat, ps, iostat, etc.
- Expertise in package management, including creating, installing, and configuring packages using Red Hat RPM.
- Used RPM on several Linux distributions, including Red Hat Enterprise Linux, SUSE Linux Enterprise, and Fedora.
- Extensive experience in building servers using the JumpStart and Kickstart processes.
- Configured Red Hat Satellite Server; managed, configured, and maintained customer entitlements, including upgrading and patching of Linux servers.
- Used the RHN Satellite exporter command to channel packages and deploy RPM packages.
- Configured NFS, NIS, DNS, automount, and disk space management on Sun servers.
- Experience in configuring and managing SAN disks, disk mirrors, and RAID levels 0, 1, and 5.
- Maintained and modified hardware and software components, content and documentation.
- Provided guidance for equipment checks and supported processing of security requests.
- Created and expanded file systems in Linux (volume groups and logical volumes) and Solaris using volume managers, as sketched after this list.
- Experience in backup and restore operations with Linux Inbuilt utilities.
- Installed and configured Red Hat Linux 6.0 and troubleshot issues.
- Handled user and security administration, backup, recovery, and maintenance activities.
- Advanced experience in researching OS issues, applying patches, and opening vendor tickets.
- Maintained and documented all errors and logs that were new to the environment and shared them with the team.
- Installed and monitored VMware virtual environments with ESX 4.x/ESX 3.x, ESXi servers, and VirtualCenter 2.x.
- Sendmail configuration and administration, including testing of the mail servers.
- Experience in adding and configuring devices such as hard disks and backup devices.
- Monitored daily NetBackup activity and reports to proactively avoid issues.
- Improved automation and productivity of backups through scripting enhancements.
- Modified and optimized backup schedules.
- Used VMware for testing various applications on different operating systems.
- Provided 24x7 support on a pager rotation basis.
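A minimal sketch of the LVM create-and-extend workflow mentioned above (device names, volume names, and sizes are illustrative):

    # Create a physical volume, a volume group, and a 50 GB logical volume
    pvcreate /dev/sdb1
    vgcreate appvg /dev/sdb1
    lvcreate -L 50G -n applv appvg
    mkfs.ext3 /dev/appvg/applv
    mount /dev/appvg/applv /app

    # Later: grow the logical volume by 20 GB and resize the filesystem
    lvextend -L +20G /dev/appvg/applv
    resize2fs /dev/appvg/applv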
Environment: RHEL 5/4/3, SUSE, LVM, VMware, RAID