Sr. Hadoop Administrator Resume
San Jose, CA
SUMMARY:
- Around 8 years of professional experience, including around 3 years as a Linux Administrator and 4+ years in Big Data analytics as a Sr. Hadoop/Big Data Administrator.
- Experience in architecting, designing, installing, configuring and managing Apache Hadoop clusters on the MapR, Hortonworks and Cloudera distributions.
- Installation and configuration of Linux servers (Red Hat and Ubuntu), managing running jobs and scheduling Hadoop MapReduce jobs.
- Commissioning and decommissioning nodes on a running Hadoop cluster.
- Good understanding of Microsoft Analytics Platform System (APS) and HDInsight.
- Experience in managing the Hadoop infrastructure with Cloudera Manager.
- Good understanding of Kerberos and how it interacts with Hadoop and LDAP.
- Practical knowledge of the functionality of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.
- Experience in understanding and managing Hadoop Log Files.
- Experience with Hadoop's multiple data-processing engines, such as interactive SQL, real-time streaming, data science and batch processing, handling data stored on a single platform under YARN.
- Experience in adding and removing nodes in a Hadoop cluster.
- Experience in Change Data Capture (CDC) data modeling approaches.
- Experience in managing Hadoop clusters with IBM BigInsights and the Hortonworks Data Platform.
- Experience in extracting data from RDBMS into HDFS using Sqoop (see the sketch after this list).
- Experience with bulk-load tools such as DW Loader and with moving data from PDW to a Hadoop archive.
- Experience in collecting logs from log collectors into HDFS using Flume.
- Experience in setting up and managing the Oozie batch scheduler.
- Good understanding of NoSQL databases such as HBase, Neo4j and MongoDB.
- Experience in analyzing data in HDFS through MapReduce, Hive and Pig.
- Designed, implemented and reviewed features and enhancements for Cassandra.
- Experience in tuning large/complex SQL queries and managing alerts from PDW and Hadoop.
- Deployed a Cassandra cluster in a cloud environment per requirements.
- Hands-on experience in administration and maintenance of Oracle Database 10g/11g/12c.
- Experience with UNIX commands and shell scripting.
- Experience in Python Scripting.
- Experience in statistics collection and table maintenance on MPP platforms.
- Experience in creating physical data models for data warehousing.
- Extensively worked on ETL mappings and on the analysis and documentation of OLAP report requirements; solid understanding of OLAP concepts and challenges, especially with large data sets.
- Experience integrating various data sources such as Oracle, DB2, Sybase, SQL Server and MS Access, as well as non-relational sources such as flat files, into a staging area.
- Experience in Data Analysis, Data Cleansing (Scrubbing), Data Validation and Verification, Data Conversion, Data Migrations and Data Mining.
- Excellent interpersonal, communication, documentation and presentation skills.
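Below is a minimal, illustrative sketch of the Sqoop-based RDBMS-to-HDFS extraction referenced above; the JDBC URL, credentials, table name and target directory are placeholder assumptions, not details from an actual engagement.

# Import a single RDBMS table into HDFS with four parallel mappers (placeholder values)
sqoop import \
  --connect jdbc:mysql://db-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4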
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: Hadoop 2.4.1, HDFS, MapReduce, HBase, Pig, Hive, Sqoop, YARN, Flume, ZooKeeper, Spark, Cassandra, Storm, MongoDB, Hue, Impala, Whirr, Kafka, Mahout and Oozie
Programming Languages & Tools: Java, SQL, PL/SQL, Shell Scripting, Python, Perl, RMAN, Oracle Enterprise Manager (OEM), VMware, vi editor, SQL Developer
Databases: Oracle 9i/10g/11g, SQL Server, MySQL
Database Tools: TOAD, Chordiant CRM tool, Billing tool, Oracle Warehouse Builder (OWB).
Operating Systems: Linux, Unix, Windows, Mac, CentOS
Other Concepts: OOPS, Data Structures, Algorithms, Software Engineering, ETL
PROFESSIONAL EXPERIENCE:
Confidential, San Jose, CA
Sr. Hadoop Administrator
Responsibilities:
- Handle the installation and configuration of a Hadoop cluster.
- Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
- Monitor the data streaming between web sources and HDFS.
- Worked with Kerberos and its integration with Hadoop and LDAP.
- Worked on Kafka, a distributed, partitioned, replicated commit-log service that provides the functionality of a messaging system.
- Closely monitored and analyzed MapReduce job execution on the cluster at the task level.
- Commissioned and decommissioned Hadoop nodes and performed data re-balancing.
- Provided input to development teams on efficient utilization of resources such as memory and CPU, based on runtime statistics of map and reduce tasks.
- Experience with APIs, the software intermediaries that make it possible for application programs to interact with each other and share data.
- Worked extensively with Amazon Web Services and created Amazon Elastic MapReduce clusters on both Hadoop 1.0.3 and 2.2.
- Worked with Kerberos, Active Directory/LDAP and Unix-based file systems.
- Presented demos to customers on how to use AWS and how it differs from traditional systems.
- Worked with REST APIs that expose specific software functionality while protecting the rest of the application.
- Experience with Nagios, including writing Nagios plugins to perform multiple server checks.
- Adjusted cluster configuration properties based on the volume of data being processed and the performance of the cluster.
- Setting up Identity, Authentication, and Authorization.
- Maintained the cluster so it remained healthy and in optimal working condition.
- Handled upgrades and patch updates.
- Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
- Experience in architecting, designing, installing, configuring and managing Apache Hadoop on the Hortonworks distribution.
- Worked with Unix commands and shell scripting.
- Experience in Python Scripting.
- Core competencies in Java, HTTP, XML and JSON.
- Worked on Spark, a fast, general-purpose cluster computing system.
- Worked on Storm, a distributed real-time computation system that provides a set of general primitives for stream processing.
- Balanced HDFS manually to decrease network utilization and increase job performance (see the sketch after this list).
- Commissioned and decommissioned DataNodes in the cluster when problems arose.
- Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Set up and managed NameNode High Availability and NameNode federation on Apache Hadoop 2.0 to avoid single points of failure in large clusters.
- Experience with GitHub, a web-based Git repository hosting service that offers all of the distributed revision control and source code management (SCM) functionality of Git while adding its own features.
- Held regular discussions with other technical teams regarding upgrades, process changes, special processing and feedback.
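A minimal sketch of the manual HDFS re-balancing and DataNode decommissioning described above; the threshold, hostname and exclude-file path are placeholder assumptions.

# Re-balance block placement so each DataNode stays within 10% of average utilization
hdfs balancer -threshold 10

# Decommission a DataNode: add it to the excludes file referenced by
# dfs.hosts.exclude in hdfs-site.xml, then have the NameNode re-read the list
echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes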
Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux.
Confidential
Hadoop Administrator
Responsibilities:
- Installed and configured Hadoop; responsible for maintaining the cluster and for managing and reviewing Hadoop log files.
- Experience with Amazon Redshift, a fully managed petabyte-scale data warehouse service designed for analytic workloads that connects to standard SQL-based clients and business intelligence tools.
- Experience in architecting, designing, installing, configuring and managing Apache Hadoop on the Hortonworks Data Platform (HDP).
- Worked with Redshift, which delivers fast query and I/O performance for virtually any size of dataset by using columnar storage technology and by parallelizing and distributing queries across multiple nodes.
- Worked on Storm, a distributed real-time computation system that provides a set of general primitives for stream processing.
- Experience in Hortonworks Data Platform (HDP) cluster installation and configuration.
- Experience with Kerberos, Active Directory/LDAP and Unix-based file systems.
- Load data from various data sources into HDFS using Flume.
- Worked on statistics collection and table maintenance on MPP platforms.
- Worked on the Cloudera platform to analyze data stored in HDFS.
- Worked extensively on Hive and Pig.
- Worked on Kafka, a distributed, partitioned, replicated commit-log service that provides the functionality of a messaging system.
- Worked on Spark, a fast, general-purpose cluster computing system.
- Experience writing code in Python and shell scripts.
- Experience with source code management tools; proficient in Git, SVN and AccuRev.
- Experience in test-driven development; wrote test cases in JUnit.
- Worked on large sets of structured, semi-structured and unstructured data.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs (see the sketch after this list).
- Worked with bulk-load tools such as DW Loader and moved data from PDW to a Hadoop archive.
- Participated in design and development of scalable and custom Hadoop solutions as per dynamic data needs.
- Good understanding of Change Data Capture (CDC) data modeling approaches.
- Coordinated with technical team for production deployment of software applications for maintenance.
- Good knowledge of reading data from and writing data to Cassandra.
- Provided operational support services relating to Hadoop infrastructure and application installation.
- Handled the imports and exports of data onto HDFS using Flume and Sqoop.
- Supported technical team members in management and review of Hadoop log files and data backups.
- Participated in development and execution of system and disaster recovery processes.
- Formulated procedures for installation of Hadoop patches, updates and version upgrades.
- Automated processes for troubleshooting, resolution and tuning of Hadoop clusters.
- Set up automated processes to send alerts in case of predefined system and application level issues.
- Set up automated processes to send notifications in case of any deviations from the predefined resource utilization.
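A minimal sketch of the Hive table creation, data load and query work mentioned above, driven from the shell through the Hive CLI; the table, columns and paths are placeholder assumptions.

# Create a simple table and load raw data that already sits in HDFS (placeholder names)
hive -e "
CREATE TABLE IF NOT EXISTS web_logs (ip STRING, ts STRING, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
LOAD DATA INPATH '/data/raw/web_logs' INTO TABLE web_logs;"

# A simple aggregate that Hive compiles into MapReduce jobs on the cluster
hive -e "SELECT url, COUNT(*) AS hits FROM web_logs GROUP BY url;"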
Environment: Red Hat Linux/CentOS 4, 5, 6, Logical Volume Manager, Hadoop, VMware ESX 5.1/5.5, Apache and Tomcat web servers, Oracle 11/12, Oracle RAC 12c, HPSM, HPSA.
Confidential
Oracle DBA
Responsibilities:
- Installation, Patching, Cloning, Upgrade, AutoConfig, AD Utilities, Platform Migration and NLS Implementation.
- Applying patches to the 11i and R12 instance.
- Setting up cron jobs for monitoring the database.
- Installation of E-Business Suite R12 for the Dev/UAT/CRP1 instances.
- Installation of single-node and multi-node configurations on Linux and Solaris.
- Experience in reconstructing databases in the development environment.
- Installation of software, creation of databases and decommissioning of databases.
- Experienced in upgrading databases and applying patches.
- Managing tablespaces (adding and resizing tablespaces).
- Checking cron jobs, rescheduling processes and monitoring.
- Creating users and granting the necessary privileges per application requests.
- Knowledge of the administration of RAC databases.
- Involved in resolving cluster failure issues in the RAC environment.
- Performed reorganization of tablespaces and refreshes of databases.
- Reconstructed a critical database as part of improving database performance.
- Proficient in renaming users per application users' requests.
- Generated AWR and Statspack reports to diagnose database performance.
- Managed export, hot and RMAN backups.
- Implemented backup/restore procedures in ARCHIVELOG mode (see the sketch after this list).
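A minimal sketch of the kind of cron-driven hot (ARCHIVELOG-mode) RMAN backup referenced above; the script name, schedule, ORACLE_SID, ORACLE_HOME and paths are placeholder assumptions.

#!/bin/bash
# rman_backup.sh - hypothetical nightly hot backup script, driven by a cron entry such as:
#   30 1 * * * /home/oracle/scripts/rman_backup.sh >> /home/oracle/logs/rman_backup.log 2>&1
export ORACLE_SID=PROD
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
export PATH=$ORACLE_HOME/bin:$PATH

# Back up the database plus archived redo logs, then prune obsolete backups
rman target / <<EOF
BACKUP DATABASE PLUS ARCHIVELOG;
DELETE NOPROMPT OBSOLETE;
EOF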
Environment: Linux, Solaris, HP-UX & AIX.
Confidential
Linux Administrator
Responsibilities:
- Installing and upgrading OE & Red Hat Linux and Solaris 8/9/10 (x86 & SPARC) on servers such as HP DL380 G3, G4 and G5 and Dell PowerEdge servers.
- Experience with LDOMs; created sparse-root and whole-root zones, administered zones for web, application and database servers, and worked with SMF on Solaris 10.
- Experience working in the AWS cloud environment with services such as EC2 & EBS.
- Implemented and administered VMware ESX 3.5 and 4.x for running Windows, CentOS, SUSE and Red Hat Linux on development and test servers.
- Installed and configured Apache on Linux and Solaris, configured virtual hosts and applied SSL certificates.
- Implemented JumpStart on Solaris and Kickstart for Red Hat environments.
- Experience working with HP LVM and Red Hat LVM.
- Experience in implementing P2P and P2V migrations.
- Involved in installing and configuring CentOS and SUSE 11 & 12 servers on HP x86 servers.
- Implemented HA using Red Hat Cluster and Veritas Cluster Server 5.0 with the WebLogic agent.
- Managed DNS and NIS servers and troubleshot server issues.
- Troubleshot application issues on Apache web servers as well as database servers running on Linux and Solaris.
- Experience migrating Oracle and MySQL data using Double-Take products.
- Used Sun Volume Manager on Solaris and LVM on Linux & Solaris to create volumes with layouts such as RAID 1, 5, 10 and 51.
- Recompiled the Linux kernel to remove services and applications that were not required.
- Performed performance analysis using tools such as prstat, mpstat, iostat, sar, vmstat, truss and DTrace.
- Experience working with LDAP user accounts and configuring LDAP on client machines.
- Upgraded ClearCase from 4.2 to 6.x running on Linux (CentOS & Red Hat).
- Worked on patch management tools like Sun Update Manager.
- Experience supporting middleware servers running Apache, Tomcat and Java applications.
- Worked on day-to-day administration tasks and resolved tickets using Remedy.
- Used HP Service Center and the change management system for ticketing.
- Administered WebLogic 9 and JBoss 4.2.2 servers, including installation and deployments.
- Worked on F5 load balancers to load-balance and reverse-proxy WebLogic servers.
- Wrote shell scripts to automate regular tasks such as removing core files, backing up important files and transferring files among servers (see the sketch after this list).
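A minimal sketch of the kind of housekeeping script described in the last bullet; the paths, backup host and retention period are placeholder assumptions.

#!/bin/bash
# Remove core files older than 7 days from application directories
find /opt/app -type f -name 'core*' -mtime +7 -delete

# Archive key configuration files and copy the archive to a backup host
BACKUP="/var/backups/etc-$(date +%F).tar.gz"
tar -czf "$BACKUP" /etc/httpd /etc/sysconfig
scp "$BACKUP" backupuser@backup-host:/backups/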
Environment: Solaris 8/9/10, HP-UX, Linux & AIX servers, Veritas Volume Manager, web servers, LDAP directory, Active Directory, BEA WebLogic servers, SAN switches, Apache, Tomcat servers, WebSphere application server.