Hadoop Administrator Resume
Boston, MA
EXPERIENCE SUMMARY:
- 8+ years of overall experience in Systems Administration and Application Administration across diverse industries.
- Over 2 years of comprehensive experience as a Hadoop Administrator.
- Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, Hive, Sqoop, Pig, HDFS, HBase, ZooKeeper, Oozie, and Flume.
- Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance, across different Hadoop distributions: Cloudera CDH, Apache Hadoop, and Hortonworks.
- Well versed in installing, upgrading, and managing the Apache, Cloudera (CDH4), and Hortonworks distributions of Hadoop.
- Experience applying Hadoop patches and performing major and minor version upgrades of the Apache, Cloudera, and Hortonworks distributions.
- Experience monitoring workload and job performance and collecting metrics for Hadoop clusters, when required, using Ganglia, Nagios, and Cloudera Manager.
- Advanced knowledge of troubleshooting operational issues and finding root causes when managing Hadoop clusters.
- Analyzed large data sets using Hive queries and Pig scripts.
- In-depth understanding of data structures and algorithms.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems (see the sketch following this list).
- Experienced in extending Hive and Pig core functionality by writing custom UDFs in Java.
- Experience in developing solutions to analyze large data sets efficiently.
- Experience in data warehousing and ETL processes.
- Experienced with job workflow scheduling and monitoring tools such as Oozie, and with cluster coordination using ZooKeeper.
- Involved in performance tuning at different levels, particularly optimizing MapReduce programs that run on large datasets and are written once but executed over and over.
- Experience in designing and building highly available (HA) and fault-tolerant (FT) distributed systems.
- Experience in developing ETL processes using Hive, Pig, Sqoop, and the MapReduce framework.
- Experience in troubleshooting, finding root causes, debugging, and automating solutions for operational issues in production environments.
- Ability to learn and adapt quickly and to correctly apply new tools and technologies.
- Extensive experience creating class diagrams, activity diagrams, and sequence diagrams using the Unified Modeling Language (UML).
- Coordinated with software development teams on project implementation, analysis, technical support, data conversion, and deployment.
- Worked extensively on hardware capacity management and the procurement process.
- Skilled technologist with deep expertise in aligning solutions with strategic roadmaps to support business goals.
- Expert communicator, influencer, and motivator who speaks the language of both business and technology, building excellent rapport with fellow employees, peers, and executive management.
- Strong problem-solving skills; a quick learner and an effective individual contributor and team player with excellent communication and presentation skills.
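For illustration, a minimal sketch of the Sqoop import/export work referenced above; the connection string, credentials, table names, and HDFS paths are hypothetical placeholders, not details from any specific engagement:

    # Import a relational table into HDFS (all hosts/names are placeholders).
    sqoop import \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/sales/orders \
      --num-mappers 4

    # Export processed results from HDFS back to a relational table.
    sqoop export \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table order_metrics \
      --export-dir /data/sales/order_metrics

The -P flag prompts for the password at runtime rather than exposing it on the command line.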
TECHNICAL SKILLS:
Hadoop: HDFS, YARN, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, and Datameer
Databases: Oracle, ParAccel, Microsoft SQL Server, and Cassandra
Markup Languages: HTML, CSS, DHTML, and XML.
Application Servers: Apache Tomcat, Apache HTTP Server
Operating Systems: Red Hat Enterprise Linux 5.8 (Tikanga) and 6.3 (Santiago), UNIX, Windows.
Scripting Languages: Shell, JavaScript
Languages: C, SQL, PL/SQL, Java, PHP.
PROFESSIONAL EXPERIENCE:
Confidential, Boston, MA
Hadoop Administrator
Responsibilities:
- Worked with the networking, SA, and security teams during the design and implementation phases of the Big Data clusters.
- Designed and implemented the UNIX-level security model for the Big Data cluster.
- Designed and implemented HDFS-level security.
- Installed CDH3 and CDH4 (Cloudera's Distribution Including Apache Hadoop) through Cloudera Manager.
- Deployed the Hadoop ecosystem components.
- Installed the Datameer analytics tool.
- Administered the Hadoop clusters through Cloudera Manager.
- Worked on Hadoop cluster capacity planning and management.
- Worked on Hadoop cluster filesystem planning and management.
- Worked with the SA/security team to enable Kerberos on the Big Data cluster, creating and deploying Kerberos principals and keytab files (see the sketch at the end of this section).
- Worked on the Hue security setup, integrating Hue with a PAM backend for authentication/authorization.
- Integrated Hue with LDAP for authentication/authorization.
- Integrated Cloudera Manager with LDAP for authentication/authorization.
- Integrated Datameer with LDAP for authentication/authorization.
- Set up auditing of HDFS activity and entitlement reviews.
- Set up monitoring for the Hadoop ecosystem components.
- Performed Hadoop cluster monitoring and troubleshooting.
- Proficient in MapReduce and shell scripting.
- Worked on establishing an operational/governance model and a Change Control Board for the various lines of business running on the Big Data clusters.
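For illustration, a minimal sketch of the Kerberos principal/keytab workflow referenced above; the host, realm, and paths are hypothetical placeholders:

    # Create a service principal for a worker host and export it to a keytab.
    kadmin -q "addprinc -randkey hdfs/worker01.example.com@EXAMPLE.COM"
    kadmin -q "xst -k hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM"

    # Distribute the keytab and lock down its permissions.
    scp hdfs.keytab worker01.example.com:/etc/hadoop/conf/
    ssh worker01.example.com "chown hdfs:hadoop /etc/hadoop/conf/hdfs.keytab && chmod 400 /etc/hadoop/conf/hdfs.keytab"

    # Verify that a ticket can be obtained from the keytab.
    kinit -kt /etc/hadoop/conf/hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM
    klist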
Confidential, New York, NY
Hadoop Big Data Engineer
Responsibilities:
- Installed and set up Hadoop HDP 1.3 clusters for the development and production environments.
- Installed and configured Hive, Pig, Sqoop, Flume, Ambari, and Oozie on the Hadoop cluster.
- Planned the production cluster's hardware and software installation and communicated with multiple teams to get it done.
- Migrated data from Hadoop to an AWS S3 bucket using DistCp, and also migrated data between the old and new clusters using DistCp (see the sketch at the end of this section).
- Implemented the HDFS snapshot feature.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload and job performance and collected metrics when required.
- Installed Hadoop patches, updates, and version upgrades when required.
- Installed and configured Ambari, Hive, Pig, Sqoop and Oozie on the HDP 2.0 cluster.
- Implemented a High Availability and automatic-failover infrastructure for the NameNode, using ZooKeeper services to overcome this single point of failure.
- Performed a Major upgrade in development environment from HDP 1.3 to HDP 2.0.
- Worked with big data developers, designers, and scientists to troubleshoot MapReduce and Hive jobs and tune them for high performance.
- Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from a MySQL database into HDFS using Sqoop scripts.
- Automated the end-to-end workflow, from data preparation to the presentation layer, for the Artist Dashboard project using shell scripting.
- Developed MapReduce programs to extract and transform the data sets; the resulting data sets were loaded into Cassandra.
- Created Hive tables, loaded worldwide sales (WWS) transactional data from Oracle using Sqoop, and loaded the processed data into the ParAccel database.
- Orchestrated Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
- Validated the data via the Swagger API URL by providing the key for the corresponding column family.
- Used Maven for continuous build integration and deployment.
- Conducted root cause analysis (RCA) to find data issues and resolve production problems.
- Proactively involved in ongoing maintenance, support, and improvements of the Hadoop cluster.
- Performed data analytics in Hive and exported the resulting metrics back to an Oracle database using Sqoop.
- Involved in Minor and Major Release work activities.
- Collaborated with business users, product owners, and developers to contribute to the analysis of functional requirements.
Technologies: Hortonworks distributions (HDP 1.3 & HDP 2.0), Hive, Pig Latin, Ambari, Nagios, Ganglia, Cassandra, Tableau, and ParAccel.
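For illustration, a minimal sketch of the DistCp migration, HDFS snapshot, and Oozie submission bullets above; the cluster hostnames, bucket name, and paths are hypothetical placeholders, and S3 credentials are assumed to be configured in core-site.xml:

    # Copy a dataset from HDFS to an S3 bucket (older releases used s3n:// instead of s3a://).
    hadoop distcp hdfs://old-nn:8020/data/sales s3a://example-bucket/data/sales

    # Copy the same dataset from the old cluster to the new one.
    hadoop distcp hdfs://old-nn:8020/data/sales hdfs://new-nn:8020/data/sales

    # Enable and take an HDFS snapshot on a directory (requires admin rights).
    hdfs dfsadmin -allowSnapshot /data/sales
    hdfs dfs -createSnapshot /data/sales before-migration

    # Submit the Oozie workflow that chains the Sqoop, Pig, and Hive steps.
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run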
Confidential, Atlanta, GA
Hadoop Big Data Engineer
Responsibilities:
- Worked with technology and business groups for Hadoop migration strategy.
- Researched and recommended a suitable technology stack for the Hadoop migration, considering the current enterprise architecture.
- Installed and configured various components of Hadoop ecosystem and maintained their integrity.
- Validated and made recommendations on Hadoop infrastructure and data center planning, considering data growth.
- Worked with the Linux server admin team in administering the server hardware and operating system.
- Monitored the Hadoop cluster, workload, and job performance using Datadog and Cloudera Manager.
- Transferred data to and from the cluster using Sqoop and various storage media, such as Informix tables and flat files.
- Developed MapReduce programs and Hive queries to analyze sales patterns and the customer satisfaction index over the data in various relational database tables.
- Worked extensively on performance optimization, deriving appropriate design patterns for MapReduce jobs by analyzing I/O latency, map time, combiner time, reduce time, etc.
- Developed Pig scripts in areas where extensive coding needed to be reduced.
- Performed data analytics in Hive and exported the resulting metrics back to an Oracle database using Sqoop.
- Performed maintenance, monitoring, deployments, and upgrades across the infrastructure that supports all our Hadoop clusters.
- Implemented partitioning, dynamic partitions, and buckets in Hive (see the sketch at the end of this section).
- Debugged and troubleshot issues in the development and test environments.
- Conducted root cause analysis and resolved production problems and data issues.
- Proactively involved in ongoing maintenance, support, and improvements of the Hadoop cluster.
- Documented and managed failure/recovery procedures (loss of a NameNode, loss of a DataNode, replacement of hardware or a node).
- Followed agile methodology for the entire project.
Technologies: Hive, Cloudera Hadoop Distribution (CDH4), Cloudera Manager, Puppet, Ganglia, and Nagios.
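For illustration, a minimal sketch of the Hive partitioning and bucketing bullet above; the table and column names are hypothetical placeholders:

    -- sales_part.hql (illustrative; run with: hive -f sales_part.hql)
    -- A date-partitioned, bucketed table.
    CREATE TABLE sales_part (
      order_id BIGINT,
      amount   DOUBLE
    )
    PARTITIONED BY (sale_date STRING)
    CLUSTERED BY (order_id) INTO 32 BUCKETS;

    -- Allow dynamic-partition inserts and enforce bucketing on write.
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;
    SET hive.enforce.bucketing = true;

    -- The dynamic partition column comes last in the SELECT list.
    INSERT OVERWRITE TABLE sales_part PARTITION (sale_date)
    SELECT order_id, amount, sale_date FROM sales_raw;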
Confidential, San Diego
Sr. Oracle DBA
Responsibilities:
- Installed Oracle 9i and 10g RAC databases on HP-UX 11.0 and Red Hat 4.
- Installed RAC on Oracle 10g and worked with Oracle's ASM feature.
- Created development, testing, staging, and production environments for OLTP, DSS, and data warehousing databases using customized shell scripts, step by step.
- Involved in different scenarios for OLTP as well as DSS (data warehouse) databases.
- Provided instance-level performance monitoring and tuning for event waits, sessions, physical and logical I/O, and memory usage.
- Administered, optimized, and maintained the data warehouse and data mart databases.
- Responsible for the migration of PeopleSoft applications from DB2 on MVS to Oracle on AIX.
- Researched and applied PeopleSoft patches and product updates posted on PeopleSoft Customer Connection to the PeopleSoft applications.
- Used Informatica PowerCenter for ETL: extracting, transforming, and loading data from heterogeneous source systems.
- Installed and upgraded GoldenGate for the Oracle database to enable real-time data integration and continuous data availability, capturing and delivering updates of critical information as changes occur and providing continuous data synchronization across heterogeneous environments.
- Managed the migration of SQL Server 2005 databases to Oracle 11g.
- Scheduled and monitored all maintenance activities for SQL Server 2000, including database consistency checks and index defragmentation.
- Performed performance tuning, i.e., tuning RAC, applications, the shared pool, I/O distribution, rollback segments, the buffer cache, and redo mechanisms.
- Worked on Automatic Storage Management (ASM) for managing data in a SAN.
- Created scripts and documented the operational guide for Streams.
- Created and managed jobs using the AutoSys job scheduler.
- Performed uni-directional and bi-directional Streams replication from HP-UX boxes to SUSE Linux boxes.
- Extensive UNIX shell scripting in csh, plus Pro*C.
Technologies: Solaris and Linux, Oracle 9i/10g/11g, SQL Server 2000/2005, Toad, SQL*Loader, SQL*Plus, PeopleSoft, Informatica PowerCenter 7.2, Erwin, ClearCase, Foglight, AutoSys, SQL-BackTrack, BMC Patrol, Query Analyzer, Oracle Fail Safe.
Confidential
ORACLE DBA
Responsibilities:
- Involved in the design process using UML and RUP (Rational Unified Process).
- Reviewed the logical and physical database designs.
- Installed, upgraded, and maintained Oracle software products/versions on Linux, Sun Solaris, and HP-UX.
- Monitored databases and tuned performance using OEM (Oracle Enterprise Manager) and STATSPACK.
- Troubleshot various database performance problems through proper diagnosis at all levels: SQL, PL/SQL, database design, database tables, indexes, instance, memory, operating system, and Java calls.
- Performed an incremental migration from an Oracle 8i multimaster environment to an Oracle 9i multimaster replication environment.
- Installed RAC on Oracle 10g and worked with Oracle's ASM feature.
- Configured, migrated, and tuned Oracle 9i RAC to 10g RAC.
- Migrated an Oracle 8i database from Windows NT to Red Hat Linux with minimal downtime.
- Worked with the DataStage ETL tool and the Business Objects warehouse tool.
- Created and configured workflows, worklets, and sessions to transport data to target Oracle warehouse tables using Informatica Workflow Manager.
- Installed and configured an Oracle 10g database on a test server using Oracle standard procedures and OFA, for performance testing and the future 10g production implementation.
- Performed data modeling, database design, and creation using Erwin.
- Applied the 9.2.0.6 patch set and Oracle DST patches to Oracle databases.
- Performed day-to-day database administration tasks such as checking tablespace usage, the alert log, and trace files; monitoring disk usage; running table/index analyze jobs; and reviewing database backup logs (see the sketch at the end of this section).
- Analyzed and managed Oracle problems/defects through SRs and/or Metalink with Oracle Corporation, including the installation of patches.
- Wrote and modified UNIX shell scripts to manage the Oracle environment.
- Analyzed problems involving interactions/dependencies between UNIX and Oracle.
- Used PL/SQL and SQL*Loader as part of the ETL process to populate the operational data store.
- Security and user management: privileges, roles, auditing, profiling, and authentication.
Technologies: Solaris 8.0/9.0/10.0, OAS 10g, Red Hat Linux, Windows 2000/2003/NT, Oracle 8i (8.1.6/8.1.7), Oracle 9i/10gR1, Oracle Replication, Hot Standby, SQL*Plus, PL/SQL, SQL*Loader, SQL Server 2000, Toad, Erwin 7.2.
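For illustration, a minimal sketch of the day-to-day monitoring bullet above; the SID, paths, and threshold are hypothetical placeholders, and the usage-metrics view assumes a 10g-style data dictionary:

    #!/bin/sh
    # Daily check: tablespaces above 85% used, plus recent ORA- errors in the alert log.
    export ORACLE_SID=ORCL
    sqlplus -s "/ as sysdba" <<EOF
    SET PAGESIZE 100 LINESIZE 120
    SELECT tablespace_name, ROUND(used_percent, 1) AS pct_used
    FROM   dba_tablespace_usage_metrics
    WHERE  used_percent > 85
    ORDER  BY used_percent DESC;
    EOF

    # Alert log location varies by release; this is the classic bdump path.
    tail -500 /u01/app/oracle/admin/ORCL/bdump/alert_ORCL.log | grep "ORA-"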
Confidential
ORACLE DBA
Responsibilities:
- Performed database installations, creations, and upgrades of 8i/9i on Windows 2000/NT.
- Analyzed structures using the ANALYZE command.
- Involved in all phases of the project life cycle.
- Wrote, tested, and implemented various UNIX shell, PL/SQL, and SQL scripts to monitor the pulse of the database and system.
- Ensured appropriate backup and recovery procedures were in place to meet the company's requirements.
- Worked closely with developers on designing schemas and tuning PL/SQL and SQL code.
- Performed database tuning for Oracle 8i and 9i servers, including reorganization and SGA, kernel, redo log, and rollback tuning.
- Implemented an automated backup strategy using a UNIX shell script scheduled via the cron utility (see the sketch below).
Technologies: Windows 98/NT/2000, HP-UX, Oracle 7/8i, TOAD, database tools/utilities.
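For illustration, a minimal sketch of the cron-scheduled backup bullet above; the schedule, script path, and credentials are hypothetical placeholders:

    # Crontab entry (edit with: crontab -e) -- run the backup nightly at 02:00.
    0 2 * * * /home/oracle/scripts/db_backup.sh >> /home/oracle/logs/db_backup.log 2>&1

    #!/bin/sh
    # /home/oracle/scripts/db_backup.sh -- full export as a simple 8i-era backup option.
    export ORACLE_SID=ORCL
    export ORACLE_HOME=/u01/app/oracle/product/8.1.7
    export PATH=$ORACLE_HOME/bin:$PATH
    DATE=`date +%Y%m%d`
    exp system/change_me full=y file=/backup/full_${DATE}.dmp log=/backup/full_${DATE}.log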