Hadoop Administrator Resume
Boston, MA
EXPERIENCE SUMMARY:
- 8+ years of overall experience in Systems Administration and Application Administration across diverse industries.
- Over 2 years of comprehensive experience as a Hadoop Administrator.
- Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, Hive, Sqoop, Pig, HDFS, HBase, ZooKeeper, Oozie, and Flume.
- Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance, across different Hadoop distributions: Cloudera CDH, Apache Hadoop, and Hortonworks.
- Well versed in installing, upgrading, and managing the Apache, Cloudera (CDH4), and Hortonworks distributions of Hadoop.
- Experience applying Hadoop patches and performing major and minor version upgrades of the Apache, Cloudera, and Hortonworks distributions.
- Experience monitoring workload and job performance and collecting metrics for Hadoop clusters, when required, using Ganglia, Nagios, and Cloudera Manager.
- Advanced knowledge of troubleshooting operational issues and finding root causes when managing Hadoop clusters.
- Analyzed large data sets using Hive queries and Pig scripts.
- In-depth understanding of data structures and algorithms.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems (see the sketch following this list).
- Experienced in extending Hive and Pig core functionality by writing custom UDFs in Java.
- Experience in developing solutions to analyze large data sets efficiently.
- Experience in data warehousing and ETL processes.
- Experienced with job workflow scheduling and monitoring tools such as Oozie, and with cluster coordination using ZooKeeper.
- Involved in performance tuning at different levels, particularly optimizing MapReduce programs that run on large datasets and are written once but executed over and over.
- Experience in designing and building highly available (HA) and fault-tolerant (FT) distributed systems.
- Experience in developing ETL processes using Hive, Pig, Sqoop, and the MapReduce framework.
- Experience in troubleshooting, finding root causes, debugging, and automating solutions for operational issues in production environments.
- Ability to learn and adapt quickly and to correctly apply new tools and technologies.
- Extensive experience creating class diagrams, activity diagrams, and sequence diagrams using the Unified Modeling Language (UML).
- Coordinated with software development teams on project implementation, analysis, technical support, data conversion, and deployment.
- Worked extensively on hardware capacity management and the procurement process.
- Skilled technologist with deep expertise in aligning solutions with strategic roadmaps to support business goals.
- Expert communicator, influencer, and motivator who speaks the language of both business and technology, building excellent rapport with fellow employees, peers, and executive management.
- Strong problem-solving skills; a quick learner and an effective individual contributor and team player with excellent communication and presentation skills.
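For illustration, a minimal sketch of the Sqoop import/export work referenced above; the connection string, credentials, table names, and HDFS paths are hypothetical placeholders, not details from any specific engagement:

    # Import a relational table into HDFS (all hosts/names are placeholders).
    sqoop import \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/sales/orders \
      --num-mappers 4

    # Export processed results from HDFS back to a relational table.
    sqoop export \
      --connect jdbc:mysql://db-host:3306/sales \
      --username etl_user -P \
      --table order_metrics \
      --export-dir /data/sales/order_metrics

The -P flag prompts for the password at runtime rather than exposing it on the command line.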
TECHNICAL SKILLS:
Hadoop: HDFS, YARN, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, and Datameer
Databases: Oracle, ParAccel, Microsoft SQL Server, and Cassandra
Markup Languages: HTML, CSS, DHTML, and XML.
Application Servers: Apache Tomcat, Apache HTTP Server
Operating Systems: Red Hat Enterprise Linux 5.8 (Tikanga) and 6.3 (Santiago), UNIX, Windows.
Scripting Languages: Shell, JavaScript
Languages: C, SQL, PL/SQL, Java, PHP.
PROFESSIONAL EXPERIENCE:
Confidential, Boston, MA
Hadoop Administrator
Responsibilities:
- Worked with the networking, SA, and security teams during the design and implementation phases of the Big Data clusters.
- Designed and implemented the UNIX-level security model for the Big Data cluster.
- Designed and implemented HDFS-level security.
- Installed CDH3 and CDH4 (Cloudera's Distribution Including Apache Hadoop) through Cloudera Manager.
- Deployed the Hadoop ecosystem components.
- Installed the Datameer analytics tool.
- Administered the Hadoop clusters through Cloudera Manager.
- Worked on Hadoop cluster capacity planning and management.
- Worked on Hadoop cluster filesystem planning and management.
- Worked with the SA/security team to enable Kerberos on the Big Data cluster, creating and deploying Kerberos principals and keytab files (see the sketch at the end of this section).
- Worked on the Hue security setup, integrating Hue with a PAM backend for authentication/authorization.
- Integrated Hue with LDAP for authentication/authorization.
- Integrated Cloudera Manager with LDAP for authentication/authorization.
- Integrated Datameer with LDAP for authentication/authorization.
- Set up auditing of HDFS activity and entitlement reviews.
- Set up monitoring for the Hadoop ecosystem components.
- Performed Hadoop cluster monitoring and troubleshooting.
- Proficient in MapReduce and shell scripting.
- Worked on establishing an operational/governance model and a Change Control Board for the various lines of business running on the Big Data clusters.
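For illustration, a minimal sketch of the Kerberos principal/keytab workflow referenced above; the host, realm, and paths are hypothetical placeholders:

    # Create a service principal for a worker host and export it to a keytab.
    kadmin -q "addprinc -randkey hdfs/worker01.example.com@EXAMPLE.COM"
    kadmin -q "xst -k hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM"

    # Distribute the keytab and lock down its permissions.
    scp hdfs.keytab worker01.example.com:/etc/hadoop/conf/
    ssh worker01.example.com "chown hdfs:hadoop /etc/hadoop/conf/hdfs.keytab && chmod 400 /etc/hadoop/conf/hdfs.keytab"

    # Verify that a ticket can be obtained from the keytab.
    kinit -kt /etc/hadoop/conf/hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM
    klist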
Confidential, New York, NY
Hadoop Big Data Engineer
Responsibilities:
- Installed and set up Hadoop HDP 1.3 clusters for the development and production environments.
- Installed and configured Hive, Pig, Sqoop, Flume, Ambari, and Oozie on the Hadoop cluster.
- Planned the production cluster's hardware and software installation and communicated with multiple teams to get it done.
- Migrated data from Hadoop to an AWS S3 bucket using DistCp, and also migrated data between the old and new clusters using DistCp (see the sketch at the end of this section).
- Implemented the HDFS snapshot feature.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios; monitored workload and job performance and collected metrics when required.
- Installed Hadoop patches, updates, and version upgrades when required.
- Installed and configured Ambari, Hive, Pig, Sqoop and Oozie on the HDP 2.0 cluster.
- Implemented a High Availability and automatic-failover infrastructure for the NameNode, using ZooKeeper services to overcome this single point of failure.
- Performed a Major upgrade in development environment from HDP 1.3 to HDP 2.0.
- Worked with big data developers, designers, and scientists to troubleshoot MapReduce and Hive jobs and tune them for high performance.
- Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from a MySQL database into HDFS using Sqoop scripts.
- Automated the end-to-end workflow, from data preparation to the presentation layer, for the Artist Dashboard project using shell scripting.
- Developed MapReduce programs to extract and transform the data sets; the resulting data sets were loaded into Cassandra.
- Created Hive tables, loaded worldwide sales (WWS) transactional data from Oracle using Sqoop, and loaded the processed data into the ParAccel database.
- Orchestrated Sqoop scripts, Pig scripts, and Hive queries using Oozie workflows and sub-workflows.
- Validated the data via the Swagger API URL by providing the key for the corresponding column family.
- Used Maven for continuous build integration and deployment.
- Conducted root cause analysis (RCA) to find data issues and resolve production problems.
- Proactively involved in ongoing maintenance, support, and improvements of the Hadoop cluster.
- Performed data analytics in Hive and exported the resulting metrics back to an Oracle database using Sqoop.
- Involved in Minor and Major Release work activities.
- Collaborated with business users, product owners, and developers to contribute to the analysis of functional requirements.
Technologies: Hortonworks distributions (HDP 1.3 & HDP 2.0), Hive, Pig Latin, Ambari, Nagios, Ganglia, Cassandra, Tableau, and ParAccel.
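For illustration, a minimal sketch of the DistCp migration, HDFS snapshot, and Oozie submission bullets above; the cluster hostnames, bucket name, and paths are hypothetical placeholders, and S3 credentials are assumed to be configured in core-site.xml:

    # Copy a dataset from HDFS to an S3 bucket (older releases used s3n:// instead of s3a://).
    hadoop distcp hdfs://old-nn:8020/data/sales s3a://example-bucket/data/sales

    # Copy the same dataset from the old cluster to the new one.
    hadoop distcp hdfs://old-nn:8020/data/sales hdfs://new-nn:8020/data/sales

    # Enable and take an HDFS snapshot on a directory (requires admin rights).
    hdfs dfsadmin -allowSnapshot /data/sales
    hdfs dfs -createSnapshot /data/sales before-migration

    # Submit the Oozie workflow that chains the Sqoop, Pig, and Hive steps.
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run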
Confidential, Atlanta, GA
Hadoop Big Data Engineer
Responsibilities:
- Worked with technology and business groups for Hadoop migration strategy.
- Researched and recommended a suitable technology stack for the Hadoop migration, considering the current enterprise architecture.
- Installed and configured various components of Hadoop ecosystem and maintained their integrity.
- Validated and made recommendations on Hadoop infrastructure and data center planning, considering data growth.
- Worked with the Linux server admin team in administering the server hardware and operating system.
- Monitored the Hadoop cluster, workload, and job performance using Datadog and Cloudera Manager.
- Transferred data to and from the cluster using Sqoop and various storage media, such as Informix tables and flat files.
- Developed MapReduce programs and Hive queries to analyze sales patterns and the customer satisfaction index over the data in various relational database tables.
- Worked extensively on performance optimization, deriving appropriate design patterns for MapReduce jobs by analyzing I/O latency, map time, combiner time, reduce time, etc.
- Developed Pig scripts in areas where extensive coding needed to be reduced.
- Performed data analytics in Hive and exported the resulting metrics back to an Oracle database using Sqoop.
- Performed maintenance, monitoring, deployments, and upgrades across the infrastructure that supports all our Hadoop clusters.
- Implemented partitioning, dynamic partitions, and buckets in Hive (see the sketch at the end of this section).
- Debugged and troubleshot issues in the development and test environments.
- Conducted root cause analysis and resolved production problems and data issues.
- Proactively involved in ongoing maintenance, support, and improvements of the Hadoop cluster.
- Documented and managed failure/recovery procedures (loss of a NameNode, loss of a DataNode, replacement of hardware or a node).
- Followed agile methodology for the entire project.
Technologies: Hive, Cloudera Hadoop Distribution (CDH4), Cloudera Manager, Puppet, Ganglia, and Nagios.
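For illustration, a minimal sketch of the Hive partitioning and bucketing bullet above; the table and column names are hypothetical placeholders:

    -- sales_part.hql (illustrative; run with: hive -f sales_part.hql)
    -- A date-partitioned, bucketed table.
    CREATE TABLE sales_part (
      order_id BIGINT,
      amount   DOUBLE
    )
    PARTITIONED BY (sale_date STRING)
    CLUSTERED BY (order_id) INTO 32 BUCKETS;

    -- Allow dynamic-partition inserts and enforce bucketing on write.
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;
    SET hive.enforce.bucketing = true;

    -- The dynamic partition column comes last in the SELECT list.
    INSERT OVERWRITE TABLE sales_part PARTITION (sale_date)
    SELECT order_id, amount, sale_date FROM sales_raw;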
Confidential, San Diego
Sr. Oracle DBA
Responsibilities:
- Installed Oracle 9i and 10g RAC databases on HP-UX 11.0 and Red Hat 4.
- Installed RAC on Oracle 10g and worked with Oracle's ASM feature.
- Created development, testing, staging, and production environments for OLTP, DSS, and data warehousing databases using customized shell scripts, step by step.
- Involved in different scenarios for OLTP as well as DSS (data warehouse) databases.
- Provided instance-level performance monitoring and tuning for event waits, sessions, physical and logical I/O, and memory usage.
- Administered, optimized, and maintained the data warehouse and data mart databases.
- Responsible for the migration of PeopleSoft applications from DB2 on MVS to Oracle on AIX.
- Researched and applied PeopleSoft patches and product updates posted on PeopleSoft Customer Connection to the PeopleSoft applications.
- Used Informatica PowerCenter for ETL: extracting, transforming, and loading data from heterogeneous source systems.
- Installed and upgraded GoldenGate for the Oracle database to enable real-time data integration and continuous data availability, capturing and delivering updates of critical information as changes occur and providing continuous data synchronization across heterogeneous environments.
- Managed the migration of SQL Server 2005 databases to Oracle 11g.
- Scheduled and monitored all maintenance activities for SQL Server 2000, including database consistency checks and index defragmentation.
- Performed performance tuning, i.e., tuning RAC, applications, the shared pool, I/O distribution, rollback segments, the buffer cache, and redo mechanisms.
- Worked on Automatic Storage Management (ASM) for managing data in a SAN.
- Created scripts and documented the operational guide for Streams.
- Created and managed jobs using the AutoSys job scheduler.
- Performed uni-directional and bi-directional Streams replication from HP-UX boxes to SUSE Linux boxes.
- Extensive UNIX shell scripting in csh, plus Pro*C.
Technologies: Solaris and Linux, Oracle 9i/10g/11g, SQL Server 2000/2005, Toad, SQL*Loader, SQL*Plus, PeopleSoft, Informatica PowerCenter 7.2, Erwin, ClearCase, Foglight, AutoSys, SQL-BackTrack, BMC Patrol, Query Analyzer, Oracle Fail Safe.
Confidential
ORACLE DBA
Responsibilities:
- Involved in the design process using UML and RUP (Rational Unified Process).
- Reviewed the logical and physical database designs.
- Installed, upgraded, and maintained Oracle software products/versions on Linux, Sun Solaris, and HP-UX.
- Monitored databases and tuned performance using OEM (Oracle Enterprise Manager) and STATSPACK.
- Troubleshot various database performance problems through proper diagnosis at all levels: SQL, PL/SQL, database design, database tables, indexes, instance, memory, operating system, and Java calls.
- Performed an incremental migration from an Oracle 8i multimaster environment to an Oracle 9i multimaster replication environment.
- Installed RAC on Oracle 10g and worked with Oracle's ASM feature.
- Configured, migrated, and tuned Oracle 9i RAC to 10g RAC.
- Migrated an Oracle 8i database from Windows NT to Red Hat Linux with minimal downtime.
- Worked with the DataStage ETL tool and the Business Objects warehouse tool.
- Created and configured workflows, worklets, and sessions to transport data to target Oracle warehouse tables using Informatica Workflow Manager.
- Installed and configured an Oracle 10g database on a test server using Oracle standard procedures and OFA, for performance testing and the future 10g production implementation.
- Performed data modeling, database design, and creation using Erwin.
- Applied the 9.2.0.6 patch set and Oracle DST patches to Oracle databases.
- Performed day-to-day database administration tasks such as checking tablespace usage, the alert log, and trace files; monitoring disk usage; running table/index analyze jobs; and reviewing database backup logs (see the sketch at the end of this section).
- Analyzed and managed Oracle problems/defects through SRs and/or Metalink with Oracle Corporation, including the installation of patches.
- Wrote and modified UNIX shell scripts to manage the Oracle environment.
- Analyzed problems involving interactions/dependencies between UNIX and Oracle.
- Used PL/SQL and SQL*Loader as part of the ETL process to populate the operational data store.
- Security and user management: privileges, roles, auditing, profiling, and authentication.
Technologies: Solaris 8.0/9.0/10.0, OAS 10g, Red Hat Linux, Windows 2000/2003/NT, Oracle 8i (8.1.6/8.1.7), Oracle 9i/10gR1, Oracle Replication, Hot Standby, SQL*Plus, PL/SQL, SQL*Loader, SQL Server 2000, Toad, Erwin 7.2.
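For illustration, a minimal sketch of the day-to-day monitoring bullet above; the SID, paths, and threshold are hypothetical placeholders, and the usage-metrics view assumes a 10g-style data dictionary:

    #!/bin/sh
    # Daily check: tablespaces above 85% used, plus recent ORA- errors in the alert log.
    export ORACLE_SID=ORCL
    sqlplus -s "/ as sysdba" <<EOF
    SET PAGESIZE 100 LINESIZE 120
    SELECT tablespace_name, ROUND(used_percent, 1) AS pct_used
    FROM   dba_tablespace_usage_metrics
    WHERE  used_percent > 85
    ORDER  BY used_percent DESC;
    EOF

    # Alert log location varies by release; this is the classic bdump path.
    tail -500 /u01/app/oracle/admin/ORCL/bdump/alert_ORCL.log | grep "ORA-"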
Confidential
ORACLE DBA
Responsibilities:
- Performed database installations, creations, and upgrades of 8i/9i on Windows 2000/NT.
- Analyzed structures using the ANALYZE command.
- Involved in all phases of the project life cycle.
- Wrote, tested, and implemented various UNIX shell, PL/SQL, and SQL scripts to monitor the pulse of the database and system.
- Ensured appropriate backup and recovery procedures were in place to meet the company's requirements.
- Worked closely with developers on designing schemas and tuning PL/SQL and SQL code.
- Performed database tuning for Oracle 8i and 9i servers, including reorganization and SGA, kernel, redo log, and rollback tuning.
- Implemented an automated backup strategy using a UNIX shell script scheduled via the cron utility (see the sketch below).
Technologies: Windows 98/NT/2000, HP-UX, Oracle 7/8i, TOAD, database tools/utilities.
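For illustration, a minimal sketch of the cron-scheduled backup bullet above; the schedule, script path, and credentials are hypothetical placeholders:

    # Crontab entry (edit with: crontab -e) -- run the backup nightly at 02:00.
    0 2 * * * /home/oracle/scripts/db_backup.sh >> /home/oracle/logs/db_backup.log 2>&1

    #!/bin/sh
    # /home/oracle/scripts/db_backup.sh -- full export as a simple 8i-era backup option.
    export ORACLE_SID=ORCL
    export ORACLE_HOME=/u01/app/oracle/product/8.1.7
    export PATH=$ORACLE_HOME/bin:$PATH
    DATE=`date +%Y%m%d`
    exp system/change_me full=y file=/backup/full_${DATE}.dmp log=/backup/full_${DATE}.log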