Hadoop Developer Resume
MN
SUMMARY
- Around 8 years of IT experience as an ETL and Hadoop Developer in the implementation of data warehousing projects.
- Used ETL methodology to design jobs that extract and transform data, create, manage, and reuse metadata, and run and schedule jobs using DataStage.
- Functional and unit testing of programs/applications.
- Strong working experience with DataStage 8.5 technologies (DataStage Manager, DataStage Designer, DataStage Director).
- Extracted data from sources such as DB2, Teradata, Oracle, Sequential Files, and Complex Flat Files using the corresponding stages.
- Expert in using Parallel Extender, developing parallel jobs with stages including Aggregator, Join, Transformer, Sort, Merge, Filter, and Lookup, with profound knowledge of the SCD stage.
- Excellent experience in loading and maintaining Data Warehouses and Data marts using DataStage ETL processes.
- Good knowledge in DB2, Teradata and UNIX.
- Knowledge of designing Star Schema and Snowflake Schema for implementing Decision Support Systems.
- Performed performance tuning of the Jobs (ETL/ELT) by interpreting performance statistics of the jobs developed.
- Good knowledge of Big Data technologies: Hive, Sqoop, Pig, YARN, Python, and TDCH export.
- Worked with Sqoop, Hive, YARN, and the HDFS file system for archiving Teradata data into Hadoop.
- Created and ran Sqoop (version 1.4.6) jobs with incremental load to populate Hive external and managed tables.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance (a minimal DDL sketch follows this summary).
- Worked on Talend jobs using various components such as tHdfs, tHive, tFileInput, tAggregate, tConvert, tSort, tFilter, tMap, tJoin, and tReplace, against different databases.
- Experience in using SequenceFile, RCFile, and ORC file formats.
- Created Hive (version 1.2.1) scripts to build the foundation tables by joining multiple tables.
- Extensive experience in writing Pig (version 0.15) scripts to build consumable foundation tables.
- Solved performance issues in Hive and Pig scripts through an understanding of joins, grouping, and aggregation and how they translate into MapReduce jobs.
- Developed Oozie (version 4.2) workflow for scheduling and orchestrating the ETL Process.
- More than 1 year of onsite experience, from requirement gathering through implementation.
- Good understanding of Agile practices, with experience working in an Agile model.
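A minimal HiveQL sketch of the partitioned and bucketed table design mentioned above; the table names, columns, and HDFS path are hypothetical placeholders, not actual project objects.

    -- External table over raw landed data in HDFS (hypothetical names and path)
    CREATE EXTERNAL TABLE IF NOT EXISTS guest_order_raw (
        order_id     BIGINT,
        guest_id     BIGINT,
        order_status STRING,
        order_ts     TIMESTAMP
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
    STORED AS TEXTFILE
    LOCATION '/data/landing/guest_order';

    -- Managed table partitioned by load date and bucketed on the join key
    CREATE TABLE IF NOT EXISTS guest_order (
        order_id     BIGINT,
        guest_id     BIGINT,
        order_status STRING,
        order_ts     TIMESTAMP
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (order_id) INTO 32 BUCKETS
    STORED AS ORC;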
TECHNICAL SKILLS
BI Tools: DataStage 7.5.3, DataStage 8.5 (Manager, Director, and Designer), Talend, Control-M
Database: Teradata, DB2, Oracle.
Hadoop Technologies: YARN, MapReduce, Hive, Sqoop, TDCH, HBase, Pig, Oozie
Programming Languages: UNIX shell scripting (AIX 6.4), SQL, HiveQL
Tools: Quality Center, ServiceNow, HPSD, CDM, Teamprise, Erwin Data Modeler, VersionOne, Jira, Git
PROFESSIONAL EXPERIENCE
HADOOP Developer
Confidential, MN
Responsibilities:
- Created the Sqoop script to bring the FF ODS Oracle data into HDFS.
- Created the Sqoop script to bring the reference data from Teradata into HDFS.
- Created Talend jobs to read data from HDFS using tHdfs and load it into Hive tables.
- Sourced XML files from external vendors using IBM message queues and landed them on the edge node.
- Created Hive external and managed tables on top of HDFS.
- Designed and developed the foundation tables for the guest order life cycle.
- Created HiveQL to build the foundation tables by joining multiple Hive tables (see the sketch after this list).
- Used Pig Latin for building the foundation tables for the guest order life cycle.
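A hedged HiveQL sketch of the kind of join used to build a foundation table from multiple Hive tables; the table and column names and the variable substitutions are illustrative assumptions, not the actual project objects.

    -- Build a guest order life cycle foundation table from ODS and reference tables (illustrative)
    INSERT OVERWRITE TABLE fnd_guest_order_lifecycle PARTITION (load_date = '${hiveconf:load_date}')
    SELECT  o.order_id,
            o.guest_id,
            s.shipment_id,
            r.status_desc,
            o.order_ts
    FROM    ods_order o
    JOIN    ods_shipment s
            ON o.order_id = s.order_id
    LEFT JOIN ref_order_status r
            ON o.order_status = r.status_code
    WHERE   o.order_ts >= '${hiveconf:start_ts}';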
HADOOP Developer
Confidential, Minneapolis, MN
Responsibilities:
- Created a Sqoop framework to bring the data from Teradata and load it into HDFS.
- Created Hive external and managed tables using partitioning and clustering (bucketing).
- Created Hive managed tables with the compressed ORC file format (a minimal DDL sketch follows this list).
- Created an Oozie framework to run the data archival on a daily schedule.
- Created a framework to bring data from HDFS to Teradata using the Teradata utility TDCH export.
- Worked in an Agile model with a good understanding of Agile practices.
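A minimal HiveQL sketch of the compressed ORC managed table and its daily archival load; the table names, partition key, bucket count, and compression codec shown here are assumptions for illustration only.

    -- Managed table stored as compressed ORC, partitioned and bucketed (illustrative names)
    CREATE TABLE IF NOT EXISTS archive_orders (
        order_id  BIGINT,
        guest_id  BIGINT,
        order_amt DECIMAL(12,2)
    )
    PARTITIONED BY (archive_date STRING)
    CLUSTERED BY (order_id) INTO 16 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('orc.compress' = 'SNAPPY');

    -- Daily archival insert from an external staging table
    SET hive.enforce.bucketing = true;
    INSERT INTO TABLE archive_orders PARTITION (archive_date = '${hiveconf:run_date}')
    SELECT order_id, guest_id, order_amt
    FROM   stg_orders_ext
    WHERE  src_date = '${hiveconf:run_date}';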
ETL Developer
Confidential, Minneapolis, MN
Responsibilities:
- Interacted with the business to analyze requirements and provided technical designs to ensure deliverables are of high quality.
- Converted business functional specifications into technical designs.
- Optimized the source-to-Teradata landing load, significantly reducing load and run time, by creating a new framework to fetch data from 40-50 tables across 30 distribution centers, amounting to around 15 GB of data per day.
- Created Hive external and managed tables to store the data in the Hadoop file system.
- Created Sqoop scripts that fetch the data from Teradata and load it into Hive tables.
- Developed application modules based on new requirements, individually and in teams; gave technical sessions to new developers joining the project.
- Coordinated and collaborated with multiple teams in all phases of the project; communicated project status to all stakeholders.
- Coordinated with Teradata DBAs for performance tuning of SQL.
- Used Secure File Transfer Protocol (SFTP) to connect to external servers and retrieve the files.
- Performed impact analysis of change requests on schedule and effort.
- Led and fully owned the implementation from Minneapolis.
- Designed and developed Teradata SQL for the Type 1 and Type 2 tables (a hedged Type 2 sketch follows this list).
- Set up Control-M scheduling for all the jobs and modules.
- Effectively used Git to maintain code version control.
- Analyzed the data model, participated in the data modeling sessions, and passed comments on to the modeler.
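A hedged Teradata SQL sketch of the Type 2 load pattern referenced above (expire the current row, then insert the new version); the dimension and staging table names and columns are hypothetical.

    -- Expire the current Type 2 row when a tracked attribute changed (illustrative names)
    UPDATE tgt
    FROM   fnd_customer_dim AS tgt,
           stg_customer     AS src
    SET    row_end_dt  = CURRENT_DATE - 1,
           current_ind = 'N'
    WHERE  tgt.customer_id   = src.customer_id
      AND  tgt.current_ind   = 'Y'
      AND  tgt.customer_name <> src.customer_name;

    -- Insert the new version for changed keys and brand-new keys
    INSERT INTO fnd_customer_dim
        (customer_id, customer_name, row_start_dt, row_end_dt, current_ind)
    SELECT src.customer_id,
           src.customer_name,
           CURRENT_DATE,
           DATE '9999-12-31',
           'Y'
    FROM   stg_customer src
    LEFT JOIN fnd_customer_dim tgt
           ON  tgt.customer_id = src.customer_id
           AND tgt.current_ind = 'Y'
    WHERE  tgt.customer_id IS NULL;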
ETL Developer
Confidential
Responsibilities:
- Developed the Teradata ELT SQL to load Type 1 and Type 2 foundation tables.
- Used DataStage 8.5 to fetch the data from the Oracle source and load the Teradata landing tables.
- Created the WOs, implementation plan, P2P, and I profile documents for a smooth implementation.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
- Over 3 months of onsite experience for the implementation of the project.
- Coordinated with the testing team to review the testing strategy and test cases and ensured the maximum number of scenarios was covered.
- Set up the Control-M scheduling for all the Jobs.
- Conducted knowledge sharing sessions with testing and data quality teams.
ETL Developer
Confidential
Responsibilities:
- Performed extensive data profiling on the Oracle source systems to identify the source tables.
- Traveled onsite (Minneapolis) for requirement gathering through discussions with the onshore BPC.
- Designed the low-level design documents based on the business requirements.
- Developed the Teradata ELT SQL to load Type 1 and Type 2 foundation tables.
- Performed unit testing and component integration testing on the jobs.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
- Provided support for all implementation activities.
ETL Developer
Confidential
Responsibilities:
- Performed extensive data profiling on the Oracle source systems to identify the source tables.
- Designed the low-level design document for the Picking subject area based on the business requirements.
- Prepared the source-to-Confidential mapping, validated the landing tables, and unit tested the code.
- Developed the Teradata ELT SQL to load Type 1 and Type 2 foundation tables.
- Performed unit testing and component integration testing on the jobs.
- Set up the Control-M scheduling for all the jobs.
- Created the change request, RITM, and implementation documents for a smooth implementation.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
ETL Developer
Confidential
Responsibilities:
- Requirement gathering and analysis.
- Designed parallel DataStage jobs to load the landing and foundation tables using different components.
- Performed unit testing and component integration testing on the jobs.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
- Provided support for all implementation activities.
- Performed all aspects of software quality assurance and application testing.
- Interacted with different support groups.
ETL Developer
Confidential
Responsibilities:
- Analyzed the existing application.
- Removed the Hyperroll cube structure by rewriting it in DB2 SQL.
- Created parallel DataStage jobs to load the aggregate tables.
- Performed unit testing and component integration testing on the jobs.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
- Created unit testing and system testing scripts for change requests.
- Tested and documented the results.
- Provided support for all implementation activities.
ETL Developer
Confidential
Responsibilities:
- Analyzed the jobs for performance tuning.
- Converted the ETL jobs into ELT for better job performance.
- Worked with back-end database administrators to provide requirements for the necessary back-end tables.
- Performed unit testing and component integration testing on the jobs.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
- Tested and documented the results.
- Worked with the warranty support team to resolve any day-to-day issues.
ETL Developer
Confidential
Responsibilities:
- Requirement gathering and analysis.
- Performed extensive data profiling on the legacy systems to identify the source tables for the SDM.
- Developed DataStage jobs for all the modules (SDM/SDS/WDS) to transform data based on the business requirements.
- Analyzed, designed, developed, implemented, and loaded the history and incremental loads to the foundation tables.
- Developed jobs in Parallel Extender using different stages such as Join, Lookup, Copy, Row Generator, Column Generator, and Funnel.
- Developed Type I and Type II slowly changing dimension tables from several mainframe flat files and tables.
- Performed unit testing and component integration testing on the foundation tables.
- Tested the accuracy and correctness of the data and documented possible improvements and transition activities.
- Coordinated with the BDQ and testing teams on understanding defects and driving defect closure.
