DataStage SME/Developer (DataStage/Netezza) Resume
Coppell, Texas
SUMMARY
- Dynamic, results-oriented Information Server/DataStage SME/Architect/Developer with 13 years of experience implementing full life-cycle data warehousing projects
- Strong knowledge of Star and Snowflake schemas; involved in designing DW concepts such as conformed dimensions, conformed facts, and Data Warehouse bus architecture for ETL development
- Strong working experience in the Data Analysis, Design, Development, Implementation and Testing of Data Warehousing projects
- Experience working closely with Data Modelers and Architects, giving and receiving feedback to improve the Data Model so that it supports the structures needed for ETL development
- 13+ years of experience and expertise in IS/DataStage Enterprise Edition installations on all flavors of UNIX, Windows, and Linux Grid
- Experience in designing the DataStage Master Design Document for the project and Detail Design Document for a release to guide ETL (Extract, Transform and Load) development
- Expert in establishing DataStage patterns for the project/release in order to allow maximum reusability of the DataStage modules and processes
- Expert knowledge in DataStage best practices and in designing jobs for optimal performance, reusability and restartability
- Expert in configuring database connections for Oracle, DB2, Sybase and SQL Server from DataStage
- Expert in setting up parallel configuration files depending upon Client’s hardware architecture
- Performing routine daily functions of a DataStage Admin/ install expert
- Strong experience in making DataStage processes as repeatable as possible, resulting in significantly fewer man-hours, a lower defect ratio, faster task delivery, and greater cost effectiveness
- Strong, proven experience in making DataStage ETL processes run like a well-oiled machine, with less development time required as the project progresses
- Strong working experience in mentoring, training, guiding, and leading teams of three to four DataStage developers, both experienced and beginners, onsite and offsite
- Data Warehousing implementation experience using DataStage PX (DataStage Designer, DataStage Director, DataStage Manager, DataStage Administrator), DataStage Server
- Expertise in implementing DataStage PX, Sequencer, Server, Shared Container jobs and BuildOps
- Strong experience in coding using SQL, SQL*Plus, PL/SQL Procedures/Functions, Triggers and Packages
- Strong experience in Unix scripting and running DataStage jobs using Unix scripts
- Strong working experience in designing DataStage job scheduling outside the DataStage tool and also within the tool as required by Client/Customer company standards
- Expert in configuring and setting up DataStage PX on AIX, DB2 UDB, Teradata, Oracle and other environments
- Expert in maintaining Metadata at the database level using a combination of DS PX jobs
- Expert in designing the DataStage configuration files for optimal performance and best resource usage
- Expertise in integration of various data sources like DB2 UDB, Oracle 10g/9.x/8.x/7.x, SQL Server, MS Access, Teradata, and Sybase
- Experience in creating required documentation for production support hand off, and training production support on commonly encountered problems, possible causes for problems and ways to debug them
- Strong experience in Object Oriented Concepts, C, C++ and Java
- Strong experience in writing server routines in DataStage BASIC and parallel routines in C++ for custom logic, and leveraging them in DataStage jobs
- Received Advanced DataStage training from Confidential and also taught several DataStage classes
- Strong experience in Investigate, Standardize, Match and Survivorship operations
- Strong experience in integrating the QualityStage cleansing operations within a DataStage job before the actual ETL
- Strong experience in Name and Address standardization using the USNAME, USADDR and USAREA rule sets
- Experience in pattern overrides for non-standard/unrecognized patterns
- Experience in designing Match specifications using the Match designer in QualityStage
- Experience in designing the Survive rules for the Confidential output fields
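The survive step described above can be illustrated with a minimal sketch (plain Python standing in for QualityStage's survive rules; the field names and the "non-empty, most recently updated wins" tie-breaker are assumptions, not the actual rule set used):

```python
from datetime import date

def survive(records, fields):
    """Pick a surviving value per field from a group of duplicate records:
    prefer non-empty values, breaking ties by most recent update date."""
    survivor = {}
    for f in fields:
        candidates = [r for r in records if r.get(f)]  # skip null/empty values
        if candidates:
            best = max(candidates, key=lambda r: r["updated"])
            survivor[f] = best[f]
        else:
            survivor[f] = None  # no record supplied this field
    return survivor

# Hypothetical duplicate group for one customer
dupes = [
    {"name": "J Smith", "phone": "", "updated": date(2013, 1, 5)},
    {"name": "John Smith", "phone": "555-0100", "updated": date(2014, 3, 2)},
]
master = survive(dupes, ["name", "phone"])
```

In the real tool the same intent is expressed declaratively as survive rules per output field; the sketch only shows the shape of the decision.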
TECHNICAL SKILLS
- InfoSphere Information Server DataStage 11.3/9.x/8.x
- WebSphere Information Server DataStage 8.0.1
- QualityStage 8.0.1
- Information Analyzer 8.0.1
- DataStage EE 7.5.1 (DataStage Designer, DataStage Director, DataStage Manager, DataStage Administrator)
- DataStage Server
- ProfileStage
- QualityStage
- MetaStage
- Java
- C
- C++
- SQL
- PL/SQL
- COBOL
- UML
- FORTRAN
- Perl
- UNIX (AIX, Solaris, HP-UX)
- Linux
- MS-DOS
- Windows 9x/NT/2K/XP
- E-R Modeling
- Dimensional Modeling
- ERWin
- JSP
- Java Servlets
- EJB
- HTML
- XML
- JavaScript
- VB Script and UNIX Shell Scripting
- DB2
- Oracle
- SQL Server
- Teradata
- MS Access
- ASP.NET
- VB.NET
- MS OFFICE
- VB 5.0/6.0
- VC++
- FoxPro
- Netezza 7.x
PROFESSIONAL EXPERIENCE
Confidential, Coppell, Texas
DataStage SME/Developer (DataStage/Netezza)
Responsibilities:
- Involved in designing DataStage mappings and the technical documentation.
- Involved in creating jobs and analyzing scope of application, defining relationship within and between groups of data, star schema, etc.
- Used DataStage Designer to develop processes for extracting, cleansing, transforming, integrating and loading data into Data Warehouse database.
- Documented user requirements, translated requirements into system solutions, and developed the implementation plan and schedule.
- Identified and documented the data sources and transformation rules required to populate and maintain the data warehouse.
- Created DataStage jobs using different stages like Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Surrogate Key, Column Generator, and Row Generator.
- Trained and managed developers, advising other groups in the organization on SSIS development, data warehouse development, and ETL development best practices.
- Installed Confidential Information Server 9.1.2 on AIX with XMETA database on DB2
- Configured database connections to Oracle, Netezza and SQL Server for DataStage
- Created user IDs and granted access to Information Server suite modules based on roles and responsibilities
- Performed daily tasks of DataStage Administrator such as creating projects, doing backups, debugging aborted jobs, giving expert advice to developers on how to improve performance of their jobs
- Installed patches from Confidential as and when they became available
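For illustration, the delta classification performed by the Change Data Capture stage listed above can be sketched in plain Python (the record layout and key name are hypothetical; the real stage compares sorted before/after datasets and emits change codes):

```python
def change_data_capture(before, after, key):
    """Classify rows as insert/update/delete by comparing two snapshots
    keyed on the natural key, mirroring the CDC stage's change codes."""
    b = {r[key]: r for r in before}
    a = {r[key]: r for r in after}
    inserts = [a[k] for k in a.keys() - b.keys()]   # new keys
    deletes = [b[k] for k in b.keys() - a.keys()]   # vanished keys
    updates = [a[k] for k in a.keys() & b.keys() if a[k] != b[k]]  # changed rows
    return inserts, updates, deletes
```

Downstream, the insert/update/delete streams would feed separate load paths, much as Change Apply consumes the CDC stage's output.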
Environment: Confidential InfoSphere Information Server DataStage 9.1.2, AIX, SQL Server, DB2, ORACLE, Netezza, SQL, Windows, UNIX Shell Scripts, MS Office
Confidential, Dallas, Texas
DataStage SME/ Architect
Responsibilities:
- Installed Information Server 11.3 on AIX with DB2 as the XMETA repository.
- Upgraded 8.5 to 9.1.2 with DB2 as xmeta repository and configured Netezza, SQL Server and Oracle connections.
- Installed 8.5 Fix Pack 1, Fix Pack 2 for Information Server
- Installed 8.5 with DB2 as XMETA repository and configured Oracle, SQL Server and DB2 connections.
- Performed all server-related daily admin activities, such as designing configuration files, user creation, log purging, and so on.
- Designed and architected the entire ETL infrastructure for 8.5/DB2 and 9.1.2/Netezza.
- Architected and developed the conformed ETL architecture for loading Confidential atomic Data Warehouse and Data Marts for Healthcare
- Architected and developed the operational metadata schema for capturing job-run metadata and interface runs.
- Architected and developed the auditing and reject handling capabilities.
- Worked in the capacity of Data Architect for the past year to bring new subject areas into the Confidential atomic Data Warehouse
- Worked as Data Architect and designed dimensional structures needed for Pharmacy reporting.
- Worked on design template jobs to load staging, the Confidential atomic data warehouse for Healthcare, and the data mart on DB2 using InfoSphere 8.5/9.1.2/11.3
- Developed DataStage ETL jobs to load foundation data like Patient, Encounter, Point Of Care location, Organization.
- Developed DataStage ETL jobs to load clinical data like Medications, Clinical orders and findings and Groupers
- Designed and developed ETL processes to support Pharmacy reporting with subject areas like GL, Interventions and Orders.
- Participated in all stages of the development life cycle, including requirement analysis, design, development, and implementation of ETL jobs to load raw staging, conformed staging, the EDW layer, and the data mart.
- Worked extensively on DataStage 8.5 and on the 9.1.2/11.3 migration
- Worked on converting DataStage jobs with DB2 database connections to Netezza connections using the Confidential RJUT (Rapid Job Update Tool); also worked on updating source/Confidential SQL as necessary.
- Fine-tuned jobs/processes for higher performance and debugged critical/complex jobs.
- Conducted design and code reviews and produced extensive documentation of standards, best practices, and ETL procedures.
- Extensively worked with DDL, DML, and DCL on databases to support ETL processes.
- Ensured issues were identified, tracked, reported on, and resolved in a timely manner.
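A minimal sketch of the auditing and reject-handling pattern described above (plain Python for illustration; the validation rules and column names are hypothetical stand-ins for the real DataStage constraints and reject links):

```python
def load_with_rejects(rows, validators):
    """Split rows into clean and rejected streams, tagging each reject
    with the rules it failed, and keep audit counts for the run."""
    clean, rejects = [], []
    for row in rows:
        failed = [name for name, check in validators.items() if not check(row)]
        if failed:
            rejects.append({**row, "reject_reasons": failed})  # reject link
        else:
            clean.append(row)                                  # main load path
    # Operational metadata captured per job run
    audit = {"input": len(rows), "loaded": len(clean), "rejected": len(rejects)}
    return clean, rejects, audit
```

The audit dictionary corresponds to the per-run row counts an operational metadata schema would record; the reject stream would land in a reject table or file for review.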
Environment: Confidential InfoSphere Information Server DataStage 9.1, 8.5, AIX, SQL Server, DB2, ORACLE, Netezza, SQL, Windows, UNIX Shell Scripts, MS Office
Confidential, Dearborn, MI
InfoSphere DataStage Architect/Developer/Administrator
Responsibilities:
- Designed, and mentored others on designing, DataStage jobs for loading prestaging, staging, and the data mart with performance as a key factor
- Designed and developed DataStage Shared Container/RCP-enabled jobs for all the common modules to allow reusability of code
- Designed and developed DataStage jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Implemented sequences with restartability to run DataStage jobs
- Designed prestaging, staging, and Data Mart schemas along with the Data Modeler, providing ETL input
- Reviewed and approved code changes that are implemented as a part of releases or support activities.
- Installed InfoSphere Information Server 8.5 on Windows 2008 R2 on Development, UAT and Production
Environment: Confidential InfoSphere Information Server DataStage 8.5, Confidential Identity Insight, AIX, SQL Server, DB2, SQL, Windows, UNIX Shell Scripts, MS Office
Confidential, Livingston, NJ
DataStage Architect/Developer
Responsibilities:
- Trained and mentored CIT’s employees on the DataStage/QualityStage tools and best practices/patterns in ETL development using DataStage and QualityStage
- Worked as a lead in designing the entire ETL architecture for Trade Finance
- Extensive experience with SQL Server, Oracle and DB2.
- Investigated the data using character discrete, character concatenate and Word investigation techniques.
- Standardized names and addresses using Confidential QualityStage.
- Extensive experience with real-time address validation using plug-ins such as CASS, MNS and WAVES
- Extensively involved in developing data cleansing procedures using QualityStage to standardize names, addresses and area.
- Created pattern and data overrides to override the data values that are not handled by the rule sets.
- Validated Phone numbers and email addresses using Confidential QualityStage.
- Developed Match Specification for Implementing Unduplicate/Reference Match to identify Matches and Duplicates in the Source data.
- Designed and developed jobs using DataStage/QualityStage 8.x for loading the data into Dimension and Fact Tables.
- Wrote generic UNIX scripts for executing DataStage jobs through Control-M
- Wrote stored procedures and executed them from DataStage
- Designed and fine-tuned DataStage PX jobs with performance as a key factor
- Designed and developed DataStage Shared Container jobs for all the common modules (e.g. generation of surrogate key) to allow reusability of code
- Designed and developed DataStage PX jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Developed build-ops using C/C++ to simplify and reuse business logic
- Extensively used Business Objects to develop canned and ad hoc reports
- Established best practices for DataStage PX jobs to ensure optimal performance, reusability, and re-startability before development effort started
- Fully designed and developed the purge process for the staging-area fact and dimension tables using a combination of DataStage PX jobs and UNIX scripts
- Designed and developed common modules for error checking (e.g. to check if the reject records output file is empty and to check if there are duplicate natural keys in a given table)
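A sketch of the generic job-execution wrapper mentioned above, translated to Python for illustration (the real scripts were UNIX shell invoked by Control-M; `dsjob -run`, `-param`, and `-jobstatus` are standard DataStage command-line options, while the project and job names here are placeholders):

```python
import subprocess

def dsjob_command(project, job, params=None):
    """Build the dsjob invocation that runs a DataStage job and blocks
    until completion (-jobstatus makes dsjob return the job's status)."""
    cmd = ["dsjob", "-run"]
    for name, value in (params or {}).items():
        cmd += ["-param", f"{name}={value}"]   # job parameters, name=value
    cmd += ["-jobstatus", project, job]
    return cmd

def run_job(project, job, params=None):
    # A scheduler such as Control-M would call a wrapper like this and
    # use the exit code to decide success/failure of the step.
    return subprocess.call(dsjob_command(project, job, params))
```

Keeping command construction separate from execution makes the wrapper easy to reuse across projects and to test without a DataStage engine present.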
Environment: Confidential InfoSphere Information Server DataStage 8.x, Ascential DataStage EE 7.5.1, Linux, ORACLE, SQL, Windows, UNIX Shell Scripts, MS Office
InfoSphere Information Analyzer/Business Glossary/Metadata Workbench
Confidential, Westlake Village, CA
Responsibilities:
- Installed InfoSphere Information Server 8.5 on the Amazon cloud service for Development, UAT, and Production
- Configured Parallel configuration files based on Client’s hardware configuration
- Configured database connections to Oracle, Sybase and SQL Server for DataStage
- Performed tasks of DataStage Administrator such as creating projects, doing backups, debugging aborted jobs, giving expert advice to developers on how to improve performance of their jobs
- Designed processes to load staging area with deltas on every load
- Designed, and mentored others on designing, DataStage jobs for loading prestaging, staging, and the data mart with performance as a key factor
- Designed and developed DataStage Shared Container jobs for all the common modules to allow reusability of code
- Designed and developed DataStage jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Implemented sequences with restartability to run DataStage jobs
- Designed prestaging, staging, and Data Mart schemas along with the Data Modeler, providing ETL input
- Reviewed and approved code changes that are implemented as a part of releases or support activities.
Confidential, Dallas, TX
Responsibilities:
- Designed Information Analyzer processes to analyze the source data
- Designed Metadata Workbench processes for data lineage, impact lineage and business lineage
- Designed Business Glossary Terms and Categories
- Designed processes to load staging area with deltas on every load
- Designed, and mentored others on designing, DataStage jobs for loading prestaging, staging, and the data mart with performance as a key factor
- Designed and developed DataStage Shared Container jobs for all the common modules to allow reusability of code
- Designed and developed DataStage jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Implemented sequences with restartability to run DataStage jobs
- Designed prestaging, staging, and Data Mart schemas along with the Data Modeler, providing ETL input
- Reviewed and approved code changes that are implemented as a part of releases or support activities.
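The column analysis performed by Information Analyzer can be illustrated with a minimal sketch (plain Python; the metrics shown are a small, assumed subset of what the tool actually reports per column):

```python
def profile_column(values):
    """Basic column analysis: row count, null rate, cardinality, and
    min/max domain values, computed over one column of source data."""
    non_null = [v for v in values if v not in (None, "")]  # treat "" as null
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }
```

Results like these feed decisions about candidate keys, domain rules, and the transformation rules documented for ETL development.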
Confidential, Mount Laurel, NJ
DataStage Architect/ Developer
Responsibilities:
- Designed data warehouse processes for loading data into Staging (S1), Warehouse (S2), and data marts.
- Designed processes to load staging area with deltas on every load
- Designed, and mentored others on designing, DataStage jobs for loading prestaging, staging, and the data mart with performance as a key factor
- Designed and developed DataStage Shared Container jobs for all the common modules to allow reusability of code
- Designed and developed DataStage jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Implemented sequences with restartability to run DataStage jobs
- Designed prestaging, staging, and Data Mart schemas along with the Data Modeler, providing ETL input
- Reviewed and approved code changes that are implemented as a part of releases or support activities.
- Designed and developed common modules for error checking (e.g. to check if the reject records output file is empty and to check if there are duplicate natural keys in a given table)
- Used QualityStage to consolidate customer data into the customer dimension
- Installed Confidential Information Server 8.0.1/8.1 on SUSE Linux with XMETA database on Oracle 10g
- Configured Parallel configuration files based on Client’s hardware configuration
- Configured database connections to Oracle, Sybase and SQL Server for DataStage
- Created user IDs and granted access to Information Server suite modules based on roles and responsibilities
- Performed daily tasks of DataStage Administrator such as creating projects, doing backups, debugging aborted jobs, giving expert advice to developers on how to improve performance of their jobs
- Installed patches from Confidential as and when they became available
- Worked as Production Support Specialist/primary development contact for overnight shifts, providing support for job failures and master sequence failures and addressing any data-restoration efforts
Environment: Confidential WebSphere Information Server DataStage 8.x, Ascential DataStage EE 7.5.1, Linux, VSS, C/C++, ORACLE, SQL, Erwin, Windows, UNIX Shell Scripts, MS Office
Confidential, Austin, TX
Confidential DataStage Architect/Developer
Responsibilities:
- Installed Confidential Information Server on Windows 2003 with the XMETA database on Oracle 10g
- Configured parallel configuration files based on the Client’s hardware configuration
- Configured database connections to Oracle, Sybase and SQL Server for DataStage
- Created user IDs and granted access to Information Server suite modules based on roles and responsibilities
- Installed patches from Confidential as and when they became available
- Performed daily tasks of DataStage Administrator such as creating projects, doing backups, debugging aborted jobs, giving expert advice to developers on how to improve performance of their jobs
- Designed and developed jobs using Information Server DataStage/QualityStage 8.0.1 for loading data into prestaging, staging and the data mart.
- Designed and fine-tuned DataStage jobs for loading prestaging, staging and the data mart with performance as a key factor
- Designed and developed DataStage Shared Container jobs for all the common modules to allow reusability of code
- Designed and developed DataStage jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Implemented sequences with restartability to run DataStage jobs
- Designed prestaging, staging, and Data Mart schemas along with the Data Modeler, providing ETL input
- Implemented the new SCD stage to design a Type 2 dimension using DataStage
- Used Complex Queries on DB2 and Oracle for data extraction from Source and Staging
- Created PL/SQL procedures to implement complex business logic and calculations for loading the Fact Tables
- Investigated the data using character discrete, character concatenate and Word investigation techniques.
- Used Quality Stage to standardize US names and US addresses to load in Customer dimension
- Responsible for preparing comprehensive test plans and thoroughly testing the system, keeping the business users involved in the UAT.
- Reviewed and approved code changes that are implemented as a part of releases or support activities.
- Designed and developed common modules for error checking (e.g. to check if the reject records output file is empty and to check if there are duplicate natural keys in a given table)
- Helped the Client with Information Server installations and issues with SCD operators. Installed Rollup patches and Fix Packs from Confidential and supported the same.
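The Type 2 dimension logic implemented with the SCD stage can be sketched as follows (plain Python for illustration; the column names, surrogate-key scheme, and high end date are assumptions, not the stage's actual configuration):

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # "current row" sentinel end date

def scd2_apply(dim, incoming, key, tracked, today):
    """Type 2 merge: when a tracked attribute changes, expire the current
    dimension row and insert a new version effective today."""
    out = list(dim)
    current = {r[key]: r for r in out if r["eff_end"] == HIGH_DATE}
    next_sk = max((r["sk"] for r in out), default=0) + 1  # surrogate key seed
    for row in incoming:
        cur = current.get(row[key])
        if cur and all(cur[c] == row[c] for c in tracked):
            continue                      # no change: keep current version
        if cur:
            cur["eff_end"] = today        # expire the old version
        out.append({"sk": next_sk, **row,
                    "eff_start": today, "eff_end": HIGH_DATE})
        next_sk += 1
    return out
```

The SCD stage pairs with a Transformer and a surrogate-key source to do exactly this expiry-and-insert bookkeeping at scale.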
Environment: Confidential WebSphere Information Server DataStage 8.x, Ascential DataStage EE 7.5.1, C/C++, ORACLE, Windows 2003, SQL, Erwin, Windows, UNIX Shell Scripts, MS Office
Confidential, Middletown, NJ
DataStage Architect/DataStage SME / Developer
Responsibilities:
- Brought in as a WebSphere Information Server DataStage 8.x and QualityStage expert to do a Proof of Concept (POC) for ongoing projects involving XML input files along with other traditional data sources
- Developed jobs in WebSphere Information Server DataStage 8.x to load the Data Warehouse staging area and Data Marts
- Used the new WebSphere Information Server range lookup feature extensively to validate the operational data
- Used the enhanced Surrogate Key and new Slowly Changing Dimension stages in conjunction with a Transformer
- Used both built-in state files and database sequences to generate the surrogate keys
- Developed WebSphere Information Server DataStage jobs to do surrogate key management and updatable in memory lookups
- Used the new multi-record-format-enabled CFF stage to read EBCDIC data files, a significant improvement over DataStage Enterprise Edition.
- Used the new enhanced performance analysis and impact analysis features to improve the performance of the Parallel jobs and track the change impacts
- Developed a pilot project for AT& Confidential specific requirements using DataStage to read relational data and write in to XML hierarchy output files and reading XML files to load relational tables
- Developed DataStage jobs to read XML files with multiple levels of iterations
- Developed jobs to read relational tables and write them into XML output files using DataStage
- Trained and mentored AT& Confidential ’s employees on the DataStage tool and best practices/patterns in ETL development using DataStage
- Designed and developed DataStage Shared Container jobs for all the common modules to allow reusability of code
- Designed and developed DataStage jobs to allow restartability
- Established best practices for DataStage PX jobs to ensure optimal performance, reusability, and restartability
- Worked with AT& Confidential design team which is responsible for all corporate ETL projects to come up with a High Level Design document (HLD) for DataStage projects
Environment: Confidential WebSphere Information Server DataStage 8.x, Ascential DataStage EE 7.5.1, C/C++, ORACLE, SQL Server, Sybase, SunOS, SQL, BO, Erwin, Windows, UNIX Shell Scripts, MS Office
Confidential, Livingston, NJ
DataStage/ ProfileStage Developer
Responsibilities:
- Installed DataStage Enterprise Edition 7.5.1 on HP-UX 11.11i for Development, UAT and Production
- Configured Parallel configuration files based on Client’s hardware configuration
- Configured databases connections to Oracle, Sybase and SQL server for DataStage
- Installed MetaStage, ProfileStage and AuditStage as per client needs
- Made suggestions to the client for moving on to Linux Grid and WebSphere DataStage 8.0
- Performed daily tasks of DataStage Administrator such as creating projects, doing backups, debugging aborted jobs, giving expert advice to developers on how to improve performance of their jobs
- Brought into the project as a DataStage expert to jump-start the DataStage EE development effort for the Customer Data Hub sitting on Oracle
- Trained and mentored CIT’s employees on DataStage tool and best practices/patterns in ETL development using DataStage
- Worked as a lead in designing the entire ETL architecture for Customer Data Hub
- Wrote a generic UNIX script for executing DataStage jobs through Control-M
- Wrote Oracle Stored procedures and executed from DataStage
- Designed and fine-tuned DataStage PX jobs for loading the Customer Data Hub with performance as a key factor
- Designed and developed DataStage Shared Container jobs for all the common modules (e.g. generation of surrogate key) to allow reusability of code
- Designed and developed DataStage PX jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow
- Developed build-ops using C/C++ to simplify and reuse business logic
- Established best practices for DataStage PX jobs to ensure optimal performance, reusability, and restartability before the development effort started
- Designed and developed PX jobs to load Customer records from source to Staging and then to Data Warehouse
- Fully designed and developed the purge process for the staging-area fact and dimension tables using a combination of DataStage PX jobs and UNIX scripts
- Designed and developed common modules for error checking (e.g. to check if the reject records output file is empty and to check if there are duplicate natural keys in a given table)
- Executed ProfileStage Column Analysis, Table Analysis, Primary Key Analysis, Cross Table Analysis, Relationship Analysis
- Generated specifications, transformation mappings and DDLs for the Confidential database
- Used ProfileStage to generate simple DataStage jobs to load data from source to Confidential
- Installed ProfileStage, PSDB, Analysis Server and Message switch on the SQL Server and HP Unix architecture
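The duplicate natural-key check used in the common error-checking modules can be sketched as (plain Python; the column names are hypothetical, and in the real modules the check drove reject handling or aborted the load):

```python
from collections import Counter

def duplicate_natural_keys(rows, key_cols):
    """Error-check module: return natural-key values that occur more than
    once in a table, which should abort the load or route rows to rejects."""
    counts = Counter(tuple(r[c] for c in key_cols) for r in rows)
    return {k: n for k, n in counts.items() if n > 1}
```

An empty result means the natural key is clean; any entries identify exactly which key values and counts need investigation before the load proceeds.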
Environment: Ascential DataStage EE 7.5.1, Ascential ProfileStage, Ascential MetaStage, Ascential AuditStage, Ascential QualityStage, C/C++, ORACLE, SQL Server, Sybase, Linux, HP UNIX, SQL, BO, Erwin, Windows, UNIX Shell Scripts, MS Office
Confidential, Minneapolis, MN
DataStage Developer
Responsibilities:
- Brought into the project as a DataStage expert to jump-start the DataStage PX development effort for the Enterprise Data Warehouse (EDW).
- Trained and mentored Confidential ’s employees on the DataStage tool and best practices/patterns in ETL development using DataStage.
- Developed a pilot project using DataStage PX to load three years of Stock Ledger Data as proof of concept for the business clients.
- Designed and fine tuned DataStage PX jobs for loading Stock Ledger history with performance as a key factor.
- Designed and developed DataStage PX jobs to load about 6 million records in one hour from source to the staging area and to the warehouse.
- Designed and developed DataStage Shared Container jobs for all the common modules (e.g. generation of surrogate keys) to allow reusability of code.
- Designed and developed DataStage PX jobs for restartability: if a PX job flow aborts, only the failed job is restarted rather than the whole flow.
- Developed build-ops using C/C++ to simplify and reuse business logic.
- Established best practices for DataStage PX jobs to ensure optimal performance, reusability, and restartability before the Release 1 development effort started.
- Designed and developed PX jobs to load Stock Ledger transactions from source to landing zone and then to Data Warehouse.
- Fully designed and developed the purge process for the staging-area fact and dimension tables using a combination of DataStage PX jobs and UNIX scripts.
- Designed and developed common modules for error checking (e.g. to check if the reject records output file is empty and to check if there are duplicate natural keys in a given table).
- Worked on testing the DataStage 7.5 upgrade from DataStage 7.0 on R&D, Stage and Production. Testing involved running all the PX jobs from Release 1 on DataStage 7.5.
- Worked on developing production support hand over document for Release 1 with commonly encountered problems and how to debug them
Environment: Ascential DataStage PX 7.0.1/7.5.1, C/C++, DB2/UDB, ORACLE, Teradata, SQL Server, AIX, SQL, Essbase, Erwin, Windows, UNIX Shell Scripts, MS Office