Sr. ETL Developer/Architect Resume
Minneapolis, MN
SUMMARY:
- Over nine years of IT experience in the analysis, design, development, and implementation of relational database and data warehousing systems using ETL tools - Informatica, IBM InfoSphere DataStage, Big Data, and Hadoop.
- Extensive experience using Informatica PowerCenter to implement ETL methodology for data extraction, transformation, and loading.
- Experienced in integrating various data sources with multiple relational databases such as Oracle, SQL Server, Teradata, Netezza, and IBM DB2, along with WSDL and XML files; worked on integrating data from flat files, both fixed-width and delimited.
- Experience in data warehouse development, including data migration, data conversion, and Extraction/Transformation/Loading (ETL) using Ascential DataStage.
- Experienced in big data technologies: Pig, Hive, Sqoop, Flume, Oozie, and NoSQL databases (Cassandra and HBase).
- Experience with fact and dimension tables, Slowly Changing Dimensions (SCD), and physical and logical data modeling with normalization.
- Clear understanding of data warehousing, data modeling, and business intelligence concepts, with emphasis on ETL and life cycle development using Informatica PowerCenter (Repository Manager, Designer, Workflow Manager, Metadata Manager, and Workflow Monitor).
- Experienced in big data analysis and developing data models using Hive, MapReduce, and SQL, with strong data architecture skills for designing data-centric solutions.
- Extensively used DataStage tools (DataStage Designer, DataStage Administrator, and DataStage Director).
- Experienced in Data Warehouse/Data Mart, OLTP, and OLAP implementations, covering project scoping, analysis, requirements gathering, data modeling, effort estimation, ETL design, development, system testing, implementation, and production support.
- Good knowledge of Hadoop (MapR) architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Excellent experience using Teradata SQL Assistant and data load/export utilities such as BTEQ, FastLoad, MultiLoad, and FastExport on mainframe and UNIX environments.
- Extensive experience in development, debugging, troubleshooting, monitoring and performance tuning using DataStage Designer, DataStage Director, DataStage Administrator.
- Excellent experience writing SQL queries using SQL*Plus and PL/SQL, including procedures, functions, and triggers.
- Experienced in working with MicroStrategy 9.x and Cognos 8.1, utilizing Report Studio, Query Studio, and Analysis Studio.
- Strong experience in relational database concepts and entity relationships; worked with File Transfer Protocol (FTP) and Secure File Transfer Protocol (SFTP) to pull and send files between servers.
- Hands-on experience with data warehousing techniques such as data cleansing, surrogate key assignment, Slowly Changing Dimensions (SCD Type 1 and Type 2), and Change Data Capture (CDC).
- Extensive experience with ETL processes covering data sourcing, mapping conversion, and loading using Informatica transformations (Filter, Lookup, Sorter, Normalizer, Update Strategy, Router).
- Extensively used heterogeneous data sources and flat files to load data from source systems to targets using both the direct and indirect methods.
- Expertise in the Enterprise Data Warehouse (EDW) SDLC and in the architecture of ETL, reporting, and BI tools.
- Experienced in UNIX shell scripting, cron, FTP, and file management across various UNIX environments.
- Experienced in using Informatica Data Quality (IDQ) for data profiling, standardization, enrichment, matching, and consolidation.
- Experienced in the Ralph Kimball and Bill Inmon methodologies, creating entity-relational and dimensional-relational table models using data modeling concepts (dimensional and relational) such as star schema and snowflake schema modeling.
- Experienced using the SAS ETL tool, the Talend ETL tool, and SAS Enterprise Data Integration Server.
- Experienced in building complex mappings with a variety of PowerCenter transformations, mapping parameters, mapping variables, mapplets, and parameter files in Mapping Designer, using both Informatica PowerCenter and IDQ.
- Extensive experience working in Agile/Scrum and Waterfall development environments.
TECHNICAL SKILLS:
ETL Tools: Informatica PowerCenter 10.1/9.6/9.5/8.x, IDQ, Informatica Big Data Edition, DataStage 11.1/8.5/8.1
Databases: Oracle 18c/12c/11g/10g, MS SQL Server 2016/2012/2008, DB2, Teradata 15.0/14, Netezza.
Operating Systems: Windows 8/7/XP, UNIX, Linux
Big Data and Cloud: Sqoop, Hive, MongoDB, HBase, Kafka, HDFS, AWS S3, AWS Redshift, Athena
Programming Languages: SQL, PL/SQL, Base SAS, COBOL, HTML, Python, HiveQL
Database utilities: SQL*Plus, SQL Developer, TOAD, Teradata SQL Assistant, MS SQL Server, Aginity Workbench for Netezza
Data Modeling: ERwin, MS Visio, ER/Studio
Scripting Languages: UNIX Shell Scripting, Power Shell, JCL
Methodologies: E-R Modeling, Star Schema, Snowflake Schema
Scheduling Tool: Autosys
Utilities: MultiLoad, FastLoad, FastExport, and BTEQ.
Others: MS Word, Excel, Outlook, FrontPage, PowerPoint
PROFESSIONAL EXPERIENCE:
Confidential, Minneapolis MN
Sr. ETL Developer/Architect
Responsibilities:
- Analyzed business requirements, framed the business logic for the ETL process, and designed and developed complex mappings and reusable transformations for ETL using Informatica PowerCenter 10.1.
- Designed the ETL processes using Informatica to load data from SQL Server, flat files, XML files, and Excel files into the Confidential Oracle database.
- Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from a MySQL database into HDFS using Sqoop scripts (a minimal import sketch follows this list).
- Used Informatica PowerCenter 10.1.0 to extract data from MDW 2.1 and generate the various extracts.
- Created complex SCD Type 1 and Type 2 mappings using Dynamic Lookup, Joiner, Router, Union, Expression, and Update Strategy transformations.
- Worked on a MapR Hadoop platform to implement big data solutions using Hive, MapReduce, shell scripting, and Java technologies.
- Developed jobs to send data to and read data from AWS S3 buckets using components such as tS3Connection, tS3BucketExist, tS3Get, and tS3Put.
- Used Teradata external loading utilities such as MultiLoad, TPump, FastLoad, and FastExport to extract from and load effectively into the Teradata database.
- Performed data manipulations using various Informatica transformations such as Aggregator, Filter, Update Strategy, and Sequence Generator.
- Wrote ETL jobs in Java to read from web APIs using REST and HTTP calls and load the data into HDFS.
- Imported relational database data into Hive dynamic-partition tables with Sqoop via staging tables, and imported data from Teradata using the Sqoop Teradata connector.
- Designed and developed complex mappings using Informatica PowerCenter and Informatica Developer (IDQ), working extensively with the Address Validator transformation in IDQ.
- Designed and developed the ETL strategy to populate the data warehouse from various source systems such as Oracle, Teradata, Netezza, flat files, XML, and SQL Server.
- Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
- Used the Address Validator transformation in IDQ to take partial addresses and populate the full address in the Confidential table, and created mappings in Informatica Developer (IDQ) using the Parser, Standardizer, and Address Validator transformations.
- Worked on dimension and fact tables, developed mappings to load data into the relational database, and created Informatica parameter files and user-defined functions for handling special characters.
- Extended existing Python scripts to load data from CMS files into the staging database and the ODS.
- Worked with different file formats such as Sequence files, XML files, and Map files in MapReduce programs.
- Wrote Teradata SQL bulk programs and performed performance tuning of Teradata SQL statements, using Teradata EXPLAIN and PMON to analyze and improve query performance.
- Worked with data scientists to migrate traditional SAS code to Hive HQL, running on the Hadoop platform with higher efficiency and shorter run times.
- Worked in Informatica PowerCenter Designer: Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, and Transformation Developer.
- Wrote SQL overrides in Source Qualifier transformations according to business requirements; created Oracle tables, views, materialized views, and PL/SQL stored procedures and functions, and developed PL/SQL business functions used with MicroStrategy reports.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing, analyzed data in Pig, and analyzed WSDL interactions with Oracle On Demand to push and pull data via Informatica.
- Wrote UNIX shell scripts to automate daily load processes and scheduled and unscheduled workflows, and used UNIX command tasks to automate fetching the source file from a different path and FTPing it onto the server.
- Extensively used the Aginity Workbench for Netezza to perform DML, DDL, and other operations on the Netezza database.
- Created MicroStrategy objects like Metrics (Conditional, Transformational, Dimensional, and Compound), Filters, Prompts (Filter, Object), Templates and Reports.
- Sourced data from RDS and AWS S3 buckets, populated Teradata Confidential tables, and mounted the S3 bucket in the local UNIX environment for data analysis.
- Estimated and planned development work using Agile software development.
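The Sqoop-driven MySQL-to-HDFS extraction referenced above was scripted; the following is a minimal sketch of such an import, shown in Python for illustration, with hypothetical connection details, table name, and HDFS path (none of these values are taken from the project):

```python
# Minimal sketch of a Sqoop-driven MySQL-to-HDFS extract.
# Host, database, user, table, and target directory are illustrative assumptions.
import subprocess

def sqoop_import(table: str, target_dir: str) -> None:
    """Run a Sqoop import of one MySQL table into HDFS as pipe-delimited text."""
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:mysql://mysql-host:3306/sales_db",
        "--username", "etl_user",
        "--password-file", "/user/etl/.mysql_pwd",  # keeps the password off the command line
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", "4",
        "--fields-terminated-by", "|",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    sqoop_import("orders", "/data/raw/orders")
```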
Environment: Informatica PowerCenter 10.1, DB2, Erwin 9.7, UNIX Shell Scripting, Oracle 18c/12c, PL/SQL, Business Objects XI R2, SQL Server 2016, Korn Shell Scripting, Teradata, Netezza, SQL, T-SQL, Teradata SQL Assistant, Tivoli Workload Scheduler 9.2, MicroStrategy, Tidal Job Scheduler, SSRS, Crystal Reports 16
Confidential, Mentor, OH
Sr. ETL/Informatica Developer
Responsibilities:
- Prepared the High-Level Design and Low-Level Design documents based on the Functional and Business Requirements of the project.
- Created ETL mappings using various Informatica transformations: Source Qualifier, Data Quality, Lookup, Expression, Filter, Router, Sorter, Aggregator, etc.
- Imported data from RDBMS environments into HDFS using Sqoop for report generation and visualization in Tableau.
- Developed processes on both Teradata and Oracle using shell scripting and RDBMS utilities such as MultiLoad, FastLoad, FastExport, and BTEQ (Teradata) and SQL*Plus and SQL*Loader (Oracle).
- Created and maintained detailed support documentation for all ETL processes and developed solutions, including detailed flow designs and drafts.
- Worked with the DW architect to prepare the ETL design document and developed transformation logic to cleanse the source data of inconsistencies during source-to-stage loading.
- Connected MySQL and Oracle RDS through the Informatica Cloud connector and moved data into Redshift as well as into the local environment.
- Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
- Provided extensive production support for the data warehouse, covering internal and external data flows to Netezza and Oracle DBMS from the ETL servers via remote servers.
- Designed and developed an entire data mart from scratch, designed, developed, and automated its monthly and weekly refresh, and developed several complex mappings, mapplets, and reusable transformations to support one-time, daily, weekly, and monthly data loads.
- Developed Informatica mappings, transformations, and reusable objects using Mapping Designer, Transformation Developer, and Mapplet Designer in Informatica PowerCenter.
- Used Informatica PowerCenter for extraction, transformation, and loading (ETL) of data into the data warehouse.
- Worked with the ICRT web API to enable JDBC connections, published metadata and service connectors to interact with SOAP and other web APIs, and performed extract, transform, and load between traditional RDBMSs and Hive using Informatica BDE.
- Created various Oracle database SQL and PL/SQL objects such as indexes, stored procedures, views, and functions for data import/export.
- Handled Informatica administration work such as migrating code using Export/Import and Informatica deployment groups, creating users and folders, and working with shortcuts across shared and non-shared folders.
- Wrote test cases, assisted users with UAT, used UNIX shell scripts to create parameter files dynamically, and scheduled jobs using the TWS scheduler.
- Tested all reports by running queries against the warehouse using SQL Navigator and compared those queries with the ones generated by the MicroStrategy SQL engine.
- Identified and resolved bottlenecks through performance tuning on the Netezza database.
- Imported data from different relational data sources such as RDBMS and Teradata into HDFS using Sqoop.
- Used the Informatica PowerCenter Workflow Manager to create sessions and batches to run with the logic embedded in the mappings.
- Created complex mappings in PowerCenter Designer using Aggregator, Expression, Filter, Sequence Generator, Lookup, Joiner, and Stored Procedure transformations.
- Developed and unit-tested mappings to load fact and dimension tables, SCD Type 1 and SCD Type 2 dimensions, and incremental loads (a minimal SCD Type 2 sketch follows this list).
- Attended SCRUM meetings regularly to discuss the day-to-day progress of the individual teams and the overall project.
- Extensively used Normal Join, Full Outer Join, Detail Outer Join, and Master Outer Join in the Joiner transformation.
- Worked with the scheduler to run sessions on a daily basis and send email after load completion, and prepared unit test cases for all the mappings to test the code.
- Coded Teradata BTEQ SQL and wrote UNIX scripts to validate, format, and execute the SQL in the UNIX environment.
- Tuned OBIEE reports, designed query caching and data caching for performance gains, and created integration services and repository services and migrated the repository objects.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
- Used heterogeneous data sources such as XML files and flat files as sources, and imported stored procedures from Oracle for transformations.
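The SCD Type 2 loads above follow the usual expire-then-insert pattern. The sketch below illustrates that pattern as plain SQL driven from Python, with a hypothetical CUSTOMER_DIM/CUSTOMER_STG table pair, sequence, and generic DB-API connection (in the project itself this logic was built as Informatica mappings, not hand-written SQL):

```python
# Minimal SCD Type 2 sketch: expire current rows whose attributes changed, then
# insert fresh current versions. Table, column, and sequence names are
# hypothetical; this is illustrative, not the project's actual code.
EXPIRE_CHANGED_ROWS = """
    UPDATE customer_dim
       SET current_flag = 'N',
           effective_end_dt = TRUNC(SYSDATE) - 1
     WHERE current_flag = 'Y'
       AND EXISTS (SELECT 1
                     FROM customer_stg s
                    WHERE s.customer_id = customer_dim.customer_id
                      AND (s.name <> customer_dim.name
                           OR s.address <> customer_dim.address))
"""

INSERT_NEW_VERSIONS = """
    INSERT INTO customer_dim
        (customer_key, customer_id, name, address,
         effective_start_dt, effective_end_dt, current_flag)
    SELECT customer_dim_seq.NEXTVAL, s.customer_id, s.name, s.address,
           TRUNC(SYSDATE), DATE '9999-12-31', 'Y'
      FROM customer_stg s
     WHERE NOT EXISTS (SELECT 1
                         FROM customer_dim d
                        WHERE d.customer_id = s.customer_id
                          AND d.current_flag = 'Y')
"""

def load_scd2(conn) -> None:
    """Run the two-step SCD Type 2 load in a single transaction.

    After the first statement, changed customers no longer have a current row,
    so the second statement inserts a new current version for both changed and
    brand-new customers; unchanged customers are left untouched.
    """
    cur = conn.cursor()
    cur.execute(EXPIRE_CHANGED_ROWS)
    cur.execute(INSERT_NEW_VERSIONS)
    conn.commit()
```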
Environment: Informatica PowerCenter 9.6 (Repository Manager, Designer, Workflow Monitor, Workflow Manager), UNIX Shell Scripting, Oracle 12c, Teradata, Flat files, MicroStrategy, DB2, SQL, Erwin, PL/SQL, T-SQL, SSRS, Aginity for Netezza, Teradata SQL Assistant.
Confidential, Horsham, PA
Sr. ETL Developer
Responsibilities:
- Worked in all stages of the SDLC: business, functional, and technical requirements gathering, designing, documenting, developing, and testing.
- Designed and developed DataStage jobs to extract data from heterogeneous sources, applied transformation logic to the extracted data, and loaded it into the data warehouse databases.
- Created DataStage jobs using different stages such as Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Sample, Surrogate Key, Column Generator, and Row Generator.
- Extracted data from flat files, XML files, and the Oracle database, applied business logic to load them into the central Oracle database, and developed design specifications for enterprise applications using Teradata based on an understanding of the business requirements.
- Worked on different data sources such as Oracle, SQL Server, and flat files, and extensively used transformations such as Lookup, Update Strategy, Joiner, Aggregator, and Union.
- Populated and refreshed Teradata tables using the FastLoad, MultiLoad, and FastExport utilities for user acceptance testing.
- Wrote SQL queries and PL/SQL procedures to perform database operations according to business requirements.
- Extensively used OBIEE 11g to customize and modify the Physical, BMM, and Presentation layers of the repository (.rpd).
- Involved in performance tuning and optimization of DataStage jobs using features such as pipeline and partition parallelism and data/index caching to manage very large volumes of data.
- Wrote SQL, PL/SQL, stored procedures, triggers, and cursors to implement business rules and transformations, and created complex T-SQL queries and functions.
- Extensively used Netezza utilities such as nzload and nzsql, loading data directly from Oracle to Netezza without any intermediate files (a simplified nzload sketch follows this list).
- Used DataStage stages such as Hashed File, Sequential File, Transformer, Aggregator, Sort, Data Set, Join, Lookup, Change Capture, Funnel, Peek, and Row Generator in accomplishing the ETL coding.
- Created design and technical specification documents for realizing MicroStrategy objects to be integrated in the report.
- Documented logical, physical, relational and dimensional data models. Designed the data marts in dimensional Data modeling using star and snowflake schemas.
- Scheduled jobs in the ZENA scheduler and was heavily involved in writing complex SQL based on the given requirements, including complex Teradata joins, stored procedures, and macros.
- Set up data-level security for the OBIEE reports and handled security and privileges from the database through the Oracle repository.
- Used the DataStage ETL tool to extract data from source systems and load it into the Teradata database.
- Troubleshot MicroStrategy reports by optimizing the SQL using VLDB properties, creating complex reports with advanced filters, compound metrics, and conditional, transformation, and level metrics.
- Worked extensively on debugging and troubleshooting sessions using the Debugger and Workflow Monitor, and used session logs and workflow logs for error handling and troubleshooting.
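The Netezza loads mentioned above used nzload; the following minimal sketch shows a file-based nzload invocation driven from Python, with hypothetical host, database, table, and file names (the actual loads streamed data from Oracle without intermediate files, which this simplified example does not cover):

```python
# Minimal sketch of a file-based nzload call. Host, database, credentials,
# table, and data file are illustrative assumptions, not project values.
import subprocess

def nzload_delimited(table: str, data_file: str) -> None:
    """Bulk-load a pipe-delimited file into a Netezza table with nzload."""
    cmd = [
        "nzload",
        "-host", "nz-prod-host",   # assumed Netezza host
        "-db", "EDW_DB",           # assumed database
        "-u", "etl_user",
        "-pw", "********",         # placeholder; real jobs would use a secured password
        "-t", table,
        "-df", data_file,
        "-delim", "|",
        "-maxErrors", "10",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    nzload_delimited("STG_CUSTOMER", "/data/stage/stg_customer.dat")
```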
Environment: DataStage 11.1/8.5, OBIEE, Teradata 14, Workflow Manager, Workflow Monitor, Oracle 12c/11g, MicroStrategy, SFDC, TOAD, SQL, T-SQL, Netezza, SQL Server, Windows XP, Reporting Services (SSRS), MS Visual Studio.
Confidential, San Mateo, CA
Sr. ETL Developer
Responsibilities:
- Responsible for requirement definition and analysis in support of data warehousing efforts, and developed ETL mappings and transformations using DataStage.
- Used DataStage Director to run and monitor the jobs, and automated job control using batch logic to execute and schedule various DataStage jobs.
- Responsible for error handling using session logs and reject files in the Workflow Monitor.
- Designed, developed, and unit tested SQL views using Teradata SQL to load data from source to Confidential.
- Extensively used the Flat File, Hashed File, DB2 UDB, FTP Plug-in, and Aggregator stages during ETL development.
- Extensively worked with the Debugger to handle data errors in the Mapping Designer, and created events and tasks in the workflows using Workflow Manager.
- Extracted, transformed, and loaded data from various RDBMSs and flat files to the data marts (EDW), and worked extensively with Slowly Changing Dimensions, i.e. Type 1 and Type 2.
- Documented the Data Warehouse development process and performed knowledge transfer to Business Intelligence Developer.
- Defined the source data integration mechanism and reviewed the data mapping between source and existing EDW entities.
- Developed different types of reports such as tabular, pivot table, charts and graphs using OBIEE analytics and BI Publisher.
- Created data synchronization scripts in ICRT to insert and update data in Salesforce based on the type of data that needed to be updated on Salesforce.
- Created data replication tasks in ICRT to extract the GUIDs of Salesforce records to enable updates on Salesforce objects.
- Extracted and loaded data into Teradata using FastLoad and MultiLoad, and designed data mappings to load the fact and dimension tables of the star schema model.
- Used DataStage Designer to develop jobs for extracting, cleansing, transforming, integrating, and loading data into the data warehouse.
- Created pre/post-session SQL commands in sessions and mappings on the Confidential instance, and wrote shell scripts for file transfers and file renaming along with several other database scripts executed from UNIX.
- Worked with a team of developers on Python applications for risk management.
- Developed reports using freehand SQL queries in MicroStrategy Desktop.
- Created the scheduling plan and job execution timings and shared them with the scheduling team (ASG-Zena), and developed Perl and UNIX/AIX shell scripts and updated the backup logs.
- Reviewed mapping documents provided by the business team, implemented the business logic embedded in them as Teradata SQL, and loaded the tables needed for data validation.
- Created debug sessions to validate the transformations before the main session runs, and used existing mappings extensively in debug mode with breakpoints to identify errors.
- Worked closely with business analysts to gather functional requirements and designed technical design documents for the ETL process.
- Developed unit test cases and unit test plans to verify the data loading process (a minimal reconciliation-check sketch follows this list).
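A common element of such data-loading unit tests is a source-to-target reconciliation. The sketch below shows a minimal row-count check with hypothetical table names and generic DB-API connections (the real test plans covered more than counts, e.g. spot checks on transformed columns):

```python
# Minimal sketch of a row-count reconciliation check between a source table and
# its target. Table names and connections are illustrative assumptions.
def count_rows(conn, table: str) -> int:
    """Return the row count of a trusted, known table name."""
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]

def validate_load(src_conn, tgt_conn, src_table: str, tgt_table: str) -> None:
    """Fail loudly if the target row count does not match the source."""
    src_count = count_rows(src_conn, src_table)
    tgt_count = count_rows(tgt_conn, tgt_table)
    assert src_count == tgt_count, (
        f"Row count mismatch: {src_table}={src_count}, {tgt_table}={tgt_count}"
    )

# Example usage (connections would come from e.g. cx_Oracle / teradatasql):
# validate_load(oracle_conn, teradata_conn, "SRC_ORDERS", "EDW_ORDERS")
```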
Environment: IBM DataStage 8.1, Oracle 10g, MicroStrategy, OBIEE, SQL Server 2010, Salesforce, UNIX Shell Script, Windows XP, SQL, Teradata, Netezza, DB2, SSRS, SSIS, T-SQL, PL/SQL.
Confidential
ETL Developer
Responsibilities:
- Designed & developed ETL processes based on business rules, job control mechanism using Informatica Power Center 8.6 and re-engineered on existing Mappings to support new/changing business requirements.
- Used Mapping, Sessions Variables/Parameters, and Parameter Files to support change data capture and automate workflow execution process to provide data processing.
- Tuned SQL Statements, Mappings, Sources, Targets, Transformations, Sessions, Database, Network for the bottlenecks, used Informatica parallelism options to speed up data loading to Confidential .
- Extensively worked on Expressions, Source Qualifier, Union, Filter, Sequence Generator, sorter, Joiner, Update Strategy Transformations.
- Developed SFTP/FTP re-usable processes to pull the files from External Systems.
- Developed Informatica Mappings to populate the data into dimension and Fact tables for data classifications to end developers.
- Implemented Real-Time Change Data Capture (CDC) for SalesForce.com (SFDC) sources usingInformatica Power Center and implemented Slowly Changing Dimension Type 1 for applying INSERT else UPDATE to Confidential tables.
- Modified several of the existing mappings based on the user requirements and maintained existing mappings, sessions and workflows.
- Created synonyms for copies of the time dimensions, used the Sequence Generator transformation to create sequences for generalized dimension keys, the Stored Procedure transformation for encoding and decoding functions, and the Lookup transformation to identify slowly changing dimensions (SCD).
- Used various transformations such as Source Qualifier, Expression, Aggregator, Joiner, Filter, Lookup, and Update Strategy to design and optimize the mappings.
- Tuned mapping performance following Informatica best practices, applied several methods to reduce workflow run times, and created various tasks such as Pre/Post Session, Command, Timer, and Event Wait.
- Extensively worked with various lookup caches such as static, dynamic, and persistent caches, and prepared SQL queries to validate the data in both the source and Confidential databases.
- Prepared the error handling document to maintain the error handling process, validated the mappings, sessions, and workflows, and generated and loaded the data into the Confidential database.
- Monitored batches and sessions for weekly and monthly extracts from various data sources to the Confidential database.
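Informatica parameter files that drive incremental (CDC) extraction are plain text with a folder/workflow/session header and $$ mapping parameters. The sketch below generates one in Python with hypothetical folder, workflow, session, and parameter names (the real files and their watermarks are not reproduced here):

```python
# Minimal sketch of generating an Informatica parameter file that supplies a
# $$LAST_EXTRACT_DATE watermark for incremental (CDC) extraction.
# Folder, workflow, session, and parameter names are illustrative assumptions.
from datetime import datetime, timedelta

def write_param_file(path: str, last_extract: datetime) -> None:
    """Write a parameter file scoped to one workflow session."""
    lines = [
        "[SALES_DW.WF:wf_load_orders.ST:s_m_load_orders]",
        f"$$LAST_EXTRACT_DATE={last_extract:%m/%d/%Y %H:%M:%S}",
        f"$$RUN_DATE={datetime.now():%m/%d/%Y}",
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    # In practice the watermark would come from a control table updated after
    # each successful run; "yesterday" is used here only for illustration.
    write_param_file("/informatica/params/wf_load_orders.param",
                     datetime.now() - timedelta(days=1))
```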
Environment: Informatica Power Center 8.6, Oracle, Mainframe, DB2, COBOL, VSAM, SFTP, SQL, PL/SQL