Sr. ETL/Informatica Developer Resume
Stamford, CT
SUMMARY:
- 8+ years of IT experience in the analysis, design, development, and implementation of relational database and data warehousing systems using ETL tools - Informatica, IBM InfoSphere DataStage, SSIS - as well as AWS environments, Big Data, and Hadoop.
- Experienced in big data technologies - Pig, Hive, Sqoop, Flume, Oozie - and NoSQL databases (MongoDB, Amazon DynamoDB & HBase).
- Extensive experience using Informatica PowerCenter for implementing ETL methodology in data extraction, transformation, and loading.
- Experienced in integrating various data sources, including Oracle, SQL Server, Teradata, Netezza, IBM DB2, web services (WSDL), and XML files, and in integrating data from fixed-width and delimited flat files.
- Experienced in AWS Redshift database design and development and in AWS S3 development.
- Proven expertise in all facets of Software Development Life Cycle (SDLC) like requirements gathering, designing, coding, testing, deployment and application production support.
- Have clear understanding of Data warehousing, Data modeling and Business Intelligence concepts with emphasis on ETL and life cycle development using Informatica PowerCenter (Repository Manager, Designer, Workflow Manager, Metadata Manager and Workflow Monitor) and SSIS.
- Experienced in big data analysis and in developing data models using Hive, MapReduce, and SQL, with strong data architecture skills for designing data-centric solutions.
- Expertise in data warehousing, data migration, and data integration using Business Intelligence (BI) tools such as Informatica PowerCenter, PowerExchange CDC, B2B Data Transformation, Informatica Data Quality, Informatica Data Explorer, MDM, SSIS, OBIEE, and Cognos.
- Excellent knowledge of Informatica administration, including grid management, creation and upgrade of repository contents, and creation of folders, users, and their permissions.
- Good knowledge of Hadoop (MapR) architecture and its components, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
- Experience with SSIS packages covering advanced task scenarios, parallelism, error handling, and scheduling.
- Experienced in Data Warehouse/Data Mart, OLTP, and OLAP implementations covering project scoping, analysis, requirements gathering, data modeling, effort estimation, ETL design, development, system testing, implementation, and production support.
- Excellent experience using Teradata SQL Assistant and data load/export utilities such as BTEQ, FastLoad, MultiLoad, and FastExport on mainframes and UNIX.
- Excellent experience writing SQL queries using SQL, SQL*Plus, and PL/SQL, including procedures, functions, and triggers.
- Experienced in ETL processes; worked extensively on data extraction, data integration, and loading from different data sources using SSIS.
- Hands-on experience in data warehousing techniques such as data cleansing, surrogate key assignment, Slowly Changing Dimensions (SCD Type 1 and SCD Type 2), and Change Data Capture (CDC); an SCD Type 2 sketch follows this summary.
- Extensive experience in ETL processes consisting of data sourcing, mapping conversion, and loading using Informatica transformations (Filter, Lookup, Sorter, Normalizer, Update Strategy, Router).
- Expertise in the Enterprise Data Warehouse (EDW) SDLC and the architecture of ETL, reporting, and BI tools.
- Experienced in UNIX shell scripting, cron, FTP, and file management in various UNIX environments.
- Experienced in using Informatica Data Quality (IDQ) for data profiling, standardization, enrichment, matching, and consolidation.
- Experienced in the Ralph Kimball and Bill Inmon methodologies, creating entity-relational and dimensional-relational table models using data modeling (dimensional and relational) concepts such as Star Schema and Snowflake Schema modeling.
- Experienced in building complex mappings in Informatica with a variety of PowerCenter transformations, Mapping Parameters, Mapping Variables, Mapplets, and parameter files in Mapping Designer, using both Informatica PowerCenter and IDQ.
- Extensive experience working in an Agile/Scrum, Waterfall development environment.
- Extensive experience in Salesforce.com Data Migration using Informatica with SFDC Connector and Apex Data Loader.
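An illustrative SCD Type 2 sketch in SQL (referenced above): the customer_dim and customer_stg tables, their columns, and the sequence are hypothetical examples, and in the actual projects this logic was implemented with Informatica dynamic lookups and Update Strategy transformations rather than hand-written SQL.

    -- Step 1: expire the current dimension row when a tracked attribute changes.
    UPDATE customer_dim d
    SET    end_date     = CURRENT_DATE - 1,
           current_flag = 'N'
    WHERE  d.current_flag = 'Y'
      AND EXISTS (SELECT 1
                  FROM   customer_stg s
                  WHERE  s.customer_id = d.customer_id
                  AND    s.address    <> d.address);   -- tracked attribute

    -- Step 2: insert a new current version for new or changed customers.
    INSERT INTO customer_dim (customer_key, customer_id, address,
                              start_date, end_date, current_flag)
    SELECT customer_dim_seq.NEXTVAL,                   -- surrogate key
           s.customer_id, s.address,
           CURRENT_DATE, DATE '9999-12-31', 'Y'
    FROM   customer_stg s
    LEFT JOIN customer_dim d
           ON d.customer_id = s.customer_id AND d.current_flag = 'Y'
    WHERE  d.customer_id IS NULL;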
TECHNICAL SKILLS:
ETL Tools: Informatica 10/9.x/8.x (PowerCenter), IDQ, SQL Server Integration Services, Ab Initio, Informatica B2B 10.1/9.x
Databases: Oracle 12c/11g/10g/9i, MS SQL Server 2016/2014/2012/2008, DB2, Teradata, Netezza, PostgreSQL, MongoDB, DynamoDB, HBase
Operating Systems: Windows 8/7/XP, UNIX, Linux
Programming Languages: R, SQL, PL/SQL, Base SAS, COBOL, HTML
Database utilities: SQL*Plus, SQL Developer, TOAD, Teradata SQL Assistant, MS SQL Server, Netezza Aginity
Data Modeling: ERWIN 9.x.
BigData Tools: Hive, MapReduce, Pig, Sqoop, Oozie, and HDFS.
AWS: Amazon Redshift, Amazon DynamoDB, EC2, S3.
Scripting Languages: UNIX Shell Scripting, PowerShell, JCL
Methodologies: Ralph Kimball Methodology, Bill Inmon Methodology
Scheduling Tool: Autosys
Utilities: MultiLoad, FastLoad, FastExport, and BTEQ
Others: MS Word, Excel, Outlook, FrontPage, PowerPoint
PROFESSIONAL EXPERIENCE:
Confidential, Stamford, CT
Sr. ETL/Informatica Developer
Responsibilities:
- Interacted with business representatives for needs analysis and to verify and understand business and functional specifications; participated in design team and user requirement gathering meetings and prepared technical and mapping documentation.
- Designed and developed complex mappings and reusable transformations for ETL using Informatica PowerCenter 9.6.1 and performed data manipulations using various Informatica transformations such as Aggregator, Filter, Update Strategy, and Sequence Generator.
- Designed and developed ETL workflows using Oozie for business requirements, including automating the extraction of data from a MySQL database into HDFS using Sqoop scripts.
- Responsible for full data loads from production to the AWS Redshift staging environment and for complete data loads from PostgreSQL to the AWS Redshift data lake (a Redshift COPY sketch follows this list).
- Led B2B structured and unstructured data transformation work, which included resolving end-user problems with B2B transformations and resolving system failures.
- Created complex SCD Type 1 and Type 2 mappings using Dynamic Lookup, Joiner, Router, Union, Expression, and Update Strategy transformations.
- Worked on a MapR Hadoop platform to implement big data solutions using Hive, MapReduce, shell scripting, and Java technologies.
- Used Teradata external loading utilities such as MultiLoad, TPump, FastLoad, and FastExport to extract from and load effectively into the Teradata database.
- Worked on Informatica B2B 10.1 and the PowerCenter Unstructured Data Transformation (UDT), making use of Mapper, Parser, and Streamer components for working with XML files.
- Worked on Informatica Cloud and extracted data from a Salesforce source.
- Imported relational database data into Hive dynamic-partition tables using Sqoop via staging tables (a Hive sketch follows this list), and imported data from Teradata using Sqoop with the Teradata connector.
- Responsibilities included designing and developing complex mappings using Informatica PowerCenter and Informatica Developer (IDQ), with extensive work on the Address Validator transformation in Informatica Developer (IDQ).
- Designed and developed the ETL strategy to populate the Data Warehouse from various source systems such as Oracle, Teradata, Netezza, flat files, XML, SQL Server, Amazon DynamoDB, and HBase.
- Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS; the data structures used by NoSQL databases differ from relational defaults, making some operations faster in NoSQL.
- Filtered XML files by applying filter conditions on the D9 segment and converted the filtered XML files back to EDI format using the serializer in B2B Data Transformation.
- Used Informatica Cloud to connect easily to a variety of cloud, on-premises, mobile, and social data sources.
- Extracted, transformed, and loaded fixed-width and delimited files into AWS Redshift tables using Informatica.
- Extensively used the Change Data Capture (CDC) concept in Informatica as well as in the Oracle database to capture changes to the data mart; CDC reduced the time taken to load the data mart by loading only the changed data.
- Worked on different file formats such as Sequence files, XML files, and Map files using MapReduce programs, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing, analyzing data in Pig.
- Involved in writing Teradata SQL bulk programs and in performance tuning of Teradata SQL statements using Teradata EXPLAIN and PMON to analyze and improve query performance.
- Working with Data scientists on migration of traditional SAS code into Hive HQL to run on Hadoop platform with higher efficiency and less time.
- Automated code deployment and EC2 provisioning using Ansible and Terraform, and performed match/merge and ran match rules to check the effectiveness of MDM on data, fine-tuning the match rules.
- Translated, loaded, and exhibited disparate data sets in various formats and from sources such as JSON, text files, and Kafka queues.
- Wrote SQL overrides in Source Qualifiers according to business requirements and created Oracle tables, views, materialized views, and PL/SQL stored procedures and functions.
- Comfortable implementing CDC (Change Data Capture) for slowly changing dimensions of types SCD Type 1, SCD Type 2, and SCD Type 3, using effective start date, end date, version, and flagging to capture changes to records.
- Created reports using Business Objects functionality such as Multiple Data Providers, Prompts, Slice and Dice, and Drill Down.
- Extensively used the Aginity Netezza Workbench to perform various DML and DDL operations on the Netezza database.
- Estimated and planned development work using Agile software development.
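A minimal HiveQL sketch of the Sqoop staging-to-dynamic-partition load mentioned above; the orders_stg and orders tables and the order_dt partition column are hypothetical placeholders for the actual project objects.

    -- Enable dynamic partitioning for the session.
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;

    -- orders_stg is an unpartitioned staging table populated by a Sqoop import;
    -- orders is the target Hive table partitioned by order_dt.
    INSERT OVERWRITE TABLE orders PARTITION (order_dt)
    SELECT order_id,
           customer_id,
           order_amount,
           order_dt          -- partition column must be last in the SELECT list
    FROM   orders_stg;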
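For the Redshift loads noted above, files staged on S3 are typically bulk-loaded with the COPY command; the sketch below uses a placeholder bucket, table, and IAM role rather than actual project values, and in practice much of this loading was driven through Informatica.

    -- Bulk-load a pipe-delimited, gzip-compressed file from S3 into a Redshift staging table.
    COPY stage.sales_fact
    FROM 's3://example-bucket/incoming/sales_fact.csv.gz'
    IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole'
    DELIMITER '|'
    GZIP
    TIMEFORMAT 'auto'
    REGION 'us-east-1';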
Environment: Informatica PowerCenter 9.6.1, Informatica BDE, Informatica B2B 10.1, Erwin r9.5, R, UNIX Shell Scripting, Oracle 12c, PL/SQL, Business Objects XI R2, SQL Server 2016/14, Korn Shell Scripting, Informatica Cloud, Hive, B2B, Hadoop, MongoDB, HBase, AWS, Teradata, Netezza, SQL, T-SQL, Teradata SQL Assistant, PostgreSQL, Autosys, SSRS, Tableau, SSIS, Amazon Redshift, Amazon DynamoDB, S3.
Confidential, Dallas, TX
Sr. ETL/Informatica Developer
Responsibilities:
- Prepared High Level Design and Low Level Design based on Functional and Business requirements document of the project.
- Created ETL mappings using various Informatica transformations: Source Qualifier, Data Quality, Lookup, Expression, Filter, Router, Sorter, Aggregator, etc.
- Imported data from RDBMS environment into HDFS using Sqoop for report generation and visualization purpose using Tableau.
- Developed processes on both Teradata and Oracle using shell scripting and RDBMS utilities such as MultiLoad, FastLoad, FastExport, and BTEQ (Teradata) and SQL*Plus and SQL*Loader (Oracle).
- Involved in the extraction, transformation, and loading of data across different platforms, including Hadoop and big data databases.
- Worked with the DW architect to prepare the ETL design document and developed transformation logic to cleanse the source data of inconsistencies during the source to stage loading.
- Involved in installing and configuring the Informatica MDM Hub Console, Hub Store, Cleanse and Match Server, Address Doctor, and Informatica PowerCenter applications.
- Created mappings and workflows to load data from different sources such as Oracle, SQL Server, flat files, and Amazon RDS into Hive targets in Hadoop and Netezza targets through Informatica PowerCenter.
- Configured and used B2B data exchange for end to end data visibility through event monitoring and to provide a universal data transformation supporting numerous formats, documents, and filters.
- Worked on loading and transforming large sets of structured, semi-structured, and unstructured data, and used B2B Data Exchange for data integration and transformation.
- Provided extensive production support for the Data Warehouse, covering internal and external data flows to Netezza and Oracle DBMS from ETL servers via remote servers.
- Coordinated data execution and loading in ETL with big data frameworks such as HDFS, Hive, and HBase.
- Designed and developed an entire data mart from scratch; designed, developed, and automated the monthly and weekly refresh of the data mart; and developed several complex mappings, mapplets, and reusable transformations to facilitate one-time, daily, weekly, and monthly loading of data.
- Developed processes on Teradata using shell scripting and RDBMS utilities such as MultiLoad, FastLoad, FastExport, and BTEQ.
- Developed Informatica mappings, transformations, and reusable objects using the Mapping Designer, Transformation Developer, and Mapplet Designer in Informatica PowerCenter.
- Used Informatica PowerCenter for extraction, transformation, and loading (ETL) of data in the data warehouse.
- Worked on building ETL data flows that work natively on Hadoop and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Created various Oracle database SQL, PL/SQL objects like Indexes, stored procedures, views and functions for Data Import/Export.
- Optimized and performance-tuned HiveQL and formatted table columns using Hive functions.
- Involved in writing test cases and assisted users in performing UAT; extensively used UNIX shell scripts to create parameter files dynamically and scheduled jobs using the TWS scheduler.
- Responsible for identifying and fixing bottlenecks through performance tuning on the Netezza database.
- Imported Data from Different Relational Data Sources like RDBMS, Teradata to HDFS using Sqoop.
- Created complex mappings in Power Center Designer using Aggregate, Expression, Filter, and Sequence Generator, Lookup, Joiner and Stored procedure transformations.
- Performed land process to load data into landing tables of MDM Hub using external batch processing for initial data load in hub store.
- Created Pig/Hive scripts to ingest, extract, and manage data on HDFS, and was responsible for designing and managing Sqoop jobs that uploaded data from Oracle to HDFS.
- Developed mappings to load fact and dimension tables, SCD Type 1 and SCD Type 2 dimensions, and incremental loads, and unit tested the mappings.
- Developed Informatica mappings and tuned them for better performance; designed and developed ETL logic for implementing CDC by tracking changes in critical fields required by the user (a watermark-style sketch follows this list).
- Data was loaded into HBase for reporting and for the business users to analyze and visualize the data.
- Extracted data from flat files, Oracle, Salesforce.com, EBS, and SQL Server 2008 and loaded it into the target database.
- Coded Teradata BTEQ SQL and wrote UNIX scripts to validate, format, and execute the SQL statements in the UNIX environment.
- Wrote and maintained ETL scripts in PostgreSQL to pull data from source systems, load it into the staging area, and then load it into the data warehouse; the ETL processes followed SOA concepts.
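A minimal SQL sketch of the timestamp-watermark style of CDC referenced above; the orders and etl_control tables and their columns are hypothetical, and in the project this logic sat inside Informatica mappings and parameter files.

    -- Pull only the rows changed since the previous successful load.
    -- etl_control stores the high-water mark per source table.
    SELECT o.order_id,
           o.order_status,
           o.order_amount,
           o.last_update_ts
    FROM   orders o
    WHERE  o.last_update_ts > (SELECT c.last_extract_ts
                               FROM   etl_control c
                               WHERE  c.source_table = 'ORDERS');

    -- After a successful load, advance the watermark.
    UPDATE etl_control
    SET    last_extract_ts = (SELECT MAX(last_update_ts) FROM orders)
    WHERE  source_table = 'ORDERS';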
Environment: Informatica PowerCenter 9.5 (Repository Manager, Designer, Workflow Monitor, Workflow Manager), Informatica BDE, UNIX Shell Scripting, Oracle 12c, Teradata, Salesforce, Informatica Cloud, flat files, B2B, MicroStrategy, PostgreSQL, Informatica B2B, DB2, SQL, Erwin, PL/SQL, T-SQL, SSRS, SSIS, Netezza Aginity, Ab Initio, Teradata SQL Assistant, Tableau, Amazon DynamoDB, Redshift, HBase, MongoDB, Hive, Pig, MapReduce, Oozie, Sqoop, and EC2.
Confidential, San Ramon, CA
Sr. ETL/SSIS Developer
Responsibilities:
- Extracted data from flat files, XML files and Oracle database, applied business logic to load them in the central Oracle database.
- Designed and developed SSIS (ETL) packages for loading data from Oracle and flat files (3 GB) into the SQL Server database.
- Understood the business requirements and developed design specifications for enterprise applications using Teradata.
- Created packages in SSIS with error handling as well as complex SSIS packages using various Data transformations like Lookup, Script Task, and Conditional Split, Derived Column, Fuzzy Lookup, and Partition task, For-Each loop container in SSIS; scheduled the same SSIS packages by creating the job tasks.
- Worked on different data sources such as Oracle, SQL Server, Flat files etc.
- Extensively used various transformations like Lookup, Update Strategy, Joiner, Aggregator, Union and few other transformations.
- Developed table partitioning scripts for the production Postgres database (a partitioning sketch follows this list).
- Implemented CDC using the Teradata TPT load utility since volumes were high.
- Populated and refreshed Teradata tables using FastLoad, MultiLoad, and FastExport utilities for user acceptance testing.
- Wrote SQL queries and PL/SQL procedures to perform database operations according to business requirements.
- Involved in processing claims and verifying whether they were delivered to the FACETS system.
- Designed SSIS package for automatic data loading from FTP site to SQL server and scheduling the SSIS packages to run in specified manner by using the SQL Server Agent Service.
- Worked with web services. Created SSIS Packages to migrate slowly changing dimensions.
- Wrote SQL and PL/SQL stored procedures, triggers, and cursors to implement business rules and transformations, and created complex T-SQL queries and functions.
- Provided support to develop the entire warehouse architecture and plan the ETL process.
- Designed SSIS packages to move fine-tuned data from various heterogeneous data sources, flat files, and Excel sheets to the data marts.
- Tested the Inbound / Outbound Interfaces to Facets and populated XML files.
- Extensively used Netezza utilities like NZLOAD and NZSQL and loaded data directly from Oracle to Netezza without any intermediate files.
- Created unit test packages for testing the SSIS packages as part of automated unit testing - inserting test data, execution of the package, verifying the results, Fail or pass the package based on results, cleanse the test data.
- Extensively tested the Business Objects report by running the SQL queries on the database by reviewing the report requirement documentation.
- Documented logical, physical, relational and dimensional data models. Designed the data marts in dimensional Data modeling using star and snowflake schemas.
- Worked with index cache and data cache for caching transformations such as Rank, Lookup, Joiner, and Aggregator.
- Responsible for the management and migration of SSIS packages in the staging and pre-production environments.
- Worked extensively on debugging and troubleshooting sessions using the Debugger and Workflow Monitor.
- Extensively designed packages and data mappings using Control Flow Task, Sequence Container, Data Flow Task, Execute SQL Task, Data Conversion, Derived Column, and Script Task in the SSIS Designer.
- Created and maintained jobs and packages using SSIS and scheduled them on a daily, weekly, or monthly basis.
- Worked with Session Logs and Workflow Logs for Error handling and troubleshooting.
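A sketch of the Postgres table partitioning mentioned above, using declarative range partitioning (PostgreSQL 10+); the claims table and date ranges are illustrative only, and on older releases the same effect is achieved with table inheritance and triggers.

    -- Parent table partitioned by claim date; child tables hold one year each.
    CREATE TABLE claims (
        claim_id     bigint        NOT NULL,
        member_id    bigint        NOT NULL,
        claim_date   date          NOT NULL,
        claim_amount numeric(12,2)
    ) PARTITION BY RANGE (claim_date);

    CREATE TABLE claims_2023 PARTITION OF claims
        FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

    CREATE TABLE claims_2024 PARTITION OF claims
        FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

    -- Queries filtering on claim_date are pruned to the matching partition.
    SELECT SUM(claim_amount) FROM claims WHERE claim_date >= DATE '2024-01-01';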
Environment: SSIS, Oracle 11g, OBIEE 10g, SFDC, TOAD, SQL, T-SQL, PostgreSQL, Informatica B2B, Teradata, Netezza, SQL Server, Windows XP, Business Objects, Reporting Services (SSRS), FACETS, Ab Initio, MS Visual Studio.
Confidential, Chicago, IL
Sr. ETL/SSIS Developer
Responsibilities:
- Responsible for requirement definition and analysis in support of Data Warehousing efforts and developed ETL mappings, transformations using SSIS.
- Extensively used ETL Tool SSIS to load data from Flat Files, Oracle and SQL server.
- Involved in processing claims in FACETS and validating the full cycle process to make sure the checks are generated.
- Responsible for error handling using Session Logs and Reject Files in the Workflow Monitor.
- Designed, developed, and unit tested SQL views using Teradata SQL to load data from source to target.
- Developed and tested all the SSIS mappings and update processes and developed reusable mapplets and transformations.
- Designed and implemented CDC backup process for Mainframes.
- Extensively worked with the Debugger for handling the data errors in the mapping designer and created events and tasks in the work flows using workflow manager.
- Extracted and loaded data into Teradata using FastLoad and MultiLoad.
- Designed data mappings for loading data for the star schema model and its fact and dimension tables (a schema sketch follows this list).
- Created Pre/Post Session/SQL commands in sessions and mappings on the target instance.
- Involved in writing shell scripts for file transfers, file renaming and several other database scripts to be executed from UNIX.
- Designed, analyzed, and performed integration, and wrote system requirements for leading healthcare software such as FACETS.
- Reviewed mapping documents provided by Business Team, implemented business logic embedded in mapping documents into Teradata SQLs and loading tables needed for Data Validation.
- Created mappings to implement change data capture (CDC) logic for some of the dimensions using SSIS
- Worked with several components of the SSIS tool - Source Analyzer, Data Warehousing Designer, Mapping & Mapplet Designer, and Transformation Designer - and developed SSIS mappings for better performance.
- Worked closely with business analysts and gathered functional requirements. Designed technical design documents for ETL process.
- Developed Unit test cases and Unit test plans to verify the data loading process.
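A hypothetical star schema sketch for the fact and dimension design referenced above; table and column names are illustrative, not the actual project model.

    -- Dimensions carry surrogate keys; the fact table references them.
    CREATE TABLE date_dim (
        date_key     INT PRIMARY KEY,          -- e.g. 20240131
        calendar_dt  DATE,
        month_nbr    INT,
        year_nbr     INT
    );

    CREATE TABLE member_dim (
        member_key   INT PRIMARY KEY,          -- surrogate key
        member_id    VARCHAR(20),              -- natural/business key
        member_name  VARCHAR(100)
    );

    CREATE TABLE claim_fact (
        claim_id     BIGINT,
        date_key     INT REFERENCES date_dim (date_key),
        member_key   INT REFERENCES member_dim (member_key),
        claim_amount DECIMAL(12,2)
    );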
Environment: SSIS, Oracle 10g, SQL Server 2008, Business Objects, UNIX Shell Script, Windows XP, SQL, Teradata, Netezza, DB2, FACETS, SSRS, Ab Initio, T-SQL, PL/SQL.
Confidential
ETL Developer
Responsibilities:
- Designed and developed ETL processes based on business rules and a job control mechanism using Informatica PowerCenter 8.1.
- Re-engineered existing mappings to support new and changing business requirements.
- Used Mapping, Sessions Variables/Parameters, and Parameter Files to support change data capture and automate workflow execution process to provide data processing.
- Tuned SQL Statements, Mappings, Sources, Targets, Transformations, Sessions, Database, Network for the bottlenecks, used Informatica parallelism options to speed up data loading to target.
- Extensively worked on Expressions, Source Qualifier, Union, Filter, Sequence Generator, sorter, Joiner, Update Strategy Transformations.
- Developed SFTP/FTP re-usable processes to pull the files from External Systems.
- Developed Informatica mappings to populate the dimension and fact tables, providing data classifications to end developers.
- Created parsers in Informatica B2B transformation studio to pull outer layers from the XML files to the simplified XML files.
- Modified several of the existing mappings based on the user requirements and maintained existing mappings, sessions and workflows.
- Created synonyms for copies of time dimensions, used the sequence generator transformation type to create sequences for generalized dimension keys, stored procedure transformation for encoding and decoding functions and Lookup transformation to identify slowly changing dimensions (SCD).
- Used various transformations like Source Qualifier, Expression, Aggregator, Joiner, Filter, Lookup, and Update Strategy for Designing and optimizing the Mapping.
- Created various tasks like Pre/Post Session, Command, Timer and Event wait.
- Tuned mapping performance by following Informatica best practices and applied several methods to reduce workflow run times.
- Prepared SQL queries to validate the data in both source and target databases (a validation sketch follows this list).
- Extensively worked with various lookup caches like Static Cache, Dynamic Cache, and Persistent Cache.
- Prepared the error handling document to maintain the error handling process.
- Validated the mappings, sessions, and workflows, and generated and loaded the data into the target database.
- Monitored batches and sessions for weekly and monthly extracts from various data sources to the target database.
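A generic sketch of the source-to-target validation queries referenced above; schema and table names are placeholders, not actual project objects.

    -- Row-count reconciliation between source and target.
    SELECT 'SOURCE' AS side, COUNT(*) AS row_cnt FROM src_schema.orders
    UNION ALL
    SELECT 'TARGET' AS side, COUNT(*) AS row_cnt FROM tgt_schema.orders_dim;

    -- Rows present in the source but missing from the target.
    SELECT order_id, order_amount
    FROM   src_schema.orders
    MINUS                          -- use EXCEPT on databases without MINUS
    SELECT order_id, order_amount
    FROM   tgt_schema.orders_dim;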
Environment: Informatica PowerCenter 8.1, Oracle, Ab Initio, Mainframe, DB2, COBOL, VSAM, SFTP, SQL, PL/SQL