Informatica ETL / Hadoop Engineer Resume
Pleasanton, CA
PROFESSIONAL SUMMARY:
- Over 10 years of IT experience across all phases of the Software Development Life Cycle (SDLC), including user interaction, business analysis/modeling, design, development, integration, planning, testing, and documentation for data warehouse applications, ETL processing, and distributed applications.
- Strong expertise in the ETL tool Informatica PowerCenter 8.6/9 (Designer, Workflow Manager, Repository Manager), Informatica Data Quality (IDQ), and ETL concepts.
- Performed data profiling and analysis using Informatica Data Quality (IDQ).
- Expertise in creating a single master reference source using Master Data Management (MDM).
- Worked with various transformations such as Normalizer, Expression, Rank, Filter, Aggregator (group by), Lookup, Joiner, Sequence Generator, Sorter, SQL, Stored Procedure, Update Strategy, and Source Qualifier.
- Experienced in Teradata SQL Programming.
- Worked with Teradata utilities such as FastLoad, MultiLoad, TPump, and Teradata Parallel Transporter.
- Experienced with the MPP data warehouse Vertica.
- Used metadata to map and transform data from the operational environment to the data warehouse environment.
- Experienced in using advanced Informatica features such as Pushdown Optimization (PDO).
- Experienced in Teradata Parallel Transporter (TPT). Used full PDO on Teradata and worked with different Teradata load operators.
- Designed and developed Informatica mappings including Type-I, Type-II, and Type-III Slowly Changing Dimensions (SCD).
- Validating data files against their control files and performing technical data quality checks to certify source file usage.
- Data modeling: knowledge of dimensional data modeling, star schema, snowflake schema, and fact and dimension tables.
- Experienced in SQL and PL/SQL programming, SQL Server Integration Services (SSIS), stored procedures, functions, triggers, views, and materialized views.
- Involved in the development of Informatica mappings and tuned them for better performance.
- Good hands on experience in writing UNIX shell scripts to process Data Warehouse jobs.
- Experience in working with Hadoop ecosystem tools such as HDFS, Hive, Pig, and Sqoop.
- Experience in developing ETL mappings, transformations and implementing source and target definitions in Talend.
- Expert in importing and exporting data into HDFS and Hive using Sqoop (a representative import command is sketched at the end of this summary).
- Expert in writing HiveQL queries.
- Experience in performance tuning the HiveQL and Pig scripts.
- Extensive experience with data extraction, transformation, and loading (ETL) from disparate data sources such as multiple relational databases (Teradata, Oracle, SQL Server, DB2), VSAM, and flat files.
- Applied various techniques at both the database and application levels to find bottlenecks and improve performance.
- Coordinated with business users, the functional design team, and the testing team during the different phases of project development and resolved issues.
- Good skills in defining standards, methodologies and performing technical design reviews.
- Executed software projects for Banking and financial services.
- Good communication skills, interpersonal skills, self-motivated, quick learner, team player.
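As one illustration of the Sqoop work above, the following is a minimal sketch of a table import into HDFS and Hive; the connection string, credentials, table, and paths are placeholders rather than actual project values:

    # Hypothetical Sqoop import: pull an Oracle table into HDFS as pipe-delimited
    # files and register it as a Hive table; all names below are placeholders.
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
      --username etl_user \
      --password-file /user/etl/.db_pwd \
      --table CUSTOMER_DIM \
      --target-dir /data/staging/customer_dim \
      --fields-terminated-by '|' \
      --hive-import \
      --hive-table staging.customer_dim \
      --num-mappers 4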
TECHNICAL SKILLS:
ETL Tools: Informatica PowerCenter 9.5/9.1/8.6/7.1 (PC and IDQ), MDM, DataStage 8.7.
Languages: SQL, PL/SQL, UNIX Shell Scripting
Methodology: Agile RUP, SCRUM, Waterfall
Databases: Teradata, Oracle, MPP (Vertica), DB2, SQL Server (SSIS)
Operating Systems: Windows, UNIX, Linux
IDEs: Eclipse, PL/SQL Developer, TOAD, Teradata SQL Assistant
Reporting Tools: Crystal Reports
Scheduling Tools: Control-M, Autosys, Tidal
Big Data Technologies: Hadoop, HDFS, MapReduce, Hive, Talend, Pig, HBase, Sqoop, Oozie
PROFESSIONAL EXPERIENCE:
Confidential, Pleasanton, CA
Informatica ETL/ Hadoop Engineer
Environment: Informatica 9.5/9.x, Hadoop 2.2.x and HDFS, MDM, IDQ, Sqoop, Hive, Talend, Pig, Teradata, SQL Assistant (SSIS), Oracle, Tidal, UNIX.
Responsibilities:
- Design & Development of ETL mappings using Informatica 9.5.
- Provide technical support to ETL applications on Informatica 9.5, UNIX and Oracle.
- Implement Data Quality Rules using IDQ to check the correctness of the source files and perform data cleansing/enrichment.
- Involved in massive data profiling using IDQ (Analyst tool) prior to data staging.
- Created profiles and score cards for the users using IDQ.
- Used MDM (Master data Management) to analyze data for decision support system.
- Expertise in Master Data Management concepts, Methodologies and ability to apply this knowledge in building MDM solutions.
- Migrated Informatica jobs to Hadoop Sqoop jobs and loaded the data into the Oracle database.
- Tuned HiveQL queries.
- Prepared and reviewed the project macro and micro designs based on the LM solution outline document.
- Validating data files against their control files and performing technical data quality checks to certify source file usage.
- Involved in designing mapplets and reusable transformations according to business needs.
- Designed and developed Informatica mappings including Type-I, Type-II, and Type-III Slowly Changing Dimensions (SCD).
- Effectively used various workflow tasks (reusable and non-reusable): Command, Assignment, Decision, Event-Raise, Event-Wait, and Email.
- Identified performance bottlenecks, tuned queries, and suggested and implemented alternative approaches such as range partitioning of tables.
- Used metadata for reporting and for summarizing highly detailed data.
- Coding & testing the Informatica Objects & Reusable Objects as per Liberty Mutual BI standards.
- Attended technical meetings and discussions.
- Prepared high-level and low-level design documents.
- Experienced with the Tidal scheduling tool.
- Worked with Teradata sources/targets.
- Pushed data as delimited files into HDFS using Talend Big Data Studio.
- Created Hive/Pig scripts, identified parameters to apply transformations as per requirements, and performed unit testing on the data as per design.
- Created Hive managed and external tables (illustrative DDL is sketched after this list).
- Performance-tuned the Hive queries.
- Created Pig scripts to process the files.
- Used HDFS commands to copy files from the local file system to HDFS.
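A minimal sketch of the HDFS copy and Hive managed/external table work referenced above; directory paths, table names, and columns are assumptions for illustration only:

    # Copy a delimited extract from the local file system into HDFS (placeholder paths)
    hdfs dfs -mkdir -p /data/landing/orders
    hdfs dfs -put /tmp/extracts/orders_20140101.txt /data/landing/orders/

    # External table over the landing directory (Hive tracks only the metadata),
    # then a managed ORC table loaded from it; names and columns are hypothetical.
    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS landing_orders (
        order_id    BIGINT,
        customer_id BIGINT,
        order_dt    STRING,
        amount      DOUBLE
      )
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
      LOCATION '/data/landing/orders';

      CREATE TABLE IF NOT EXISTS dw_orders STORED AS ORC AS
      SELECT * FROM landing_orders;
    "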
Confidential, Columbus, OH
Senior Informatica Developer
Environment: Informatica 9.1 (PC & IDQ), MPP data warehouse (Vertica), Hive, HDFS, Talend, Pig, MDM, Oracle 10g, Teradata SQL Assistant, UNIX, CITRIX.
Responsibilities:
- Coordinating with Onsite Team and client for Requirements gathering and analysis.
- Understood and developed the JPMC ETL framework for Informatica objects as per coding standards.
- Performed data profiling and analysis using Informatica Data Quality (IDQ).
- Implement Data Quality Rules using IDQ to check correctness of the source files and perform the data cleansing/enrichment.
- Experience in installation and configuration of core Informatica MDM Hub components such as Hub Console, Hub Store, Hub Server, Cleanse Match Server and Cleanse Adapter in Windows.
- Designed and developed Informatica mappings including Type-I, Type-II, and Type-III Slowly Changing Dimensions (SCD).
- Created source-to-target mappings in Talend.
- Loaded and transformed large sets of structured data from Oracle/SQL Server into HDFS using Talend Big Data Studio.
- Optimized and performance-tuned HiveQL, formatting table columns using Hive functions.
- Involved in designing and creating Hive tables to load data into Hadoop and in processing such as merging, sorting, and joining tables (a representative query is sketched after this list).
- Coding & testing the Informatica Objects & Reusable Objects as per JPMC's BI standards.
- Participated in peer reviews of Informatica objects.
- Used Vertica to maintain the customer base data.
- Estimated the volume of work and derived delivery plans to fit into overall planning.
- Prepared ETL build peer-review checklists and unit test case templates for different work packages.
- Involved in Unit Testing, Integration Testing and System Testing.
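A sketch of the kind of HiveQL formatting and join query referenced above, using built-in Hive functions; the tables and columns are hypothetical:

    # Hypothetical HiveQL: join two tables, format columns with built-in functions, aggregate, sort
    hive -e "
      SELECT upper(trim(c.customer_name))                                        AS customer_name,
             from_unixtime(unix_timestamp(o.order_dt, 'yyyyMMdd'), 'yyyy-MM-dd') AS order_date,
             round(sum(o.amount), 2)                                             AS total_amount
      FROM   dw_orders o
      JOIN   dw_customers c ON o.customer_id = c.customer_id
      GROUP BY upper(trim(c.customer_name)),
               from_unixtime(unix_timestamp(o.order_dt, 'yyyyMMdd'), 'yyyy-MM-dd')
      ORDER BY total_amount DESC;
    "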
Confidential, El Paso, TX
Senior Informatica Developer
Environment: Informatica 8.6.1, Teradata, Oracle 9i, SQL Developer, Teradata SQL Assistant, SSIS, UNIX.
Responsibilities:
- Coordinated the onsite and offshore teams on a daily basis.
- Developing maps and workflows in Informatica to load data into Teradata.
- Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Router and Aggregator to create robust mappings in the Informatica Power Center Designer.
- Designed and developed Informatica mappings including Type-I, Type-II, and Type-III Slowly Changing Dimensions (SCD).
- Unit tested the maps and workflows.
- Created PL/SQL objects: stored procedures, functions, views, and materialized views (an illustrative example is sketched after this list).
- Coding & testing the Informatica Objects & Reusable Objects as per Frost BI standards.
- Extensive performance tuning by determining bottlenecks at various points like targets, sources, mappings and sessions.
- Attended technical meetings and discussions.
- Created complex mappings using Unconnected Lookup, Aggregator, and Router transformations to populate target tables efficiently.
- Maintaining documentation.
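An illustrative PL/SQL object of the kind listed above, executed through SQL*Plus from a UNIX shell; the procedure, tables, and connection details are placeholders:

    # Hypothetical example: create a simple PL/SQL purge procedure and a view
    sqlplus -s etl_user/placeholder@ORCL <<'EOF'
    CREATE OR REPLACE PROCEDURE purge_stage_orders (p_load_dt IN DATE) AS
    BEGIN
      -- Remove the given load date from the staging table before a reload
      DELETE FROM stg_orders WHERE load_dt = p_load_dt;
      COMMIT;
    END purge_stage_orders;
    /

    CREATE OR REPLACE VIEW v_open_orders AS
      SELECT order_id, customer_id, amount
      FROM   dw_orders
      WHERE  status = 'OPEN';
    EOF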
Confidential
Informatica Developer
Environment: Informatica 9.1(PC & IDQ), Oracle, TOAD, SQL Developer, UNIX, CITRIX.
Responsibilities:
- Coordinated the onsite and offshore teams on a daily basis.
- Profiled the data using Informatica Data Quality (IDQ) and performed Proof of Concept.
- Developing maps and workflows in Informatica to load data into final target tables.
- Designed and developed Informatica mappings including Type-I, Type-II, and Type-III Slowly Changing Dimensions (SCD).
- Unit tested the maps and workflows.
- Coding & testing the Informatica Objects & Reusable Objects as per XL BI standards.
- Developed complex mappings/sessions using Informatica Power Center for data loading.
- Worked extensively with Workflow Manager, Workflow Monitor, and Worklet Designer to create, edit, and run workflows, tasks, and shell scripts.
- Attended technical meetings and discussions.
- Maintaining documentation.
Confidential
Informatica Developer
Environment: Informatica Power Center 8.6.1, Teradata, SQL Server, oracle, PL/SQL, SQL Developer, Toad, UNIX.
Responsibilities:
- Used Informatica Power Center for (ETL) extraction, transformation and loading data from heterogeneous source systems into target database.
- Created mappings using Designer and extracted data from various sources, transformed data according to the requirement.
- Designed and developed Informatica mappings including Type-I, Type-II, and Type-III Slowly Changing Dimensions (SCD).
- Involved in extracting the data from the Flat Files and Relational databases into staging area.
- Migrated mappings, sessions, and workflows from the Development environment to Test and then to the UAT environment.
- Developed Informatica Mappings and Reusable Transformations to facilitate timely Loading of Data of a star schema.
- Developed Informatica mappings using Aggregators, SQL overrides in Lookups, source filters in Source Qualifiers, and Router-based data flow management into multiple targets.
- Created sessions, extracted data from various sources, transformed it according to the requirements, and loaded it into the data warehouse.
- Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Router and Aggregator to create robust mappings in the Informatica Power Center Designer.
- Developed several reusable transformations and mapplets that were used in other mappings.
- Prepared Technical Design documents and Test cases.
- Involved in unit testing and resolution of the various bottlenecks encountered.
- Implemented various Performance Tuning techniques.
Confidential
Informatica Developer
Environment: Informatica PowerCenter 8.1, Teradata, Oracle, TOAD, SQL Developer, UNIX, CITRIX.
Responsibilities:
- Interacted with business analysts and translated business requirements into technical specifications.
- Using Informatica Designer, developed mappings, which populated the data into the target.
- Responsibilities included designing and developing complex Informatica mappings, including Type-II Slowly Changing Dimensions.
- Worked extensively with Workflow Manager, Workflow Monitor, and Worklet Designer to create, edit, and run workflows, tasks, and shell scripts.
- Used various transformations like Stored Procedure, Connected and Unconnected lookups, Update Strategy, Filter transformation, Joiner transformations to implement complex business logic.
- Developed complex mappings/sessions using Informatica Power Center for data loading.
- Extensively used aggregators, lookup, update strategy, router and joiner transformations.
- Involved in the design, development and testing of the PL/SQL stored procedures, packages for the ETL processes.
- Developed UNIX shell scripts to automate repetitive database processes and maintained shell scripts for data conversion (a representative script is sketched below).
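A minimal sketch of the kind of automation shell script mentioned above; the script name, directories, connection string, and SQL file are assumptions:

    #!/bin/ksh
    # refresh_stage.sh - hypothetical nightly wrapper: run a SQL refresh and log the outcome
    LOG_DIR=/app/etl/logs
    RUN_DT=$(date +%Y%m%d)
    LOG_FILE=${LOG_DIR}/refresh_stage_${RUN_DT}.log

    sqlplus -s etl_user/placeholder@ORCL @/app/etl/sql/refresh_stage.sql > "${LOG_FILE}" 2>&1
    RC=$?

    if [ ${RC} -ne 0 ]; then
        echo "refresh_stage failed with return code ${RC}" >> "${LOG_FILE}"
        exit 1
    fi

    echo "refresh_stage completed successfully" >> "${LOG_FILE}"
    exit 0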
Confidential
Informatica Developer
Environment: Informatica PowerCenter 7.1, Teradata, Oracle, TOAD, SQL Developer, UNIX, CITRIX.
Responsibilities:
- Involved in creating Detail design documentation to describe program development, logic, coding, testing, changes and corrections.
- Extensively involved in writing ETL Specifications for Development and conversion projects.
- Created shortcuts for reusable source/target definitions, Reusable Transformations, mapplets in Shared folder.
- Involved in requirement definition and analysis in support of Data Warehouse.
- Worked extensively on different types of transformations such as Source Qualifier, Expression, Aggregator, Router, Filter, Update Strategy, Lookup, Sorter, Normalizer, and Sequence Generator.
- Defined and worked with mapping parameters and variables.
- Performed the performance evaluation of the ETL for full load cycle.
- Checked session and error logs to troubleshoot problems and used the Debugger for complex mappings.
- Created PL/SQL objects: stored procedures, views, materialized views, triggers, and functions.
- Worked on parameterizing all variables and connections at all levels in UNIX (a sample parameter-file build script is sketched after this list).
- Created test cases for unit testing and functional testing.
- Coordinated with testing team to make testing team understand Business and transformation rules being used throughout ETL process.
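A sketch of how the UNIX-level parameterization above might be driven, generating a PowerCenter parameter file before a workflow run; the folder, workflow, connection names, and pmcmd arguments are hypothetical:

    #!/bin/ksh
    # build_param_file.sh - hypothetical script that writes an Informatica parameter file
    PARAM_FILE=/app/etl/param/wf_load_orders.param
    RUN_DT=$(date +%Y-%m-%d)

    # Static connection parameters (quoted heredoc so $ is kept literal)
    cat > "${PARAM_FILE}" <<'EOF'
    [FINANCE_DW.WF:wf_load_orders]
    $DBConnection_Source=ORA_SRC_CONN
    $DBConnection_Target=TD_TGT_CONN
    EOF

    # Append the run date as a mapping parameter
    printf '$$LOAD_DATE=%s\n' "${RUN_DT}" >> "${PARAM_FILE}"

    # The workflow could then be started with pmcmd, for example:
    # pmcmd startworkflow -sv INT_SVC -d DOMAIN_DEV -u etl_user -p placeholder \
    #       -f FINANCE_DW -paramfile "${PARAM_FILE}" wf_load_orders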