Sr. Data Engineer Resume
Reston, VA
SUMMARY
- Around 7 years of IT experience spanning data warehousing, data analysis, reporting, ETL, data modeling, development, maintenance, testing, and documentation.
- Expertise in the design and development of ETL methodology supporting data transformation and processing in a corporate-wide ETL solution using Informatica PowerCenter, IDQ, MDM, PWX, Apache Airflow, AWS DMS, and Amazon S3.
- Extensively worked with Informatica Data Quality, PowerExchange, and Informatica PowerCenter across end-to-end Data Quality and MDM projects.
- Strong experience implementing data profiling, building scorecards and reference tables, and documenting data quality metrics/dimensions such as accuracy, completeness, duplication, validity, and consistency.
- Strong knowledge of Informatica Data Quality transformations such as Address Validator, Parser, Labeler, Match, Exception, Association, and Standardizer.
- Experience in performance tuning of sources, targets, and mappings using Pushdown Optimization and session partitioning techniques such as Round Robin, Hash Key, Range, and Pass-Through.
- Extensive experience in RDBMS and data warehouse architecture, including the Inmon and Kimball methodologies, with a thorough understanding of data warehouse and data mart design.
- Solid understanding of data models (dimensional and transactional), conceptual/logical/physical data models, DWH concepts, ER diagrams, and data flow/process diagrams.
- Strong knowledge of the Software Development Life Cycle (SDLC) and industry-standard methodologies such as Waterfall and Agile, covering requirement analysis, design, development, testing, support, and implementation.
- Experience implementing the Informatica Web Services Hub and event-based (SOAP request/response) triggering of workflows imported through WSDL.
- Expertise in using PowerExchange Express CDC with the Logger to process data in real time per business requirements.
- Experience developing PWXPC mappings using the PowerExchange Navigator and PowerCenter Designer per business needs.
- Experience with PWX utility tools such as DTLURDMO, DTLUAPPL, and DTLUCBRG; strong knowledge of Oracle archive logs and redo logs and of configuring PWX Logger and Listener services.
- Knowledge of designing and developing data marts and data warehouses using multidimensional models such as star and snowflake schemas with fact and dimension tables.
- Expertise in UNIX shell scripting for high-volume data warehouse instances.
- Experience in production support, addressing ETL process, database (PL/SQL), and Linux/shell scripting issues.
- Hands-on experience with PostgreSQL, Redshift, DB2, SQL, PL/SQL, Oracle 11g/10g/9i/8i, Teradata, and SQL Server.
- Expertise in Oracle SQL Developer and TOAD for developing Oracle applications.
- Involved in setting standards for the architecture, design, and development of database applications.
- Expertise in database development using SQL, PL/SQL, T-SQL, stored procedures, functions, views, triggers, and complex SQL queries; proficient with TOAD and SQL Developer for system testing of reports.
- Worked with business SMEs to develop business rules for data cleansing and applied those rules using the Informatica Data Quality (IDQ) tool.
- Presented data cleansing results and IDQ plan results to the OpCos and SMEs.
- Strong knowledge of the end-to-end process of gathering and implementing Data Quality and MDM requirements.
- Provided production support for ETL and BI applications in large Life Sciences and Healthcare data warehouses, monitoring, troubleshooting, and resolving issues.
- Experience reviewing test plans, test cases, and test case execution; translating business requirement documents and functional specs into test cases using Quality Center; and playing an active role in User Acceptance Testing (UAT) as well as unit, system, and integration testing.
- Self-motivated team player and quick learner with strong problem-solving and communication skills.
TECHNICAL SKILLS
Tools: Informatica PowerCenter 10.x/9.x/8.x, Informatica PowerExchange 10.x/9.x, Informatica Data Quality 9.x, Informatica Analyst Tool 9.x, Apache Airflow, AWS DMS, Amazon S3.
Languages: XML, UML, HTML, C, C++, Python, UNIX Shell Scripting, SQL, PL/SQL, T-SQL.
Databases: PostgreSQL, Redshift, Oracle 12c/11g/10g/9i/8i, SQL Server, IBM DB2, MS Access, Teradata.
Operating Systems: Windows 98/NT/2000/2003/XP/Vista/7/10, Sun Solaris 5.8/5.6, HP-UX, DOS, Red Hat Linux, Unix AIX 5.3/4.3.
Reporting & Data Modeling Tools: Business Objects, Cognos, Erwin Data Modeler, MS Visio, SAS Viya.
Other Tools: WinSCP, DBeaver, PyCharm, TOAD, SQL Loader, SQL Plus, Query Analyzer, Putty, DTLURDMO, DTLUAPPL, DTLUCBRG, Tortoise SVN, GitHub, Password Safe, MS Office (MS Access, MS Excel, MS PowerPoint, MS Word, MS Visio).
PROFESSIONAL EXPERIENCE
Confidential, Reston, VA
Sr. Data Engineer
Responsibilities:
- Provided analysis, estimates, design solutions, and an implementation plan aligned with requirements for transitioning the project to open-source technologies using Apache Airflow, PostgreSQL, Redshift, Amazon S3, and AWS DMS.
- Maintained and supported existing ETL code in PowerCenter, PowerExchange, and PL/SQL stored procedures, and enhanced the code as needed.
- Designed and developed ETL methodology using Apache Airflow, creating DAGs (Directed Acyclic Graphs) to move data from various source systems to the target system.
- Created DAGs to load bulk and incremental data, scheduled them hourly/daily/weekly/monthly/quarterly per requirements, and used triggers that activate a DAG when a specified condition is met or when new data arrives (see the DAG sketch after this list).
- Created functions, views, template views, tables, synonyms, indexes, sequences, and DB links in PostgreSQL and Redshift.
- Migrated large data sets between PostgreSQL, Redshift, and Oracle using AWS DMS, Amazon S3, and DB links (see the DMS sketch after this list).
- Enhanced the existing ETL processes by tuning DAGs and Informatica mappings for better performance through PL/SQL and PostgreSQL procedures/functions, Pushdown Optimization, and session partitioning techniques such as Round Robin, Hash Key, Range, and Pass-Through.
- Used SAS Viya to generate reports for end users; created reusable code snippets, fixed-width extracts, and graphs, and scheduled them as needed.
- Performed unit and end-to-end testing. Developed numerous ETL and reporting test cases, test plans, scripts, and procedures and documented the results; validated test results against expected results; prepared and loaded test data for testing, error handling, and analysis. Performed peer reviews of fellow team members' code and suggested changes as required.
- Deployed DAGs, database objects, and SAS Viya code to higher environments. Created dtlurdmo.ini files to migrate PWX registration and extraction components, and used PowerCenter Manager and the DTLURDMO utility to deploy PowerCenter and PowerExchange code to higher environments.
- Created ETL check jobs and monitoring that automatically restart jobs on failure and email the team if the failure persists; also created a process that checks the status of data loads every morning and emails specific end users when a data group is unavailable for reporting due to failures or other issues (the retry and notification pattern is illustrated in the DAG sketch below).
- Ensured the integrity, availability, and performance of application systems by providing technical support and maintaining database security and disaster recovery procedures; troubleshot and resolved many application and database issues accurately and in a timely fashion.
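Below is a minimal sketch of the DAG pattern described above (scheduling, an event-style trigger on new data, and the automatic retry/failure-notification behavior), assuming Airflow 2.x with the Amazon provider package installed and SMTP configured. The DAG id, S3 bucket/key, email address, and load_incremental() body are hypothetical placeholders, not the original production code.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor
from airflow.utils.email import send_email  # uses Airflow's SMTP settings


def notify_on_failure(context):
    """Email the team when a task fails (hypothetical distribution list)."""
    ti = context["task_instance"]
    send_email(
        to="etl-team@example.com",
        subject=f"ETL failure: {ti.dag_id}.{ti.task_id}",
        html_content=f"Task failed for run date {context['ds']}; see Airflow logs.",
    )


def load_incremental(**context):
    # Placeholder for the actual extract/load logic (e.g., a Redshift COPY
    # or psycopg2 calls against PostgreSQL).
    pass


default_args = {
    "retries": 2,                           # restart automatically on failure
    "retry_delay": timedelta(minutes=10),
    "on_failure_callback": notify_on_failure,
}

with DAG(
    dag_id="sales_daily_load",              # hypothetical name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",             # hourly/weekly/monthly variants differ only here
    default_args=default_args,
    catchup=False,
) as dag:
    # Event-style trigger: wait until the day's source file lands in S3.
    wait_for_file = S3KeySensor(
        task_id="wait_for_source_file",
        bucket_name="source-bucket",
        bucket_key="incoming/sales_{{ ds }}.csv",
        poke_interval=300,
    )

    load = PythonOperator(task_id="load_incremental", python_callable=load_incremental)

    wait_for_file >> load
```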
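A similarly minimal sketch of driving an AWS DMS migration from Python with boto3: start a full-load-plus-CDC replication task and check its status. The task ARN is a hypothetical placeholder, and the replication instance, endpoints, and table mappings are assumed to be provisioned already.

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Hypothetical ARN of a pre-created replication task.
task_arn = "arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASK"

# Start the task: bulk (full) load first, then ongoing change replication.
dms.start_replication_task(
    ReplicationTaskArn=task_arn,
    StartReplicationTaskType="start-replication",
)

# Check the task status (could be polled in a loop or from an Airflow sensor).
task = dms.describe_replication_tasks(
    Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}]
)["ReplicationTasks"][0]
print(f"DMS task status: {task['Status']}")
```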
Environment/Tools: Apache Airflow, Informatica PowerCenter 10.2 HF2, Oracle 19c, Redshift, PostgreSQL, SAS Viya, Linux, Command Prompt, GitHub, JIRA, Password Safe, WinSCP, DBeaver, PyCharm.
Confidential, West Des Moines, IA
ETL Lead
Responsibilities:
- As ETL lead, worked on multiple projects and coordinated work with onsite and offshore ETL developers.
- Gathered and analyzed requirements and converted functional requirements into technical specifications.
- Designed ETL methodology using Informatica PowerCenter to accomplish data transformation and load processes.
- Created multiple PowerCenter mapplets, mappings, and sessions using transformations such as Lookup, Filter, Normalizer, Joiner, Aggregator, Expression, Router, Update Strategy, Sequence Generator, and XML Generator, and merged them into workflows. Logically planned and designed the interlinking of these workflows with other ETL processes so the various parts of the application work together as a single system.
- Designed the extract, transform, and load processes for data migration, using Informatica to load data from flat files/Excel into a staging database and from staging into the target SQL Server data warehouse.
- Developed SQL Server scripts and used various database objects such as stored procedures, functions, views, triggers, tables, synonyms, and DB links.
- Enhanced the ETL processes by tuning mappings for better performance with SQL Server procedures/functions, Pushdown Optimization, and Informatica session partitioning techniques such as Round Robin, Hash Key, Range, and Pass-Through.
- Collaborated with the data modeler and other ETL developers to build optimal software and to modify code as business requirements changed throughout the Software Development Life Cycle.
- Scheduled the application to run daily using Control-M and automated processes with Command Tasks and UNIX shell scripts, ensuring the application continued to function normally through software maintenance and testing (see the wrapper sketch after this list).
- Created a reliable recovery process to protect the system in the event of a failure and developed relevant backup applications to run during such failures.
- Created numerous test cases and test plans, performed unit testing, and documented the results. Bugs found during testing were analyzed beyond their obvious causes (root cause analysis) to anticipate errors that could occur in the future.
- Documented ETL test plans, test cases, test scripts, test procedures, assumptions, and validations based on design specifications for unit testing, including expected results, test data preparation and loading, error handling, and analysis.
- Deployed PowerCenter code to higher environments using Informatica Deployment groups.
- Documented every design, development, test case, and deployment, including mapping and transformation rules and source and target definitions, as a reference for future maintenance and upgrades.
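A minimal sketch of the kind of command-task wrapper Control-M can invoke is shown below: the script starts a PowerCenter workflow through pmcmd, waits for it to finish, and propagates the exit code so Control-M can alert or retry. The service, domain, folder, and workflow names are hypothetical placeholders, and credentials are assumed to come from environment variables set by the job definition.

```python
import os
import subprocess
import sys

# Hypothetical service/domain/folder/workflow names; credentials come from
# environment variables supplied by the Control-M job definition.
cmd = [
    "pmcmd", "startworkflow",
    "-sv", "INT_SVC_DEV",        # PowerCenter Integration Service
    "-d", "DOMAIN_DEV",          # Informatica domain
    "-u", os.environ["INFA_USER"],
    "-p", os.environ["INFA_PASS"],
    "-f", "SALES_FOLDER",        # repository folder
    "-wait",                     # block until the workflow completes
    "wf_daily_load",             # workflow name
]

result = subprocess.run(cmd)
sys.exit(result.returncode)      # non-zero exit signals failure to Control-M
```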
Environment/Tools: Informatica PowerCenter 9.6.1 HF4/10.2 HF2, SQL Server 2012, UNIX, Command Prompt, SharePoint, Mavenlink, Red Hat Linux, JIRA, Control-M, Storyboard.
Confidential, West Des Moines, IA
Sr. ETL Informatica/ PWX Developer
Responsibilities:
- Involved in business requirements feasibility analysis, estimation, design and solution specification, software development, implementation planning, and production support.
- Designed ETL methodology using Informatica PowerCenter and PowerExchange to accomplish data transformation and load processes.
- Created multiple PowerCenter mapplets, mappings, and sessions using transformations such as Lookup, Filter, Normalizer, Joiner, Aggregator, Expression, Router, Update Strategy, Sequence Generator, and XML Generator, and merged them into workflows. Logically planned and designed the interlinking of these workflows with other ETL processes so the various parts of the application work together as a single system.
- Designed the ETL processes to migrate data from OLTP systems to a staging area and from there to a Data Warehouse (OLAP system).
- Used the PowerExchange Navigator to register relational tables with Change Data Capture to capture changes for real-time data integration, perform database row tests, and generate current restart tokens.
- Recommended and implemented software upgrades to process data in real time using PWX CDC (PowerExchange Change Data Capture) rather than in batch mode, achieving optimal load times.
- Created PWXPC mappings in PowerCenter Designer and used application connections in PowerCenter Workflow Manager to run these mappings continuously, flushing DML changes from source to target as they occur.
- Created a reliable recovery process that stores a backup of every job run; in the event of a failure, the system uses this backup to restart the affected job from its point of failure. Also created a process that sends automated email notifications to the team when such failures occur.
- Used the advanced GMD recovery process for PWXPC mappings, which automatically resynchronizes the target tables in the event of a failure.
- Configured dbmover.cfg and pwxccl.cfg files for the PWX Logger and Listener, and on the Informatica PowerCenter server side, to achieve real-time data processing and integration.
- Created a variety of models and proofs of concept with relevant code and screenshots to show other developers and admins how various parts of the application are designed and work, and uploaded them to SVN for reference.
- Enhanced the ETL processes by tuning mappings for better performance. Tuned Informatica sessions for large loads using Pushdown Optimization and session partitioning techniques such as Round Robin, Hash Key, Range, and Pass-Through, and by adjusting block size, data cache size, sequence buffer length, and the target-based commit interval.
- Collaborated with the data modeler and other ETL developers to build optimal software and to modify code as business requirements changed throughout the Software Development Life Cycle.
- Used Informatica and PL/SQL schedulers to run jobs automatically on daily and weekly cadences, depending on the business model, ensuring application availability for end users.
- Used SQL Developer and PL/SQL tools to create and modify ETL packages, tune query performance, create DMLs, and create database objects such as tables, views, indexes, synonyms, stored procedures, and sequences.
- Created numerous ETL test cases, test plans, scripts, and procedures; performed unit testing and documented the results. Bugs found during testing were analyzed beyond their obvious causes (root cause analysis) to anticipate errors that could occur in the future.
- Created deployment documents for deploying code to the TEST and PROD environments, and deployed PowerCenter and PowerExchange code to higher environments using Informatica PowerCenter Manager and the DTLURDMO utility.
- Played an active role in the PowerCenter and PowerExchange upgrades from 9.6.1 to 10.1.1 and later to 10.2, supporting the Oracle upgrade from 12.1 to 12.2, and coordinated with the QA team to test connectivity, performance, and data consistency.
- Documented every design, development, upgrade, test case, and deployment, including mapping and transformation rules and source and target definitions, as a reference for future maintenance and upgrades.
Environment/Tools: Informatica PowerCenter 9.6.1 HF4/10.1.1 HF1/10.2 HF1, Informatica PowerExchange 9.6/9.6.1/10.1.1/10.2, Oracle 12.1/12.2, PowerExchange Navigator, PowerCenter Client, DTLURDMO, DTLUAPPL, and DTLUCBRG utilities, UNIX, Command Prompt, TortoiseSVN, Password Safe, SQL Developer, PL/SQL, Red Hat Linux, Solaris, JIRA.