- 7+ years of IT experience in data analysis, design, development, automation, and testing of Data Warehouses and Data Marts, using ETL processes, Python, SQL, PL/SQL, Shell scripting, and other middleware tools and technologies.
- Hands-on experience across all stages of the Software Development Life Cycle (SDLC). Proficient in data analysis, business/system requirements, data mapping, unit testing, systems integration, and user acceptance testing.
- Extensive experience in Extraction, Transformation, and Loading (ETL) of data from multiple sources into Data Warehouses and Data Marts. Well versed with Star and Snowflake schemas used in relational and dimensional modeling.
- Strong knowledge of Hadoop, HBase, Pig, Hive, HDFS, and Big Data.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Strong expertise in the ETL tool Informatica PowerCenter 9.6 (Designer, Workflow Manager, Repository Manager), Informatica Data Quality (IDQ), PowerExchange, and ETL concepts.
- Strong knowledge of data modeling, effort estimation, ETL design, development, system testing, implementation, and production support. Experience in resolving ongoing maintenance issues and bug fixes.
- Hands on experience working with various databases including DB2, Oracle, Teradata, and SQL Server.
- Experience in writing UNIX shell, Python and Perl scripts to implement business logic and to load/extract the data from tables.
- Proficient in the integration of various data sources like Oracle, flat files, XML, MS SQL Server and Teradata into the staging area, ODS, Data Warehouse and Data Mart.
- Expertise in database programming: SQL, PL/SQL, Teradata FastLoad, FastExport, MultiLoad, DB2, SQL Server.
- Experience in designing QVDs and generating reports and error dashboards using the QlikView reporting tool.
- Integrated Kafka and Spark Streaming for our solution and prepared DataFrames on the streaming data.
- Expertise in the Tableau BI reporting tool, including dashboard development and server administration.
- Encoded and decoded JSON objects using PySpark to create and modify DataFrames in Apache Spark.
- Experienced with workflow schedulers and data architecture, including data ingestion pipeline design and data modeling.
- Coordinated with business users, the functional design team, and the testing team during different phases of project development and resolved issues.
- Developed Apache Pig scripts for extract-transform-load data pipelines, integrated the data, and processed terabytes of online advertising data using Hive Query Language.
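The JSON encode/decode step described above can be sketched in plain Python before the rows ever reach Spark; this is an illustrative sketch using the standard json module rather than PySpark itself, and the field names (id, amount) are hypothetical:

```python
import json

def decode_json_rows(lines):
    """Decode newline-delimited JSON strings into row dicts,
    skipping records that fail to parse."""
    rows = []
    for line in lines:
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError:
            # in a real pipeline, bad records would be routed to an error path
            continue
    return rows

raw = ['{"id": 1, "amount": 10.5}', 'not-json', '{"id": 2, "amount": 7.25}']
rows = decode_json_rows(raw)
# In PySpark, a list of row dicts like this could seed a DataFrame
# (e.g. via spark.createDataFrame), which is where the modification steps happen.
```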
Programming & Scripting Languages: C, R, Python, HiveQL, HTML, XML
Database: SQL, MySQL, MS Access, Oracle 11g/10g, Mainframe, Teradata
Reporting & Scheduling Tools: SSRS, Crystal Reports, Business Objects, Cognos, Autosys
Web Packages: Google Analytics, Adobe Test & Target, WebTrends
Development Tools: RStudio, Notepad++
Visualization Tools: Tableau, Matplotlib
Packages: dplyr, rjson, ggplot2
Techniques: Machine Learning, Regression, Clustering, Data Mining
ETL Tools: Informatica PowerCenter 9.6.1/9.5.1/9.1.1/8.6.1/8.1
Business Analysis: Requirements Engineering, Business Process Modeling & Improvement, Financial Modeling
Operating Systems: Microsoft Windows 7/8/8.1/10/Vista/XP, Linux (Ubuntu), UNIX
SR. INFORMATICA ETL DEVELOPER
Confidential - RICHMOND, VA
- Developed business requirements and design documents.
- Extracted the raw data from Microsoft Dynamics CRM to staging tables using Informatica Cloud.
- Extensively worked with Repository Manager, Designer, Workflow Manager and Workflow Monitor.
- Developed transformation logic and designed various Complex Mappings and Mapplets using the Designer.
- Developed complex mappings to implement Slowly Changing Dimensions (SCD).
- Worked with the Lookup, Aggregator, Expression, Router, Filter, Update Strategy, Joiner Transformations.
- Developed various worklets that were then included in the workflows. Developed Tableau dashboards by extracting data from different databases.
- Used Workflow Manager to read data from sources, and write data to target databases, and manage sessions.
- Generated various dashboards in Tableau Server using different data sources such as Teradata, Oracle, Microsoft SQL Server, and Microsoft Analysis Services.
- Developed complex mappings to implement Type 2 Slowly Changing Dimensions using transformations such as the Source Qualifier, Aggregator, Expression, Static Lookup, Dynamic Lookup, Filter, Router, Rank, Union, Normalizer, Sequence Generator, Update Strategy, and Joiner.
- Involved in designing tables and implementing Informatica mappings and workflows to extract data from the source systems and populate the Staging Area, Dimension, and Fact tables.
- Performed tuning of Informatica sessions by implementing database partitioning and increasing block size, data cache size, sequence buffer length, and the target-based commit interval.
- Migrated code across environments.
- Executed queries to validate data in the source and target tables.
- Analyzed the business requirements and functional requirements specifications.
- Tracked defects and analyzed test results using the test management tool HP Quality Center.
- Provided end-to-end functional expertise and gave knowledge transfer (KT) to new team members.
- Actively participated in DRB meetings.
- Participated in status calls with onsite clients and QA resources.
- Mentored and coordinated a 13-member team during testing activities for multiple releases.
- Analyzed GIS systems to load the dimension tables.
- Tested all business functionalities and mapped the requirements in QC to analyze coverage.
- Assisted in batch processing and verified the jobs status and data in database tables.
- Collected and verified the necessary data for batch run across team and reported the same.
- Responsible for defect tracking, retesting, and closing defects per the defect life cycle. Analyzed and developed software test strategies, executed the test methods and corresponding SQL scripts to validate the data, and peer-reviewed results.
- Provided XML output files to downstream applications by processing the Retail and Lease contracts from the HOST and Carlos applications, loading them to the STAGE and HUB tables, then to the EDW, and finally rolling up/aggregating the data by business grain into the FACT tables.
- Used Informatica Data Quality (IDQ 8.6.1) for data quality measurement.
- Created TIDAL jobs and schedules based on demand, run on time, run only once and Ad-Hoc.
- Troubleshot failures and worked with the development team to pinpoint and resolve problems.
- Validated transformations against business rules, performed peer reviews and walkthroughs, and coded small Excel macros to run the daily testing reports efficiently.
Environment: Informatica 8.5, UNIX, SQL, Oracle, HP ALM 11, Citrix, Putty, TOAD 10.6.
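The Type 2 slowly-changing-dimension logic built in the mappings above (Lookup plus Update Strategy) can be summarized in a short pure-Python sketch; this is an illustration of the technique, not the actual Informatica code, and the dimension columns (id, city, eff_date, end_date, current) are hypothetical:

```python
from datetime import date

def apply_scd2(dim_rows, incoming, today):
    """Type 2 SCD sketch: when a tracked attribute changes, expire the
    current dimension row and insert a new current version."""
    current = next((r for r in dim_rows
                    if r["id"] == incoming["id"] and r["current"]), None)
    if current is None or current["city"] != incoming["city"]:
        if current is not None:
            # expire the old version instead of overwriting it (history kept)
            current["current"] = False
            current["end_date"] = today
        dim_rows.append({"id": incoming["id"], "city": incoming["city"],
                         "eff_date": today, "end_date": None, "current": True})
    return dim_rows

# first load inserts a row; a later changed value expires it and adds a new one
dim = apply_scd2([], {"id": 1, "city": "Richmond"}, date(2020, 1, 1))
dim = apply_scd2(dim, {"id": 1, "city": "Austin"}, date(2021, 1, 1))
```

In PowerCenter this split between "expire" and "insert" rows is what the Router and Update Strategy transformations perform.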
SR. ETL INFORMATICA DEVELOPER
Confidential - AUSTIN, TX
- Created technical design specification documents for Extraction, Transformation and Loading Based on the business requirements.
- Worked on development, enhancement, and support of the Enterprise Data Warehouse (EDW).
- Extracted the provider records based on the given requirements and sent error messages for the records excluded from the state file.
- Wrote SQL overrides and used filter conditions in source qualifier thereby improving the performance of the mapping.
- Designed and developed mappings using Source Qualifier, Expression, Lookup, Router, Aggregator, Filter, Sequence Generator, Stored Procedure, Update Strategy, joiner and Rank transformations.
- Developed data conversion, quality, cleansing rules and executed data cleansing activities such as data Consolidation, and standardization for the unstructured flat file data.
- Extensively used SQL Scripts and worked in Windows Environment.
- Created a new table that holds process details and auto-generates a Batch ID to populate the data in the extract and exclusion tables.
- Created data breakpoints and error breakpoints for debugging the mappings.
- Involved in analyzing/building the Teradata EDW using Teradata ETL utilities and Informatica.
- Exported, imported the mappings, sessions, worklets, and workflows from development to Test Repository and promoted to Production.
- Used Session parameters, Mapping variables, parameters and created Parameter files for imparting flexible runs of workflows based on changing variable values.
- Extracted data from Oracle and SQL Server then used Teradata for data warehousing.
- Stored data from SQL Server database into Hadoop clusters which are set up in AWS EMR.
- Monitored scheduled, running, completed, and failed sessions using the Workflow Monitor and debugged mappings for the failed sessions.
- Extensively used ETL to load data from different relational databases, XML and flat files.
- Worked with re-usable sessions, decision task, control task and Email tasks for on success, on failure mails.
- Developed MLOAD scripts to load data from Load Ready Files to the Teradata Enterprise Data Warehouse (EDW).
- Deployed and scheduled reports using SSRS to generate all daily, weekly, monthly, and quarterly reports, including current status.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Worked on AWS Data Pipeline to configure data loads from S3 into Redshift. Used AWS (Amazon Web Services) components, downloading and uploading data files (with ETL) to AWS using S3.
- Strong expertise in designing and developing Business Intelligence solutions in staging, populating the Operational Data Store (ODS), Enterprise Data Warehouse (EDW), and Data Marts / Decision Support Systems using the Informatica PowerCenter 9.x/8.x/7.x/6.x ETL tool.
Environment: Informatica PowerCenter 10/9.6, EBS, Informatica BDE, Hive 2.7, HL7, Teradata 12, SSRS, Enterprise Data Warehouse (EDW), Oracle 11g/10g, PL/SQL, Jitterbit, Perl Scripting, SSAS, Autosys, TOAD 9.x, Oracle Financials, Shell Scripting, Python, Spark, Dynamic SQL, Oracle SQL*Loader, SSIS 2008, Sun Solaris UNIX, OBIEE, Windows XP, XML.
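The session-parameter and parameter-file pattern used above (one workflow, different variable values per run) can be sketched as follows; the folder, workflow, and parameter names are hypothetical, though the `[folder.WF:workflow]` section header follows PowerCenter's parameter-file convention:

```python
def build_param_file(folder, workflow, params):
    """Render a PowerCenter-style parameter file section so the same
    workflow can run with different variable values per environment."""
    lines = [f"[{folder}.WF:{workflow}]"]
    lines += [f"{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"

# hypothetical names: a mapping variable and a connection parameter
text = build_param_file("EDW", "wf_load_provider",
                        {"$$LoadDate": "2020-01-31",
                         "$DBConnection_Src": "ORA_SRC"})
```

Generating these files from a script is one common way to impart "flexible runs based on changing variable values" without editing sessions by hand.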
ETL/ INFORMATICA DEVELOPER
Confidential - CLEVELAND, OH
- Involved in gathering, analyzing, and documenting business requirements, functional requirement for database changes, ETL, Business Reports.
- Responsible for architecting, designing, implementing, and supporting ETL and cloud-based infrastructure and its solutions.
- Extracted data from different source systems like Oracle, MySQL, and flat files.
- Extensively used various data cleansing and data conversion functions like LTRIM, RTRIM, TO_DATE, DECODE, and IIF in the Expression Transformation.
- Worked with various transformations in Informatica PowerCenter like the Filter, Aggregator, Joiner, Rank, Router, Sorter, Source Qualifier, and Update Strategy transformations.
- Extensively worked with Joiner join types like normal join, full outer join, master outer join, and detail outer join in the Joiner transformation.
- Worked with PL/SQL to create complex queries, functions, stored procedures, triggers, etc.
- Developed Slowly Changing Dimension Mappings for Type 1 SCD and Type 2 SCD.
- Responsible for best practices like naming conventions, performance tuning, and error handling.
- Responsible for Performance Tuning at the Source level, Target level, Mapping Level and Session Level.
- Performance tuning of SQL scripts using Oracle Explain Plan.
- Responsible for Unit testing and Integration testing of mappings and workflows.
- Reviewed and tested Informatica ETL code and wrote test plans.
- Responsible for 24/7 Informatica production support.
- Involved in various meetings, seminars, presentations and group discussions.
Environment: Informatica PowerCenter 8.6.1, Oracle 11g, MySQL, SQL, PL/SQL, TOAD, Erwin, SQL Server, SSIS, Shell Scripts, UNIX, Windows.
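The cleansing and conversion functions named above (LTRIM, RTRIM, TO_DATE, DECODE, IIF) behave roughly like these pure-Python equivalents; the sample values are hypothetical and the mapping is illustrative, not the Informatica expression language itself:

```python
from datetime import datetime

raw = "  OH  "
left_trimmed = raw.lstrip()    # LTRIM: strip leading whitespace
right_trimmed = raw.rstrip()   # RTRIM: strip trailing whitespace

# TO_DATE(value, 'MM/DD/YYYY'): parse a string into a date value
parsed = datetime.strptime("01/15/2019", "%m/%d/%Y")

# DECODE: translate a code into a value, with a default for unknown codes
state = {"OH": "Ohio", "TX": "Texas"}.get("OH", "Unknown")

# IIF(condition, true_value, false_value): inline conditional
flag = "ACTIVE" if parsed.year >= 2019 else "INACTIVE"
```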
JR. ETL DEVELOPER
- Worked with business analysts and stakeholders to understand the business requirements.
- Designed the blueprint and approach documents for high-level development and design of the ETL applications.
- Developed complex ETL mappings to load dimension and fact tables.
- Worked with type 1 and type 2 dimensional mappings to load data from source to target.
- Designed blueprint and approach documents for each story and obtained approvals from the business analysts to design the ETL mappings.
- Developed mappings, sessions, and workflows, and ran the jobs.
- Used transformations such as Source Qualifier, Normalizer, Expression, Filter, Router, Update Strategy, Sorter, Lookup, Aggregator, Joiner, and Sequence Generator to develop the mappings.
- Used reusable transformations from a shared folder to implement the same logic in different mappings.
- Worked with the testing team to resolve issues found while testing the ETL applications.
- Resolved testing issues and conducted meetings with the testing team to clarify the business requirements and how to test the applications from business and technical perspectives.
- Involved in moving the ETL jobs from the development to the test environment.
- Involved in extensive performance tuning, using the Debugger to determine bottlenecks at various points such as the target, source, mapping, and session levels.
- Participated in weekly status meetings, conducted internal and external reviews as well as formal walkthroughs among various teams, and documented the proceedings.
- Experience in PL/SQL programming, including writing stored procedures, packages, and functions.
Environment: Informatica PowerCenter 7.1, FTP, Windows XP, Oracle 9i, DB2, PL/SQL, SQL Server, Flat files, UNIX, Shell Scripting, Waterfall Methodology.
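Bottleneck hunting of the kind described above (checking targets, sources, mappings, and sessions in turn) can be approximated outside Informatica by timing each stage of a pipeline; a minimal sketch with hypothetical stage functions:

```python
import time

def profile_stages(stages, data):
    """Run pipeline stages in order, recording elapsed seconds per stage
    so the slowest stage (the bottleneck) stands out."""
    timings = {}
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = time.perf_counter() - start
    return data, timings

# hypothetical stages standing in for source read, mapping logic, target write
stages = [
    ("source",  lambda rows: [r for r in rows if r is not None]),
    ("mapping", lambda rows: [r * 10 for r in rows]),
    ("target",  lambda rows: rows),
]
result, timings = profile_stages(stages, [1, None, 2, 3])
bottleneck = max(timings, key=timings.get)
```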
- Wrote queries using T-SQL to create joins, subqueries, and correlated subqueries to retrieve data from the database.
- Identified, tested, and resolved database performance issues (monitoring and tuning) to ensure database optimization.
- Created/Updated database objects like tables, views, stored procedures, function, packages.
- Involved in designing, developing and testing of the ETL (Extract, Transformation and Load) strategy to populate the insurance data from various source systems feeds using SSIS.
- Created views to facilitate easy user interface implementation, and triggers on them to facilitate consistent data entry into the database.
- Rigorously tested and debugged the Stored Procedures and used Triggers to test the validity of the data after the insert, update or delete.
- Involved in table and index partitioning for performance and manageability.
- Monitored the overall performance of the database to recommend and initiate actions to improve/optimize Performance.
- Filtered data from the Transient Stage to the EDW using complex T-SQL statements in the Execute SQL Query Task and in transformations, and implemented various constraints and triggers for data consistency and to preserve data integrity.
- Used SQL Server Profiler to trace slow-running queries and server activity.
- Automated and enhanced daily administrative tasks, including database backup and recovery.
- Created different views using SSRS that were published to the business stakeholders for analysis and customization using filters and actions.
- Created bar charts compiled from data sets and added trend lines and forecasting for future projections.
- Helped create process logging and new monitoring tools, integrity reports, and mapping tools.
- Involved in setting up SQL Server Agent Jobs for periodic Backups with backup devices, database maintenance plans and recovery.
Environment: MS SQL Server 2012, SSIS, SSAS, SSRS, Oracle 11g, Visual Studio 2008, VSS 2005, Erwin 7.3, Windows NT/2003, Microsoft .NET Framework 3.5, C#, ASP.NET, XML.
- Collaborated in JAD sessions with project managers, developers, the QA team, design architects, and design modelers to transform outlined business requirements and specifications into functional/non-functional requirements.
- Performed data profiling, cleansing, validation, and verification with SSIS tasks and SQL stored procedures.
- Implemented constraints to maintain referential, domain, and column integrities.
- Wrote user defined functions to provide custom functionalities in T-SQL.
- Improved existing procedures, triggers, UDFs, and views using execution plans, SQL Profiler, and DTA.
- Performed various transformations such as conditional split and sorting; included data about the environment using the Audit transformation.
- Implemented multiple SSIS features such as logging, transactions, checkpoints, deployment, and configuration on a designed end-to-end ETL strategy to ensure the developed packages were fully optimized.
- Produced reporting data models used by the Report Builder ad hoc reporting tool within Reporting Services.
Environment: MS SQL Server 2008, TSQL, Erwin, SQL Profiler, DTA, TFS, SSIS, SSMS, BIDS
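The conditional-split transformation mentioned above routes each row to one of several outputs based on an expression; a minimal pure-Python sketch of the same idea (the claim records and predicate are hypothetical, not SSIS itself):

```python
def conditional_split(rows, predicate):
    """Route each row to one of two outputs based on a predicate,
    mirroring what an SSIS Conditional Split does with its expressions."""
    matched, default = [], []
    for row in rows:
        (matched if predicate(row) else default).append(row)
    return matched, default

claims = [{"state": "OH", "paid": 120.0},
          {"state": "TX", "paid": 80.0},
          {"state": "OH", "paid": 45.0}]
ohio, other = conditional_split(claims, lambda r: r["state"] == "OH")
```

In SSIS each output would feed a separate downstream path (e.g. different destination tables); here the two lists play that role.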