Data Quality Engineer Resume
Spring, TX
SUMMARY
- Overall 12 years of IT experience, with 9 years of experience as a Product Owner and in System Analysis and Data Analysis across various IT products within Health care systems.
- Performed analysis using IT environments such as Teradata SQL, SQL Server, PL/SQL, and Talend, covering development, maintenance, implementation, support, and administration.
- Experience in Health care applications such as EOB analysis, Claims Analysis, and Clinical data analysis in Facets, MedHok, QNXT, GTE, CCMS, and Ultraport databases.
- Extensive experience in development and maintenance of a corporate-wide ETL solution using SQL, PL/SQL, and Talend 4.x/5.x/6.x on UNIX and Windows platforms.
- Strong experience with Talend tools (Data Integration and Big Data) and with Data Mapper, Joblets, metadata, and Talend components.
- Extensive experience integrating various heterogeneous data sources such as SQL Server, Oracle, flat files, and Excel files, loading the data into Data warehouses and Data marts using Talend Studio.
- Experience with databases like MySQL and Oracle using AWS RDS.
- Experienced in Talend Data Fabric ETL components; used Context Variables and MySQL, Oracle, and Hive database components.
- Experience with scheduling tools such as Autosys, Control-M, and Job Conductor (Talend Admin Console), as well as Kafka topics and MuleSoft.
- Experience with Big Data, Hadoop, HDFS, MapReduce, and Hadoop ecosystem (Pig & Hive) technologies.
- Experience analyzing Hadoop clusters and different big data analytical and processing tools, including Pig, Hive, Spark, and Spark Streaming.
- Extensively created mappings in Talend using tMap, tJoin, tReplicate, tConvertType, tFlowMeter, tLogCatcher, tNormalize, tDenormalize, tJava, tAggregateRow, tWarn, tMysqlSCD, tFilter, tGlobalmap, tDie, etc.
- Excellent experience with NoSQL databases like HBase and Cassandra.
- Excellent understanding of Hadoop architecture, the Hadoop Distributed File System, and APIs.
- Extensive knowledge of business processes and the functioning of the Health Care, Manufacturing, Mortgage, Financial, Retail, and Insurance sectors.
- Strong skills in SQL and PL/SQL, backend programming, and creating database objects like Stored Procedures, Functions, Cursors, Triggers, and Packages.
- Experience in AWS S3, EC2, SNS, SQS setup, Lambda, RDS (MySQL) and Redshift cluster configuration.
- Experienced in Waterfall and Agile/Scrum development.
- Good knowledge of implementing various data processing techniques using Pig and MapReduce for handling the data and formatting it as required.
- Extensively used ETL methodology for performing Data Migration, Data Profiling, Extraction, Transformation, and Loading using Talend, and designed data conversions from a wide variety of source systems like SQL Server, Oracle, and DB2 and non-relational sources like XML, flat files, and Mainframe files.
- Well versed in developing various database objects like packages, stored procedures, functions, triggers, tables, indexes, constraints, and views in Oracle 11g/10g.
- Hands-on experience in running Hadoop streaming jobs to process terabytes of XML-format data using Flume and Kafka.
- Worked on designing and developing logical and physical models using the data modeling tool ERwin.
- Experienced in Code Migration, Version control, scheduling tools, Auditing, shared folders, and Data Cleansing in various ETL tools.
- Good communication and interpersonal skills, the ability to learn quickly, good analytical reasoning, and adaptability to new and challenging technological environments.
- Strong team spirit, relationship management, and presentation skills.
- Expertise in Client-Server application development using MS SQL Server …, Oracle …, PL/SQL, SQL*Plus, TOAD, and SQL*Loader. Worked with various source systems such as Relational Sources, Flat files, XML, Mainframe COBOL and VSAM files, SAP Sources/Targets, etc.
- Worked hands-on with integration processes for the Enterprise Data Warehouse (EDW).
- Knowledge of writing, testing, and implementing Stored Procedures, Functions, and Triggers using Oracle PL/SQL & T-SQL, and of the Teradata data warehouse using BTEQ, COMPRESSION techniques, FastExport, MultiLoad, TPump, and FastLoad scripts.
TECHNICAL SKILLS
ETL/Middleware tools: Talend 5.5/5.6/6.2, Informatica Power Center 9.5.1/9.1.1/8.6.1/7.1.1
Scripting: JavaScript, Jelly Scripting, HTML, CSS, XML, DHTML
Operating Systems: Windows 2008/2003/2000, XP, 7; Unix/Linux
Languages: C, C++, SQL, PL/SQL, JAVA/J2EE, Lotus Script
Databases: Oracle 8i/9i/10g, SQL Server 2008/2005/2000, DB2, Teradata 14/13
Data Modeling: Dimensional Data Modeling, Star Join Schema Modeling, Snow-Flake Modeling, Fact and Dimension tables, Physical and Logical Data Modeling
Business Intelligence Tools: Business Objects 6.0, Cognos 8BI/7.0, Sybase, OBIEE 11g/10.1.3.x
RDBMS: Oracle 11g/10g/9i, Teradata, MS SQL Server 2014/2008/2005/2000, DB2, MySQL, MS Access.
Tools: TOAD, SQL Plus, SQL*Loader, Quality Assurance, SoapUI, FishEye, Subversion, SharePoint, Ipswitch, Teradata SQL Assistant.
PROFESSIONAL EXPERIENCE
Confidential
Data Quality Engineer
Responsibilities:
- Worked as a liaison with source SMEs and Product Owners to understand the current core systems based on the data flow.
- Worked with the ETL Development team, providing walkthroughs on the mapping documents for each FHIR resource (US Core Clinical, CARIN BB EOB), such as gap analysis and STTM mapping documents.
- Worked on US Core clinical profiles: US Core Goal, Condition, Encounter, Observation Lab, Pulse Oximetry, Diagnostic Report, Diagnostic Note, Care Plan, Care Team, Practitioner, Organization, Location, Immunization, Allergy Intolerance, etc.
- Worked with test use cases, data gaps, and business test scenarios, and tested interoperability.
- Supported development teams on possible data quality gaps when comparing against the FHIR JSON model.
- Extensively leveraged the Teradata, SQL Server, and Informatica environments.
- Extensively collaborated with programmers, Engineering Architects, and organization leaders to identify opportunities for process improvements under the guidance of Data Governance.
Confidential, Spring, TX
Data Modeling Analyst
Responsibilities:
- Conducted scrum meetings, requirements gathering, data discovery, data analysis, data profiling, STTM creation, and JAD sessions, and interpreted data from various source systems.
- Worked as a liaison with source SMEs and Product Owners for QNXT, MedHok, FACETS, membership, CCW, CCMS, Arcadia, and GTE to understand the current core systems based on the data flow.
- Worked with the ETL Development team, providing walkthroughs on the mapping documents for each FHIR resource (US Core Clinical, CARIN BB EOB), such as gap analysis, HI2 mapping, and STTM mapping documents.
- Worked on US Core clinical profiles: US Core Goal, Condition, Encounter, Observation Lab, Pulse Oximetry, Diagnostic Report, Diagnostic Note, Care Plan, Care Team, Practitioner, Organization, Location, Immunization, Allergy Intolerance, etc.
- Provided extensive support to the Development Team.
- Provided support to the AWS Engineering team in clarifying FHIR follow-up questions.
- Extensively used analytical tools to interpret data sets, mining data from primary and secondary sources and paying particular attention to the patterns of EOB and US Core Clinical data using Teradata environments.
- Demonstrated the significance of findings in the context of local trends that impact both the organization and the industry.
- Designed and customized data models for the Data warehouse, supporting data from multiple sources in real time.
- Involved in building the Data Ingestion Architecture and Source to Target mapping to load data into the Data warehouse.
- Extensively leveraged the Teradata, SQL Server, and Informatica environments.
- Extensively collaborated with programmers, Engineering Architects, and organization leaders to identify opportunities for process improvements under the guidance of Data Governance.
- Created appropriate documentation that allows stakeholders to understand the steps of the data analysis process, which helps them make data-driven business decisions and communicate the value of the information effectively.
Confidential, TX
Product Owner
Responsibilities:
- Participated in requirement gathering, data analysis, business analysis, and kick-off meetings, and translated user inputs into ETL mapping documents.
- Worked with membership SMEs to understand when a new group is onboarded, understood the data architecture for the new group's data flow, and worked with business SMEs on EviCore downstream requirements.
- Worked with Pre-Benefit and Post-Benefit teams on member eligibility, claims eligibility, and authorization files.
- Designed and customized data models for the Data warehouse, supporting data from multiple sources in real time.
- Involved in building the Data Ingestion architecture and Source to Target mapping to load data into the Data warehouse.
- Extensively leveraged the Talend Big Data components (tHDFSOutput, tPigMap, tHive, tHDFSCon) for Data Ingestion and Data Curation from several heterogeneous data sources.
- Worked with business user queries for eligibility validation and validated crosswalk tables for member existence (a minimal SQL sketch follows this job entry).
- Worked with the Data Mapping team to understand the source to target mapping rules.
- Prepared both High level and Low-level mapping documents.
- Analyzed the requirements, framed the business logic, and implemented it using Talend.
- Involved in ETL design and documentation.
- Developed Talend jobs from the mapping documents and loaded the data into the warehouse.
- Involved in end-to-end Testing of Talend jobs.
- Analyzed and performed data integration using the Talend open integration suite. Experienced in architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Migrated various Hive UDFs and queries to Spark SQL for faster processing.
- Configured Spark Streaming to receive real-time data from Apache Kafka and stored the stream data to Confidential using Scala.
- Hands-on experience in Spark and Spark Streaming, creating RDDs and applying operations (transformations and actions).
- Developed and implemented Hive custom UDFs involving date functions.
- Involved in developing Shell scripts to orchestrate execution of all other scripts and move the data files within and outside of Confidential.
- Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Used Kafka for publish-subscribe messaging as a fast, scalable, and durable distributed commit log.
- Worked on the design, development, and testing of Talend mappings.
- Wrote complex SQL queries to extract data from various sources and integrated it with Talend.
- Created ETL job infrastructure using Talend Open Studio.
- Worked on Talend components like tReplace, tMap, tSortRow, tFilterColumn, tFilterRow, etc.
- Used Database components like tMSSQLInput, tOracleOutput etc.
- Worked with various File components like tFileCopy, tFileCompare, tFileExist.
- Developed standards for the ETL framework for ease of reusing similar logic across the board.
- Analyzed requirements, created designs, and delivered documented solutions that adhere to the prescribed Agile development methodology and tools.
- Developed mappings to extract data from different sources like DB2 and XML files and loaded it into the Data Mart.
- Created complex mappings by using different transformations like Filter, Router, lookups, Stored procedure, Joiner, Update Strategy, Expressions and Aggregator transformations to pipeline data to Data Mart.
- Involved in designing Logical/Physical Data Models and reverse engineering for the entire subject area across the schema.
- Scheduled and automated ETL processes using the scheduling tools Autosys and TAC.
- Scheduled the workflows using Shell scripts.
- Created Talend Development Standards: this document describes the general guidelines for Talend developers, the naming conventions to be used in transformations, and the development and production environment structures.
- Troubleshot databases, Joblets, mappings, sources, and targets to find bottlenecks and improve performance.
- Involved rigorously in Data Cleansing and Data Validation to identify and correct corrupted data.
Environment: Talend 6.x, Teradata 16.2, DB2, Unix, Zena, SQL Server 2008, SQL, MS Excel, UNIX Shell Scripts, Talend Administrator Console, Cassandra, Oracle, Jira, SVN, Quality Center, and Agile Methodology, TOAD, Autosys.
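Illustrative example (a minimal sketch, not taken from the engagement): the kind of crosswalk/eligibility-validation query described above. All table and column names (member_eligibility, member_crosswalk, member_key, etc.) are assumptions for illustration only.

    -- Flag members missing from the crosswalk or having an invalid coverage span.
    SELECT  me.source_member_id,
            me.group_id,
            me.eligibility_start,
            me.eligibility_end
    FROM    member_eligibility me
    LEFT JOIN member_crosswalk xw
           ON xw.source_member_id = me.source_member_id
          AND xw.source_system    = me.source_system
    WHERE   xw.member_key IS NULL                       -- member not found in the crosswalk
       OR   me.eligibility_end < me.eligibility_start;  -- invalid eligibility window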
Confidential, Atlanta, GA
Product Owner
Responsibilities:
- Participated in requirement gathering, business analysis, and user meetings, and translated user inputs into ETL mapping documents.
- Designed and customized data models for the Data warehouse, supporting data from multiple sources in real time.
- Involved in building the Data Ingestion architecture and Source to Target mapping to load data into the Data warehouse.
- Extensively leveraged the Talend Big Data components (tHDFSOutput, tPigMap, tHive, tHDFSCon) for Data Ingestion and Data Curation from several heterogeneous data sources.
- Worked with the Data Mapping team to understand the source to target mapping rules.
- Prepared both High level and Low-level mapping documents.
- Analyzed the requirements, framed the business logic, and implemented it using Talend.
- Involved in ETL design and documentation.
- Developed Talend jobs from the mapping documents and loaded the data into the warehouse.
- Involved in end-to-end Testing of Talend jobs.
- Analyzed and performed data integration using the Talend open integration suite. Experienced in architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Worked on analyzing Hadoop clusters and different big data analytical and processing tools, including Pig, Hive, Spark, and Spark Streaming.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
Environment: Talend 6.x, XML files, DB2, Oracle 11g, Netezza 4.2, SQL Server 2008, SQL, MS Excel, MS Access, UNIX Shell Scripts, Talend Administrator Console, Cassandra, Oracle, Jira, SVN, Quality Center, and Agile Methodology, TOAD, Autosys.
Confidential, Hartford, CT
Product Lead
Responsibilities:
- Worked on SSAS in creating data sources, data source views, named queries, calculated columns, cubes, dimensions, roles and deploying of analysis services projects.
- SSAS Cube Analysis using MS-Excel and PowerPivot.
- Implemented SQL Server Analysis Services (SSAS) OLAP Cubes with Dimensional Data Modeling using Star and Snowflake schemas.
- Developed standards for the ETL framework for ease of reusing similar logic across the board.
- Analyzed requirements, created designs, and delivered documented solutions that adhere to the prescribed Agile development methodology and tools.
- Responsible for creating fact, lookup, dimension, staging tables and other database objects like views, stored procedure, function, indexes and constraints.
- Developed complex Talend ETL jobs to migrate the data from flat files to a database.
- Implemented custom error handling in Talend jobs and worked on different methods of logging.
- Followed the organization-defined naming conventions for the flat file structures, Talend jobs, and daily batches for executing the Talend jobs.
- Exposure to ETL methodology supporting the Data Extraction, Transformation, and Loading process in a corporate-wide ETL solution using Talend Open Studio for Data Integration 5.6; worked on real-time Big Data integration projects leveraging Talend Data Integration components.
- Analyzed and performed data integration using Talend open integration suite.
- Wrote complex SQL queries to ingest data from various sources and integrated it with Talend.
- Worked on Talend Administration Console (TAC) for scheduling jobs and adding users.
- Worked on Context variables and defined contexts for database connections, file paths for easily migrating to different environments in a project.
- Developed mappings to extract data from different sources like DB2 and XML files and loaded it into the Data Mart.
- Created complex mappings by using different transformations like Filter, Router, lookups, Stored procedure, Joiner, Update Strategy, Expressions and Aggregator transformations to pipeline data to Data Mart.
- Involved in designing Logical/Physical Data Models and reverse engineering for the entire subject area across the schema.
- Scheduled and automated ETL processes using the scheduling tools Autosys and TAC.
- Scheduled the workflows using Shell scripts.
- Used the most common Talend components (tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tSetGlobalVar, tHashInput & tHashOutput, and many more).
- Created many complex ETL jobs for data exchange from and to Database Server and various other systems including RDBMS, XML, CSV, and Flat file structures.
- Developed stored procedures to automate the testing process, easing QA efforts and reducing the test timelines for data comparison on tables (a minimal sketch follows this job entry).
- Automated SFTP process by exchanging SSH keys between UNIX servers.
- Worked Extensively on Talend Admin Console and Schedule Jobs in Job Conductor.
- Involved in production deployment activities and the creation of the deployment guide for migrating code to production; also prepared production run books.
- Created Talend Development Standards: this document describes the general guidelines for Talend developers, the naming conventions to be used in transformations, and the development and production environment structures.
Environment: Talend 5.x/5.6, XML files, DB2, Oracle 11g, SQL Server 2008, SQL, MS Excel, MS Access, UNIX Shell Scripts, TOAD, Autosys.
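Illustrative example (a minimal sketch, not the actual procedure): a data-comparison stored procedure of the kind referenced above for QA automation. The procedure name and the qa_compare_results table are assumptions for illustration only.

    -- Compare row counts between a source and a target table and log any mismatch.
    CREATE OR REPLACE PROCEDURE compare_row_counts (
        p_source_table IN VARCHAR2,
        p_target_table IN VARCHAR2
    ) IS
        v_source_count NUMBER;
        v_target_count NUMBER;
    BEGIN
        EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM ' || p_source_table INTO v_source_count;
        EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM ' || p_target_table INTO v_target_count;

        IF v_source_count <> v_target_count THEN
            -- qa_compare_results is an assumed QA audit table.
            INSERT INTO qa_compare_results
                (source_table, target_table, source_count, target_count, run_date)
            VALUES
                (p_source_table, p_target_table, v_source_count, v_target_count, SYSDATE);
            COMMIT;
        END IF;
    END compare_row_counts;
    /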
Confidential, Richardson, TX
Sr. Data/Business Analyst
Responsibilities:
- Implemented File Transfer Protocol operations using Talend Studio to transfer files between network folders.
- Experienced in fixing errors by using the debug mode of Talend.
- Created complex mappings using tHashOutput, tMap, tHashInput, tDenormalize, tUniqueRow, tPivotToColumnsDelimited, tNormalize, etc.
- Scheduled the Talend jobs with Talend Admin Console, setting up best practices and a migration strategy.
- Used components like tJoin, tMap, tFilterRow, tAggregateRow, tSortRow, Target Connections and Source Connections.
- Mapped source files and generated target files in multiple formats like XML, Excel, and CSV.
- Transformed the data and reports retrieved from various sources and generated derived fields.
- Reviewed the design and requirements documents with architects and business analysts to finalize the design.
- Created WSDL data services using Talend ESB.
- Created Rest Services using tRESTRequest and tRESTResponse components.
- Used the tESBConsumer component to call a method from an invoked Web Service.
- Implemented a few Java functionalities using the tJava and tJavaFlex components.
- Developed shell scripts and PL/SQL procedures for creating/dropping tables and indexes to improve performance.
- Attended technical review meetings.
- Implemented Star Schema for De-normalizing data for faster data retrieval for Online Systems.
- Involved in unit testing and system testing and preparing Unit Test Plan (UTP) and System Test Plan (STP) documents.
- Responsible for monitoring all the jobs that are scheduled, running, completed, or failed. Involved in debugging failed jobs using the debugger to validate the jobs and gain troubleshooting information about data and error conditions.
- Performed metadata validation, reconciliation and appropriate error handling in ETL processes.
- Developed various reusable jobs and used them as sub-jobs in other jobs.
- Used context variables to increase the efficiency of the jobs.
- Made extensive use of SQL commands within the TOAD environment to create target tables.
Environment: Talend 5.1, Oracle 11g, DB2, Sybase, MS Excel, MS Access, TOAD, SQL, UNIX
Confidential, Bergen, TX
Team Lead
Responsibilities:
- Collaborated with the Business Analysts and the DBA for requirements gathering, business analysis, testing, and project coordination.
- Created PL/SQL Stored Procedures, Functions, Triggers, and Packages to transfer the data.
- Implemented File Transfer Protocol operations using Talend Studio to transfer files between network folders.
- Experienced in fixing errors by using the debug mode of Talend.
- Created complex mappings using tHashOutput, tMap, tHashInput, tDenormalize, tUniqueRow, tPivotToColumnsDelimited, tNormalize, etc.
- Scheduled the Talend jobs with Talend Admin Console, setting up best practices and a migration strategy.
- Used components like tJoin, tMap, tFilterRow, tAggregateRow, tSortRow, Target Connections, and Source Connections.
- Mapped source files and generated target files in multiple formats like XML, Excel, and CSV.
- Transformed the data and reports retrieved from various sources and generated derived fields.
- Reviewed the design and requirements documents with architects and business analysts to finalize the design.
- Created WSDL data services using Talend ESB.
- Created REST Services using tRESTRequest and tRESTResponse components.
- Used the tESBConsumer component to call a method from an invoked Web Service.
- Implemented a few Java functionalities using the tJava and tJavaFlex components.
- Developed shell scripts and PL/SQL procedures for creating/dropping tables and indexes to improve performance.
- Attended technical review meetings.
- Implemented Star Schema for De-normalizing data for faster data retrieval for Online Systems.
- Involved in unit testing and system testing and preparing Unit Test Plan (UTP) and System Test Plan (STP) documents.
- Responsible for monitoring all the jobs that are scheduled, running, completed, or failed. Involved in debugging failed jobs using the debugger to validate the jobs and gain troubleshooting information about data and error conditions.
- Performed metadata validation, reconciliation and appropriate error handling in ETL processes.
- Developed various reusable jobs and used them as sub-jobs in other jobs.
- Used context variables to increase the efficiency of the jobs.
- Made extensive use of SQL commands within the TOAD environment to create target tables.
- Developed Shell scripts to automate execution of SQL scripts that check incoming data against master tables, insert valid data into the Customer Management System, and insert invalid data into error tables, which are sent back to the sender to report the errors.
- Created data mapping files for the data coming from different web services.
- Developed SQL and PL/SQL scripts to transfer tables across schemas and databases.
- Developed procedures for an efficient error-handling process by capturing errors in user-managed tables.
- Extensively used Cursors, User-defined Object types, Records, and Tables in PL/SQL Programming for generating worksheets.
- Involved extensively in tuning SQL queries and PL/SQL scripts by using SQL rewrites, partitions, and indexes.
- Created Views, Materialized Views, and Indexes for better performance of summary tables; used Autonomous Transactions and coded Dynamic SQL statements.
- Worked with Java developers to repair and enhance current base of PL/SQL packages to fix production issues, build new functionality and improve processing time through code optimizations and indexes.
- Used the UTL_FILE package for writing DBMS_OUTPUT messages to files.
- Supported code integration and system integration testing for this application.
- Worked with the ETL team involved in loading data to the staging area; provided all business rules for loading data into the database.
- Worked on the ETL process of loading data from different sources and the data validation process from the staging area to the catalog database.
- Created UNIX shell scripts for Encrypting and Decrypting data files.
- Worked with DBA to implement compression on Database tables.
- Implemented Interval range and Hash partitions to help improve the performance of SQL queries (see the DDL sketch below).
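Illustrative example (a minimal sketch, not the actual DDL): an interval range-partitioned table with hash subpartitions of the kind referenced above. The table and column names are assumptions for illustration only.

    -- Monthly interval partitions on service_date, hash subpartitions on member_id.
    CREATE TABLE claims_fact (
        claim_id      NUMBER        NOT NULL,
        member_id     NUMBER        NOT NULL,
        service_date  DATE          NOT NULL,
        paid_amount   NUMBER(12,2)
    )
    PARTITION BY RANGE (service_date)
    INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))           -- new monthly partitions are created automatically
    SUBPARTITION BY HASH (member_id) SUBPARTITIONS 8
    (
        PARTITION p_initial VALUES LESS THAN (DATE '2015-01-01')
    );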
Confidential, Charlotte
Business/Systems Analyst
Responsibilities:
- Interacted with the Business Analyst Leads to understand the business requirements for the project.
- Gathered requirements from the users, analyzed their business needs, and created SRS documents.
- Designed, created, and tested enhancements to a high-performance OLTP client-server system processing both daily and scheduled financial transactions.
- Extensively used advanced PL/SQL concepts (e.g., arrays, PL/SQL tables, cursors, user-defined object types, exception handling, database packages, nested tables) to manipulate data.
- Created indexes on tables to improve the performance of the queries.
- Used BULK COLLECT to fetch large volumes of data.
- Developed database objects including tables, clusters, Indexes, views, sequences, packages, triggers and procedures to troubleshoot any database problems.
- Involved in SDLC including designing, coding and testing.
- Collaborated with Data Warehouse Architects in writing PL/SQL scripts, shell programs, and data flows.
- Involved in logical and physical modelling of the application.
- Extracted data from flat files and transformed it in accordance with the business logic specified by the client using SQL*Loader.
- Used Ref Cursors & Collections for accessing complex data resulting from joins of a large number of tables.
- Involved in validating the data during data migration by creating PL/SQL Packages, Procedures, Functions, and Triggers.
- Created Indexes for faster retrieval of customer information and to enhance database performance.
- Used complex SQL queries including inline queries and sub queries for faster data retrieval.
- Worked with BULK COLLECT to improve the performance of multi-row queries (see the PL/SQL sketch after this list).
- Used Exception Handling extensively to debug and display error messages in the application.
- Created and modified several UNIX Shell Scripts according to the changing needs of the project and client requirements.
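Illustrative example (a minimal sketch, not project code): fetching rows with BULK COLLECT and a LIMIT clause, as referenced above. The cursor and table names are assumptions for illustration only.

    -- Fetch rows in batches to bound PGA memory while improving multi-row query performance.
    DECLARE
        CURSOR c_customers IS
            SELECT customer_id, customer_name
            FROM   customers;
        TYPE t_customers IS TABLE OF c_customers%ROWTYPE;
        l_batch t_customers;
    BEGIN
        OPEN c_customers;
        LOOP
            FETCH c_customers BULK COLLECT INTO l_batch LIMIT 1000;
            EXIT WHEN l_batch.COUNT = 0;
            FOR i IN 1 .. l_batch.COUNT LOOP
                NULL;  -- row-level processing would go here
            END LOOP;
        END LOOP;
        CLOSE c_customers;
    END;
    /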
Confidential
Sr. Developer
Responsibilities:
- Extensively used advanced PL/SQL concepts (e.g., arrays, PL/SQL tables, cursors, user-defined object types, exception handling, database packages, nested tables) to manipulate data.
- Created indexes on tables to improve the performance of the queries.
- Used BULK COLLECT to fetch large volumes of data.
- Developed database objects including tables, clusters, Indexes, views, sequences, packages, triggers and procedures to troubleshoot any database problems.
- Involved in SDLC including designing, coding and testing.
- Collaborated with Data Warehouse Architects in writing PL/SQL scripts, shell programs, and data flows.
- Involved in logical and physical modelling of the application.
- Extracted data from flat files and transformed it in accordance with the business logic specified by the client using SQL*Loader.