Sr. Data Quality Engineer Resume
Richardson, TX
SUMMARY:
- 7+ years of experience in business and data analysis, developing BI applications, and performing statistical analysis and reporting in the Banking, Construction, Healthcare, and Retail industries using RDBMS, BI, and Big Data technologies.
- Excellent understanding of the Software Development Life Cycle (SDLC) and Testing Life Cycle (TLC) in Waterfall, Agile (Scrum), hybrid Scrum + Kanban, and Lean development environments.
- Experienced working in Cloud/DevOps environments (Azure & AWS).
- Extensively worked with tools like Jenkins and TeamCity to implement build automation.
- Expert in using JIRA and Rally with Jenkins and GitHub for real-time bug tracking and issue management.
- Experienced QA Analyst/ETL Analyst/Tester with excellent understanding of ETL processes, OLAP Systems, data warehousing, Star and Snowflake Schema and Slowly Changing Dimensions.
- Expert in writing complex SQL and PL/SQL queries to perform back-end testing on databases such as Oracle, SQL Server, Teradata, and Big Data platforms (an illustrative query follows this summary).
- Experienced in preparation and execution of Test Strategy, Test Plan, Test Scenarios, Test Procedure, Test Scripts, and Test Data for SIT and UAT.
- Experience working on complex, data-driven assignments that require the ability to understand and navigate large data structures and to manipulate and cleanse data.
- Advanced knowledge and experience in relational databases (Teradata, DB2, SQL Server, Oracle), data warehousing, and ETL/ELT technologies.
- Performed analysis to help clients gain insight into their business and customer behavior through analysis of campaign data, transactional data, and customer profile data.
- Experience in Cloudera Hadoop (CDH), Microsoft Azure cloud services, and Hortonworks Hadoop environments.
- Experience in setting up connections to different databases such as Oracle, SQL Server, DB2, and Hadoop according to user requirements.
- Experience in analyzing and translating business requirements to technical requirements and modeling the Logical and Physical Data Models.
- Experience in supporting and working with cross-functional teams in a dynamic environment.
- Experience in designing and implementing continuous integration, continuous delivery, and continuous deployment through Jenkins.
- Hands-on experience in writing Python and Bash scripts.
- Implemented test cases for Functional, Regression, Configuration, Performance, Back-End, and User Acceptance Testing (UAT).
- Good experience using the Teradata RDBMS and its utilities (BTEQ, FastLoad, MultiLoad, FastExport, SQL Assistant, WinDDI).
- Hands-on experience testing third-party ETL tool jobs, including IBM DataStage, Informatica, and Talend.
- Excellent understanding of Software Development Life Cycle and dimensional modeling.
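Illustrative example: a minimal sketch of the kind of back-end validation query referenced in this summary, checking a staging table for duplicate business keys; the table and column names (stg_customer, customer_id) are hypothetical.

    -- Flag business keys that appear more than once in the staging table.
    SELECT customer_id, COUNT(*) AS dup_count
    FROM stg_customer
    GROUP BY customer_id
    HAVING COUNT(*) > 1;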
TECHNICAL SKILLS:
Databases: Teradata V2R6/12/13, Big Data, Hadoop, DB2, MS SQL Server 2005/2008, MS Access, Oracle 11g/10g/9i
Languages/Cloud: SQL, Hive Query Language (HQL), PL/SQL, ANSI SQL, XML, UNIX Shell, Python, Azure Data Lake, Azure Data Factory, Azure Databricks, Azure SQL Database, Azure SQL Data Warehouse
Tools: SQL Assistant, IBM DataStage 9.1/8.5, Hive, JIRA, Rally, RTC, SecureCRT, Zena, qTest, SecureFX, Beeline, PuTTY, DS Scheduler, HP ALM, ToscaBI, Jenkins, TeamCity, GitLab
Other Tools: Microsoft Office Professional 5.0/2002, Microsoft Teams, Toad, Altova XMLSpy, Talend, Notepad++
Data Modeling: Dimensional Data Modeling, Star Schema Modeling, Snowflake Modeling, Fact and Dimension Tables, Physical and Logical Data Modeling, Oracle Designer
Operating Systems: Windows XP/7/10, Linux, macOS, UNIX
WORK EXPERIENCE:
Confidential, Richardson, TX
Sr. Data Quality Engineer
Responsibilities:
- Interacted with the SMEs of each project and gathered migration information to develop the test strategy and user acceptance criteria.
- Worked in an Agile software development methodology and participated in daily scrum and sprint meetings to update progress to all stakeholders and accomplish the planned sprint stories and tasks.
- Worked with the ingestion team in translating business requirements to technical requirements and modeling the logical and physical data models.
- Performed ETL testing, source-to-target testing, and other data flow testing against homogeneous and heterogeneous sources, loading CSV, XML, and JSON files into tables as well as table to table.
- Responsible for creating complete test cases, test plans, and test data, and for reporting status, ensuring accurate coverage of requirements and business processes.
- Worked within a scaled agile framework where continuous improvement is key and understood the importance of timely deliveries.
- Worked on retail membership, claims, clinical, and provider data for E2E testing and UAT.
- Worked on Medicare, Medicaid, and MedSupp data validations and built automated source-to-target validation scripts.
- Worked on GPD, HEDIS, and TMG data, which fall under the Affordable Care Act medical insurance system for retail.
- Tested data quality, duplicates, and counts for Hive tables and files in the Hadoop file system to ensure the source data was properly loaded and tested.
- Responsible for interfacing with research and advanced business teams to bring new insurance scenario concepts into production and then mocking up data to validate them.
- Designed, developed, and executed test cases and test reports using qTest, ToscaBI, and HP ALM.
- Reported bugs in JIRA and collaborated with developers to resolve them; ensured defects were logged and linked properly to the JIRA stories and test case requirements.
- Wrote SQL queries to compare data after it was moved from Teradata to Hadoop (see the illustrative HQL after this list).
- Collaborated with the offshore team, reviewed their tasks, and provided status updates to the onshore team on a daily basis.
- Performed target-to-target testing after running the jobs in both DataStage versions 8.5 and 11.5.
- Involved in writing complex HQL queries to verify data from source to target using Hive.
- Prepared documentation of recurring defects, their resolutions, and the associated business comments.
- Used the Connector Migration Tool before migrating the DataStage jobs.
- Familiar with Health Insurance Portability and Accountability Act (HIPAA) regulations and Sensitive Personal Information (SPI) and Protected Health Information (PHI) standards.
- Tracked and reported issues to the project team and management during test cycles.
- Updated and maintained the existing test matrix and test cases based on code changes and enhancements to assigned applications.
- Contributed to the development of knowledge transfer documentation.
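Illustrative example: the kind of HQL used for source-to-target verification after the Teradata-to-Hadoop migration; the table and key names (src_claims, tgt_claims, claim_id) are hypothetical, and since older Hive versions lack MINUS, mismatches are surfaced with an anti-join.

    -- Row counts should match between the migrated source snapshot and the Hive target.
    SELECT COUNT(*) FROM src_claims;
    SELECT COUNT(*) FROM tgt_claims;

    -- Source rows with no matching target row (anti-join in place of MINUS).
    SELECT s.claim_id
    FROM src_claims s
    LEFT JOIN tgt_claims t ON s.claim_id = t.claim_id
    WHERE t.claim_id IS NULL;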
Environment: Teradata SQL Assistant, Hadoop (HDFS), Hive, Data Lake, Agile, qTest, IBM Rational Team Concert (RTC), IBM DataStage, JIRA, DB2, Zena, DataStage Connector Migration Tool, UNIX, ToscaBI, WinSCP, Microsoft Access.
Confidential, Windsor Locks, CT
ETL Tester/ Data Engineer
Responsibilities:
- Collaborated daily with Agile teams and worked with system developers, database architects, and application and principal systems engineers to break down issues or impediments relating to banking systems.
- Responsible for end-to-end regression testing to provide qualitative data for UAT and Business.
- Analyzed, designed, and built modern data solutions using Azure PaaS services to support visualization of data; understood the current production state of the application and determined the impact of new implementations on existing business processes.
- Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
- Performed data quality tasks in a continuous integration environment built around Jenkins and Bitbucket.
- Prepared quality test cases and test plans in accordance with Agile team deliverables.
- Involved in preparing source-to-target mappings and other technical specification documents for the ETL jobs.
- Extensively involved in the reconciliation and recovery processes for capturing incremental changes in the source systems and updating the staging area and data warehouse, respectively.
- Developed complex SQL queries to fetch data from the database per business requirements and used those queries for further processing.
- Tested data validation and data conditioning, interacting with Acxiom to standardize name and address information, and the entire delta processing for different lines of business.
- Developed and executed test cases and test reports using HP ALM and ToscaBI.
- Used various high-level tools and databases in testing, triaging, troubleshooting, and error analysis.
- Identified various Data Sources and Development Environment.
- Worked as a team member for the testing of Automotive insights data mart and integrating it with Global Data Warehouse.
- Validated data loaded from the operational data store (ODS) into data warehouse tables by writing and executing foreign key validation queries against the star schema's fact and dimension/lookup tables (a sketch follows this list).
- Worked on Teradata and Teradata utilities such as TPump, BTEQ, and FastLoad.
- Tested complex Talend mappings that loaded data from various sources using transformations such as tMap, file joins, expression, aggregator, filter, and normalizer.
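Illustrative example: a minimal sketch of the foreign key validation described in the ODS-to-warehouse bullet above; the schema, fact, and dimension names (dw.sales_fact, dw.customer_dim, customer_key) are hypothetical.

    -- Fact rows whose surrogate key has no matching dimension row (orphans).
    SELECT f.customer_key, COUNT(*) AS orphan_rows
    FROM dw.sales_fact f
    LEFT JOIN dw.customer_dim d ON f.customer_key = d.customer_key
    WHERE d.customer_key IS NULL
    GROUP BY f.customer_key;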
Environment: Cloudera Hadoop (CDH), Oracle 11g, Teradata 15, Teradata BTEQ, SQL Assistant, MS Visio, Toad, SQL*Loader, Informatica PowerCenter 9.6.2, Talend Open Studio 5.5, ToscaBI, HP ALM, UNIX Shell Scripts, Jenkins, Azure Data Lake, Azure Data Factory, Azure Databricks, Azure SQL Database, Azure SQL Data Warehouse, Python
Confidential, Minneapolis, MN
Data Quality Analyst
Responsibilities:
- Involved in creating the test plan, strategy, and test scripts for testing the marketing data mart processes.
- Tested ETL, ELT, and ECTL processes with source systems as varied as Oracle RDBMS, DB2, and flat files.
- Generated test data sets to validate business rules.
- BTEQ, FastLoad, and MultiLoad were the primary Teradata utilities used in the ETL processes.
- Created and modified UNIX shell scripts for file processing and automation.
- Tested Business Objects reports for validity.
- Extensively used SQL set operators (INTERSECT, MINUS, UNION, UNION ALL) to compare data between the source and target (see the example after this list).
- Gathered, documented, and analyzed user requirements for the operational data marts.
- Acted as the tester for all ETL jobs, reading data from the Corporate Information Factory and other vendors, and finally loading dimension, fact, and other aggregate tables.
- Coordinated improvements and enhancements to the data marts by acting as a bridge between the users and the development teams.
- Performance-tuned ETL batch jobs, reporting, and ad-hoc SQL.
- Involved in testing reporting solutions using Business Objects, BI Query and ad hoc SQL.
- Made intensive use of Teradata BTEQ for data transformation and application of business rules to the source data.
- Provided on-call support for the entire EDW process and solved numerous issues arising from batch processing in minimal time.
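Illustrative example: a source-versus-target comparison using the set operators mentioned above; the schema and table names (src.orders, tgt.orders) are hypothetical. Running MINUS in both directions surfaces rows missing from either side.

    -- Rows present in the source but missing from the target.
    SELECT * FROM src.orders
    MINUS
    SELECT * FROM tgt.orders;

    -- Rows present in the target but missing from the source.
    SELECT * FROM tgt.orders
    MINUS
    SELECT * FROM src.orders;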
Environment: Oracle 11g, Teradata 14, Hadoop, Hive, Informatica 9.5, Teradata BTEQ/FastLoad/MultiLoad
Confidential, East Hartford, CT
Data Quality Analyst
Responsibilities:
- Interacted extensively with business owners on requirements gathering, analysis, and documentation.
- Met with customers to determine user requirements and business goals.
- Blended technical and business knowledge with communication skills to bridge the gap between internal business and technical objectives and serve as an IT liaison with the business user constituents.
- Conducted JAD sessions to gather requirements, performed use case and workflow analysis, outlined business rules, and developed domain object models.
- Created a conceptual design based on the interaction with the functional and technical team.
- Analyzed the existing data model and incorporated new additions for the advancement data, identifying the cardinality of the new tables to the existing tables and ensuring proper referential integrity of the system.
- Performed report validation by writing SQL queries against Teradata and Oracle RDBMS (a sketch follows this list).
- Created process documents, reporting specs and templates, training material, and slideshow presentations for the application development teams and management.
- Analyzed the data, identifying data sources and data mappings for BHC.
- Transformed requirements into data structures that can be used to efficiently store, manipulate, and retrieve information.
- Collaborated with data modelers and ETL developers in creating the data functional design documents.
- Created and maintained specifications and process documentation to produce the required data deliverables (data profiling, source-to-target maps, flows).
- Worked in collaboration with various areas of the organization, identified additional stakeholder requirements, and documented them in a Software Requirements Specification document.
- Managed production support for critical processes and resolved complex execution issues and production problems on critical scenarios.
- Effectively prepared and presented various performance reports and presentations.
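Illustrative example: the kind of report-validation query described above, reconciling a report's monthly totals against aggregates recomputed from the detail table; the table and column names (monthly_report, txn_detail, period_month, amount) are hypothetical.

    -- Report totals should match totals recomputed from the detail rows.
    SELECT r.period_month, r.total_amount, d.calc_amount
    FROM monthly_report r
    JOIN (
        SELECT period_month, SUM(amount) AS calc_amount
        FROM txn_detail
        GROUP BY period_month
    ) d ON r.period_month = d.period_month
    WHERE r.total_amount <> d.calc_amount;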
Environment: Oracle 10g, Teradata 14, Teradata SQL Assistant, UltraEdit, UNIX, Windows 7, Microsoft Azure, Altova XMLSpy, Tableau, MS Excel.