Sr. Data Architect/Data Modeler Resume
Mechanicsburg, PA
SUMMARY
- Over 9 years of experience working as a Data Architect/Data Modeler with emphasis on data mapping and data validation in data warehousing environments.
- Proficient in analyzing source systems by examining the content, structure and internal relationships of legacy data sources such as flat files, Excel, Oracle, Sybase, SQL Server, XML and DB2 databases.
- Experience with structured and unstructured data analysis and tools (SQL, Hadoop, Spark, NoSQL, MySQL, MariaDB, Hive, Pig, etc.).
- Hands-on experience with various data architecture and ETL architecture subsystems and patterns, including Change Data Capture, Slowly Changing Dimensions, data cleansing, auditing and validation.
- Experienced working in all phases of the SDLC, from analysis, design and coding through unit testing, system testing and user acceptance testing.
- Experienced with data warehouse applications using Oracle and MS SQL Server on Windows and UNIX platforms.
- Experienced in data transformation, data mapping from source to target database schemas and data cleansing procedures using Informatica PowerCenter, Talend and Pentaho.
- Extensive experience in shell scripting and in scripting languages such as Python, Perl and Ruby.
- Strong experience architecting high-performance databases using PostgreSQL, PostGIS, MySQL and Cassandra.
- Experience in developing MapReduce programs using Apache Hadoop to analyze big data as required.
- Experienced in Teradata RDBMS using the FastLoad, FastExport, MultiLoad, TPump, Teradata SQL Assistant and BTEQ utilities.
- Facilitated and participated in Joint Application Development (JAD) sessions and whiteboard sessions to keep executive staff and team members apprised of goals and project status and to resolve issues.
- Extensive experience in development of T-SQL, OLAP, PL/SQL, Stored Procedures, Triggers, Functions, Packages, performance tuning and optimization for business logic implementation.
- Familiarity with Crystal Reports, Cognos and SSRS - Query, Reporting, Analysis and Enterprise Information Management.
- Worked with SAS Procedures such as PROC SQL, PROC REPORT, PROC ACCESS, PROC SORT, PROC TRANSPOSE, PROC MERGE, PROC PRINT, PROC TABULATE etc.
- Proficient in System Analysis, ER/Dimensional Data Modeling, Database design and implementing RDBMS specific features.
- Experienced in Data Scrubbing/Cleansing, Data Quality, Data Mapping, Data Profiling and Data Validation in ETL.
- Well versed in normalization/denormalization techniques for optimum performance in relational and dimensional database environments.
- Experienced in developing Entity-Relationship diagrams and modeling transactional databases and data warehouses using tools such as ERWIN, ER/Studio and PowerDesigner.
- Hands-on experience with modeling using ERWIN in both forward and reverse engineering cases.
- Proficient in Oracle Tools and Utilities such as TOAD, SQL*Plus and SQL Navigator.
- Experience working with large databases to perform complete data retrieval and manipulation from multiple input files using SAS DATA steps.
- Excellent knowledge of creating reports in Pentaho Business Intelligence.
- Excellent understanding of and working experience with industry-standard methodologies such as the System Development Life Cycle (SDLC), the Rational Unified Process (RUP) and Agile methodologies.
- Experienced with both DDL and DML including Joins, Functions, Indexes, Views, Constraints, Primary Keys and Foreign Keys.
- Good understanding and exposure to SQL queries and PL/SQL stored procedures, triggers, functions and packages.
- Experienced in Data transformation and Data mapping from source to target database schemas and also data cleansing.
TECHNICAL SKILLS
Programming Languages: SQL, PL/SQL, UNIX shell Scripting, PERL, AWK, SED
Databases: Oracle 10g/11g/12c, Teradata R12/R13/R14.10, MS SQL Server 2005/2008/2012, DB2, Netezza
Tools: MS-Office suite (Word, Excel, MS Project and Outlook), VSS
Big Data: Hadoop, HDFS 2, Hive, Pig, HBase, Sqoop, Flume.
Testing and defect tracking Tools: HP/Mercury (Quality Center, WinRunner, QuickTest Professional, Performance Center), Requisite Pro, MS Visio, Visual SourceSafe
Operating System: Windows, UNIX, Sun Solaris
ETL/Data warehouse Tools: Informatica 9.6.1/9.5/9.1/8.6, SAP Business Objects XI R3.1/XI R2, Web Intelligence, Talend, Tableau, Pentaho
Data Modeling: Star-Schema Modeling, Snowflake-Schema Modeling, FACT and dimension tables, Pivot Tables, Erwin
Tools & Software: TOAD, MS Office, BTEQ, Teradata SQL Assistant.
PROFESSIONAL EXPERIENCE
Confidential - Mechanicsburg, PA
Sr. Data Architect/Data Modeler
Responsibilities:
- Worked as a Data Architect/Modeler to generate data models using Erwin and developed relational database systems.
- Assisted in designing solutions for the data infrastructure.
- Worked on Software Development Life Cycle (SDLC), testing methodologies, resource management and scheduling of tasks.
- Extensively created data pipelines in cloud using Azure Data Factory.
- Worked with Azure Data Warehouse, Azure Storage Accounts, Azure Data Factory, Azure Data Lake, U-SQL and Blob storage.
- Extracted, transformed and loaded source data to generate CSV data files using Python and SQL queries.
- Analyzed requirements and transformed them into corresponding conceptual, logical and physical data models.
- Developed and supported Oracle, SQL, PL/SQL and T-SQL queries.
- Involved in several facets of MDM implementations including Data Profiling, metadata acquisition and data migration.
- Researched, evaluated, architected and deployed new tools, frameworks and patterns to build sustainable Big Data platforms for our clients.
- Performed development and maintenance work on data models, data dictionaries or other artifacts to maintain currency with corporate standards, integration with Enterprise Data Warehouse architecture and firm goals.
- Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
- Implemented Spark scripts using Scala and Spark SQL to load Hive tables into Spark for faster data processing.
- Developed data mapping, data governance, transformation and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Performed data profiling and business rule investigations on source data to ensure correct design of the Integration and Access layers.
- Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
- Imported data into Cassandra and processed it using PySpark and Scala.
- Consulted on and researched data structure and data content issues and designed effective data solutions.
- Created and implemented physical data solutions, including database objects and SQL.
- Used forward engineering to create a Physical Data Model with DDL that best suits the requirements from the Logical Data Model.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using star and snowflake schemas.
- Used Erwin for reverse engineering to connect to the existing database and ODS, create a graphical representation in the form of Entity Relationships and elicit more information.
- Handled importing of data from various data sources and performed transformations using Hive and MapReduce.
- Worked on cloud computing using MS Azure with various BI technologies and explored NoSQL options for the current back end using Azure Cosmos DB.
- Loaded data into HDFS and extracted the data from SQL into HDFS using Sqoop.
- Created HBase tables to load data coming from NoSQL and a variety of portfolios.
- Developed multiple MapReduce jobs for data cleaning and pre-processing, and analyzed data in Pig.
- Developed and interpreted key performance indicators to measure the effectiveness of data management processes.
- Involved in writing the PL/SQL validation scripts to identify the data inconsistencies in the sources.
- Created rich Tableau dashboards and prepared user stories to deliver actionable insights.
Environment: Erwin 9.8, Oracle 12c, Agile, MDM, Ralph Kimball's, Azure, ODS, Python, SQL, PL/SQL, OLTP, Hadoop 3.0, Hive 2.3, T-SQL, MapReduce, HDFS, HBase 1.2, Pig 0.17, Cassandra, Cosmos, PySpark, Tableau, SQL Server
Confidential, Kansas, KS
Sr. Data Architect/Data Modeler
Responsibilities:
- Developed and refined business and technical requirements to identify and document detailed data requirements, definitions, data lineage and relevant business rules.
- Used existing Deal Model in Python to inherit and create object data structure for regulatory reporting.
- Attended numerous meetings to understand the healthcare domain and the concepts related to the project (healthcare informatics).
- Architected overall master data hub for data elements that are used by multiple IT systems.
- Implemented data warehouse solutions in AWS Redshift; worked on various projects to migrate data from on-premises databases to AWS Redshift, RDS and S3.
- Designed the Big Data platform technology architecture, covering data intake, data staging, data warehousing and a high-performance analytics environment.
- Generated and documented metadata while designing OLTP and OLAP system environments.
- Used the Agile Scrum methodology to build the different phases of Software development life cycle.
- Heavily involved, in the Data Architect role, in reviewing business requirements and composing source-to-target data mapping documents.
- Provided scientific/technical support and training for the ongoing development of a core set, and several disease-specific sets, of Common Data Elements (CDEs) for use in NINDS
- Managed a portfolio of clients in alignment with the Client Delivery Executives (CDEs).
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
- Performed administrative tasks, including creation of database objects such as databases, tables and views, using SQL DCL, DDL and DML requests.
- Involved in several facets of MDM implementations including Data Profiling, Metadata acquisition and data migration.
- Installed and configured other open source software such as Pig, HBase, Flume and Sqoop.
- Worked on Amazon Redshift and AWS, architecting a solution to load data, create data models and run BI on it.
- Used technical expertise to identify and address key issues and determine resolution strategies aligned to the organization's overall mission and strategy.
- Created logical and physical data models using Erwin 9.6 for new requirements and existing databases; maintained database standards, data design, integration and migration; and analyzed data in different systems.
- Used BTEQ scripts to create sample tables and to redefine the partitioning of a populated table.
- Generated data models to make extracts for SAS sourcing data from Enterprise Data Warehouse.
- Worked with DBA to create Best-Fit Physical Data Model from the logical Data Model using Forward Engineering in Erwin.
- Responsible for technical data governance, enterprise wide data modeling and database design.
- Converted the ad hoc/manual reporting system into a high-performance reporting system using SSAS OLAP cubes.
- Worked on designing a Star schema for the detailed data marts and plan data marts involving conformed dimensions.
- Involved in database development by creating Oracle PL/SQL Functions, Procedures and Collections
Environment: Erwin r9.6, Oracle 12c, Teradata 15, MDM, Pig, HBase, Sqoop, ODS, SQL Assistant, MS Visio, Spark, AWS Redshift, Agile, Windows 8, ETL, PL/SQL, Metadata, SSAS, OLAP, OLTP
Confidential, Iowa City, IA
Sr. Data Analyst/Data Modeler
Responsibilities:
- Worked on the integration of existing systems at Data warehouse and Application systems level.
- Responsible for data quality and procedures; highlighted and resolved data discrepancies across platforms.
- Extensively used SQL for data analysis and to understand and document data behavior.
- Reverse engineered existing databases to understand the data flow and business flows of existing systems and to integrate new requirements into the future enhanced, integrated system.
- Designed the procedures for getting the data from all systems to Data Warehousing system.
- Worked with ETL architects and developers to design performance-centric ETL mappings.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using star and snowflake schemas.
- Extensively worked on documentation of Data Model, Mapping, Transformations and Scheduling jobs.
- Worked extensively with Business Objects report developers to create data marts and develop reports catering to existing business needs.
- Designed Mapping Documents and Mapping Templates for Informatica ETL developers.
- Extensively used and created SAS macros for efficiency and accuracy.
- Deployed naming standards to the data models at the enterprise level and followed company standards for project documentation.
- Used the data integration tool Pentaho to design ETL jobs while building data warehouses and data marts.
- Designed ER diagrams and logical models (relationships, cardinality, attributes and candidate keys) and converted them to physical data models, including capacity planning, object creation, aggregation strategies, partition strategies and purging strategies as per the new architecture.
- Designed and developed strategies for Data Conversions and Data Cleansing
- Created Data mappings, Tech Design, loading strategies for ETL to load newly created or existing tables.
- Extensively used Agile methodology as the Organization Standard to implement the data Models.
- Created schema objects such as indexes, views, sequences, triggers, grants, roles and snapshots.
- Developed Star and Snowflake schema-based dimensional models to build the data warehouse.
- Developed statistics and visual analysis for warranty data using MS Excel, MS Access and Tableau Software.
- Developed strategies and loading techniques for better loading and faster query performance.
Environment: Erwin 8.2, Informatica PowerCenter 9.1, Pentaho, IBM Cognos 8.4 ReportNet (Framework Manager, Report Studio, Query Studio), Tableau, SAS, Oracle 11g, DB2, MS SSRS/SSAS, PL/SQL, T-SQL, Flat Files, MS Visio
Confidential, Charlotte, NC
Sr. Data Analyst / Data Modeler
Responsibilities:
- Used and supported database applications and tools for extraction, transformation and analysis of raw data
- Analyzed business requirements, system requirements and data mapping requirement specifications, and documented functional and supplementary requirements in Quality Center.
- Developed, managed and validated existing data models, including logical and physical models of the data warehouse and source systems, utilizing a 3NF model and a dimensional data model.
- Assisted in semantic layer design and development of the semantic layer data model.
- Extensively used Star Schema methodologies in building and designing the logical data model into Dimensional Models
- Created data masking mappings to mask sensitive data between production and test environments.
- Used Teradata SQL Assistant, Teradata Administrator, PMON and data load/export utilities such as BTEQ, FastLoad, MultiLoad, FastExport and TPump in UNIX/Windows environments, and ran the batch process for Teradata.
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Performed data analysis and data profiling using complex SQL on various databases such as Oracle and Teradata.
- Performed various data analysis at the source level and determined the key attributes for designing of Fact and Dimension tables using star schema for an effective Data Warehouse and Data Mart.
- Wrote several shell scripts using UNIX Korn shell for file transfers, error logging, data archiving, log file checks and cleanup processes.
- Worked with end users to gain an understanding of information and core data concepts behind their business.
- Responsible for creating test cases to make sure data originating from the source reaches the target correctly and in the right format.
- Created UNIX scripts for file transfer and file manipulation.
- Worked on data modeling and produced data mapping and data definition specification documentation.
- Used existing UNIX shell scripts and modified them as needed to process SAS jobs, search strings, set execute permissions on directories, etc.
- Excellent understanding of and working experience with industry-standard methodologies such as the System Development Life Cycle (SDLC), the Rational Unified Process (RUP) and Agile methodologies.
- Performed extensive data quality management and data profiling using Information Steward.
- Worked extensively on Talend Designer components: Data Quality (DQ), Data Integration (DI) and Master Data Management (MDM).
- Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
- Identified source systems, their connectivity, and related tables and fields, and ensured data suitability for mapping.
Environment: Erwin, SQL Server 2008, Oracle 11g, 3NF, Rational Rose, Informatica 9.6.1, DataFlux, WSDL, Data mining, Pentaho, Quality Center 8.2, SQL, Talend, TOAD, PL/SQL, Flat Files, Trillium, SAS, SOAP, Teradata R14
Confidential, SFO, CA
Sr. Data Analyst/Data Modeler
Responsibilities:
- Involved in extensive Data validation using SQL queries and back-end testing
- Used Erwin to create conceptual, logical and physical data models.
- Involved in analysis of a variety of source system data, coordination with subject matter experts, development of standardized business names and definitions, construction of a non-relational data model using Erwin modeling tool, publishing of a data dictionary, review of the model and dictionary with subject matter experts and generation of data definition language.
- Involved in complete SDLC processes, including requirements management, workflow analysis, source data analysis, data mapping, metadata management, data quality, testing strategy and maintenance of the model.
- Created ETL framework and provided strategy for data cleansing, data quality and data consolidation.
- Worked extensively in Teradata data modeling for both logical and physical data models.
- Developed ETL process using Pentaho PDI to extract the data from legacy System.
- Designed various Informatica ETL load patterns, including SCD Type I, SCD Type II, full load, etc.
- Imported and cleansed high-volume data from various sources such as Teradata, Oracle, flat files and SQL Server.
- Performed data management projects and fulfilled ad hoc requests according to user specifications using data management tools such as Perl, Toad, MS Access, Excel and SQL.
- Worked on naming standards for table, column, index and constraint names through Erwin macros and the master abbreviations file.
- Wrote SQL scripts to test the mappings and developed a Traceability Matrix of business requirements mapped to test scripts to ensure that any change control in requirements leads to a test case update.
- Tested Business Intelligence reports generated by various BI tools such as Tableau, MicroStrategy and Business Objects.
- Worked on physical design for both SMP and MPP RDBMS, with an understanding of RDBMS scaling features.
Environment: Quality Center 9.2, MS Excel 2007, PL/SQL, Java, Business Objects XIR2, Informatica 8.6.1, Teradata R13, Pentaho, CA ERWIN 7.3, Subversion, MS Office, Tableau, Oracle 10g, SQL Server 2005, SQL*Plus
Confidential, Plymouth, PA
Sr. Data Analyst
Responsibilities:
- Worked with data source systems and client systems to identify data issues and data gaps, and recommended solutions.
- Performed data analysis as required to create ad-hoc reports.
- Created, defined, and maintained data life-cycle documentation representing data elements across multiple upstream and downstream systems.
- Proactively communicated and collaborated with external and internal customers to analyze information needs.
- Acted as liaison between the business units, technology teams and support teams.
- Responsible for researching data quality issues (inaccuracies in data); worked with business owners/stakeholders to assess business and risk impact and provided solutions to business owners.
- Designed Star schema for Risk Retail reporting for credit card portfolio subject area.
- Analyzed functional and non-functional categorized data elements for data profiling and mapping from source to target data environment.
- Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
- Extracted data from various sources like Oracle, Mainframes, flat files and loaded into the target Netezza database using NzSQL and NzLoad
- Created and executed automated functional and regression tests using QuickTest Professional with VBScript.
- Executed SQL queries to validate data and used TOAD for SQL Server to write SQL queries validating constraints and indexes.
- Performed data analysis using SQL, PL/SQL and other query-based applications.
- Extensively used ETL methodology to support data extraction, transformation and loading in a complex EDW using Informatica.
- Performed slicing and dicing on data using pivot tables to identify churn rate patterns and prepared reports as required.
Environment: PL/SQL, Informatica 8.6, Oracle 10g, Netezza, Aginity, Excel, Business Objects, Access, ERWIN Data Modeler, SAS, Agile, UNIX, OBIEE, Tableau, XMLSpy, SoapUI, DTS, MS Office
Confidential
Data Analyst
Responsibilities:
- Developed logical data model from the conceptual model using ERWIN.
- Developed physical database model using ERWIN.
- Used ERWIN's Forward Engineering tool to create tables and columns.
- Generated DDL statements for the newly generated objects (tables, views, indexes and packages)
- Developed star schemas for dimensional modeling using ERWIN.
- Used ETL methodology for data extraction, transformations and loading in a complex Enterprise Data Warehouse (EDW)
- Identified and documented data sources and transformation rules required to populate and maintain data warehouse content.
- Involved in extracting, cleansing, transforming, integrating, and loading data into different Data Marts using Data Stage Designer.
- Used Erwin's reverse engineering to connect to the existing database and ODS and graphically represent the Entity Relationships.
- Used reverse engineering to bring existing database structures into ERWIN, then removed unwanted tables/columns and redefined attributes and relationships.
- Involved in performance tuning by leveraging the Oracle EXPLAIN PLAN utility and SQL tuning.
Environment: ERWIN 7.2, Oracle 9i, SQL Server, MS Excel, MS Visio, Rational Rose, Requisite Pro
