Sr. Data Architect Resume
Mechanicsburg, PA
SUMMARY
- Over 10 years of experience as a Data Architect/Data Modeler, with emphasis on data mapping and data validation in data warehousing environments.
- Proficient in analyzing source systems by examining the content, structure, and internal relationships of legacy data sources such as flat files, Excel, Oracle, Sybase, SQL Server, XML, and DB2 databases.
- Experience with structured and unstructured data analysis and tools (SQL, Hadoop, Spark, NoSQL, MySQL, MariaDB, Hive, Pig, etc.).
- Hands-on experience with various data architecture and ETL subsystems and patterns, including Change Data Capture, Slowly Changing Dimensions, data cleansing, auditing, and validation.
- Experienced in all phases of the SDLC: analysis, design, coding, unit testing, system testing, and user acceptance testing.
- Experienced with data warehouse applications using Oracle and MS SQL Server on Windows and UNIX platforms.
- Experienced in data transformation, data mapping from source to target database schemas, and data cleansing procedures using Informatica PowerCenter, Talend, and Pentaho.
- Extensive experience in shell scripting and scripting languages such as Python, Perl, and Ruby.
- Strong experience architecting high-performance databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
- Experience developing MapReduce programs using Apache Hadoop for analyzing Big Data as per requirements.
- Experienced in Teradata RDBMS using the FastLoad, FastExport, MultiLoad, TPump, Teradata SQL Assistant, and BTEQ utilities.
- Facilitated and participated in Joint Application Development (JAD) and whiteboard sessions to keep executive staff and team members apprised of goals, project status, and issue resolution.
- Extensive experience developing T-SQL and PL/SQL stored procedures, triggers, functions, and packages, OLAP, and performance tuning and optimization for business-logic implementation.
- Experience in metadata design, real-time BI Architecture including Data Governance for greater ROI.
- Familiarity with Crystal Reports, Cognos and SSRS - Query, Reporting, Analysis and Enterprise Information Management.
- Worked with SAS Procedures such as PROC SQL, PROC REPORT, PROC ACCESS, PROC SORT, PROC TRANSPOSE, PROC MERGE, PROC PRINT, PROC TABULATE etc.
- Proficient in System Analysis, ER/Dimensional Data Modeling, Database design and implementing RDBMS specific features.
- Experienced in data scrubbing/cleansing, data quality, data mapping, data profiling, and data validation in ETL.
- Well versed in normalization/denormalization techniques for optimum performance in relational and dimensional database environments.
- Experienced in developing Entity-Relationship diagrams and modeling Transactional Databases and Data Warehouse using tools like ERWIN, ER/Studio and Power Designer.
- Hands on experience with modeling using ERWIN in both forward and reverse engineering cases.
- Proficient in Oracle Tools and Utilities such as TOAD, SQL*Plus and SQL Navigator.
- Experience working with large databases to perform complete data retrieval and manipulation across multiple input files using SAS data steps.
- Excellent knowledge of creating reports in Pentaho Business Intelligence.
- Excellent understanding and working experience of industry-standard methodologies such as the System Development Life Cycle (SDLC), Rational Unified Process (RUP), and Agile.
- Good understanding of AWS, big data concepts and Hadoop ecosystem.
- Experienced with both DDL and DML including Joins, Functions, Indexes, Views, Constraints, Primary Keys and Foreign Keys.
- Good understanding and exposure to SQL queries and PL/SQL stored procedures, triggers, functions and packages.
- Experienced in data transformation, data mapping from source to target database schemas, and data cleansing.
TECHNICAL SKILLS
Programming Languages: SQL, PL/SQL, UNIX Shell Scripting, Perl, AWK, sed
Databases: Oracle 10g/11g/12c, Teradata R12/R13/R14.10, MS SQL Server 2005/2008/2012, DB2, Netezza
Tools: MS-Office suite (Word, Excel, MS Project and Outlook), VSS
Big Data: Hadoop, HDFS 2, Hive, Pig, HBase, Sqoop, Flume.
Testing and Defect Tracking Tools: HP/Mercury (Quality Center, WinRunner, QuickTest Professional, Performance Center, Requisite), MS Visio, Visual SourceSafe
Operating System: Windows, UNIX, Sun Solaris
ETL/Data Warehouse Tools: Informatica 8.6/9.1/9.5/9.6.1, SAP BusinessObjects XI R3.1/XI R2, Web Intelligence, Talend, Tableau, Pentaho
Data Modeling: Star-Schema Modeling, Snowflake-Schema Modeling, FACT and dimension tables, Pivot Tables, Erwin
Tools & Software: TOAD, MS Office, BTEQ, Teradata SQL Assistant.
PROFESSIONAL EXPERIENCE
Confidential - Mechanicsburg, PA
Sr. Data Architect
Responsibilities:
- Worked as a Data Architect/Modeler to generate data models using Erwin and develop relational database systems.
- Assisted in designing solutions for the Data Infrastructure.
- Worked on Software Development Life Cycle (SDLC), testing methodologies, resource management and scheduling of tasks.
- Extensively created data pipelines in cloud using Azure Data Factory.
- Worked with Azure Data Warehouse, Azure Storage accounts, Azure Data Factory, Azure Data Lake, U-SQL, and Blob storage.
- Extracted, transformed, and loaded data from source systems to generate CSV data files using Python and SQL queries.
- Analyzed requirements and transformed them into corresponding conceptual, logical, and physical data models.
- Developed and supported Oracle SQL, PL/SQL, and T-SQL queries.
- Involved in several facets of MDM implementations including Data Profiling, metadata acquisition and data migration.
- Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for our clients.
- Performed development and maintenance work on data models, data dictionaries or other artifacts to maintain currency with corporate standards, integration with Enterprise Data Warehouse architecture and firm goals.
- Designed and developed data architecture solutions in big data architecture or data analytics.
- Loaded and transformed large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
- Involved in developing Database Design Documents, including conceptual, logical, and physical data models, using Erwin 9.64.
- Worked with structured/semi-structured data ingestion and processing on AWS using S3 and Python, and migrated on-premises big data workloads to AWS.
- Implemented Spark scripts using Scala and Spark SQL to load Hive tables into Spark for faster data processing.
- Evaluated architecture patterns and defined best patterns for data usage, data security, and data compliance.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Performed data profiling and business-rule investigations on source data to ensure correct design of the integration and access layers.
- Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
- Added support for Amazon AWS S3 and RDS to host static/media files and the database into Amazon Cloud.
- Imported data into Cassandra using PySpark and Scala to process the data.
- Consulted on and researched data structure and data content issues and designed effective data solutions.
- Created and implemented physical data solutions, including database objects and SQL.
- Used forward engineering to create a Physical Data Model with DDL that best suits the requirements from the Logical Data Model.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using star and snowflake schemas.
- Used Erwin for reverse engineering to connect to existing databases and the ODS, creating graphical entity-relationship representations and eliciting more information.
- Handled importing of data from various data sources and performed transformations using Hive and MapReduce.
- Architected and designed the data flow for consolidating four legacy data warehouses into an AWS data lake.
- Worked on cloud computing using MS Azure with various BI technologies and explored NoSQL options for the current backend using Azure Cosmos DB.
- Loaded data into HDFS and extracted the data from SQL into HDFS using Sqoop.
- Created HBase tables to load data coming from NoSQL sources and a variety of portfolios.
- Developed multiple MapReduce jobs for data cleaning and pre-processing, and analyzed data in Pig.
- Developed and interpreted key performance indicators to measure the effectiveness of data management processes.
- Involved in writing the PL/SQL validation scripts to identify the data inconsistencies in the sources.
- Created rich dashboards in Tableau and prepared user stories to build compelling dashboards that deliver actionable insights.
Environment: Erwin 9.8, Oracle 12c, Agile, MDM, Kimball methodology, Azure, ODS, Python, SQL, PL/SQL, OLTP, Hadoop 3.0, Hive 2.3, T-SQL, MapReduce, HDFS, HBase 1.2, Pig 0.17, Cassandra, Cosmos DB, PySpark, Tableau, SQL Server
Confidential
Sr. Data Architect
Responsibilities:
- Developed and refined business and technical requirements to identify and document detailed data requirements, definitions, data lineage, and relevant business rules.
- Used existing Deal Model in Python to inherit and create object data structure for regulatory reporting.
- Attended numerous meetings to understand the healthcare domain and the concepts related to the project (Healthcare Informatics).
- Architected overall master data hub for data elements that are used by multiple IT systems.
- Implemented data warehouse solutions in AWS Redshift; worked on various projects to migrate data from on-premises databases to AWS Redshift, RDS, and S3.
- Designed the Big Data platform technology architecture, with a scope covering data intake, data staging, data warehousing, and a high-performance analytics environment.
- Generated and documented metadata while designing OLTP and OLAP system environments.
- Used the Agile Scrum methodology to build the different phases of Software development life cycle.
- Heavily involved in the Data Architect role, reviewing business requirements and composing source-to-target data mapping documents.
- Provided scientific/technical support and training for the ongoing development of a core set, and several disease-specific sets, of Common Data Elements (CDEs) for use in NINDS.
- Managed a portfolio of clients in alignment with the Client Delivery Executives (CDEs).
- Designed both 3NF data models for ODS, OLTP systems and dimensional data models using Star and Snowflake Schemas.
- Performed administrative tasks, including creation of database objects such as databases, tables, and views, using SQL DCL, DDL, and DML requests.
- Involved in several facets of MDM implementations including Data Profiling, Metadata acquisition and data migration.
- Installed and configured open-source software including Pig, HBase, Flume, and Sqoop.
- Developed and configured the Informatica MDM Hub to support the Master Data Management (MDM), Business Intelligence (BI), and data warehousing platforms to meet business needs.
- Worked on Amazon Redshift and AWS, architecting a solution to load data, create data models, and run BI on it.
- Used technical expertise to identify and address key issues and determine resolution strategies aligned with the organization's overall mission and strategy.
- Created logical and physical data models using Erwin 9.6 for new requirements and existing databases; maintained database standards, data design, integration, and migration, and analyzed data across systems.
- Used BTEQ scripts to create sample tables and redefine the partitioning of populated tables.
- Generated data models to make extracts for SAS, sourcing data from the Enterprise Data Warehouse.
- Worked with DBA to create Best-Fit Physical Data Model from the logical Data Model using Forward Engineering in Erwin.
- Responsible for technical data governance, enterprise-wide data modeling, and database design.
- Converted an ad-hoc/manual reporting system into a high-performance reporting system using SSAS OLAP cubes.
- Worked on designing a star schema for the detailed data marts and plan data marts involving conformed dimensions.
- Involved in database development by creating Oracle PL/SQL functions, procedures, and collections.
Environment: Erwin r9.6, Oracle 12c, Teradata 15, MDM, Pig, HBase, Sqoop, ODS, SQL Assistant, MS Visio, Spark, AWS Redshift, Agile, Windows 8, ETL, PL/SQL, Metadata, SSAS, OLAP, OLTP