Sr. Data Architect/Data Modeler Resume
Dallas, TX
SUMMARY:
- 9+ years of strong IT experience in Data Architecture, Data Modeling, and Big Data Reporting Design and Development.
- Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export
- Experienced in analyzing data using Hadoop Ecosystem including HDFS, Hive, Spark, Spark Streaming, Elastic Search, Kibana, Kafka, HBase, Zookeeper, PIG, Sqoop, and Flume.
- Strong experience using Excel and MS Access to load and analyze data based on business needs, and in designing Data Marts with dimensional data modeling using star and snowflake schemas.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Management Systems (RDBMS) and from RDBMS to HDFS.
- Well versed in Normalization/De-normalization techniques for optimum performance in relational and dimensional database environments.
- Experienced in working with Business Intelligence and Enterprise Data Warehouse (EDW) including SSAS, Pentaho, Cognos, OBIEE, QlikView, Greenplum, Amazon Redshift and Azure Data Warehouse.
- Experienced in various Teradata utilities such as FastLoad, MultiLoad, BTEQ, and Teradata SQL Assistant.
- Very good experience with and knowledge of Amazon Web Services: AWS Redshift, AWS S3, and AWS EMR.
- Experienced in enhancing complex enterprise data models, accurately representing logical and database nuances in physical data models, and in designing data models (ER and dimensional) for database platforms such as Oracle, SQL Server, and DB2.
- Experienced in designing Star Schema, Snowflake schema for Data Warehouse, by using tools like Erwin data modeler, Power Designer and Embarcadero E-R Studio.
- Experienced in integration of various relational and non-relational sources such as DB2, Teradata, Oracle, Netezza, SQL Server, NoSQL, COBOL, XML, and flat files into a Netezza database.
- Experience working with data modeling tools like Erwin, Power Designer and ER Studio.
- Experience in developing MapReduce programs using Apache Hadoop for analyzing big data as per requirements.
- Experienced in Netezza tools and utilities such as nzload, nzsql, NZPLSQL, SQL toolkits, and analytical functions.
- Experienced in data analysis using Hive, Pig Latin, and Impala, with a good understanding of and hands-on experience in setting up and maintaining NoSQL databases such as Cassandra, MongoDB, and HBase.
- Expertise in relational data modeling (3NF) and dimensional data modeling, including designing star and snowflake schemas for Data Warehouse and ODS architectures using tools such as Erwin Data Modeler, Power Designer, and ER/Studio (a minimal star-schema sketch appears after this list).
- Experience in setting up connections to different RDBMS databases such as Oracle, SQL Server, DB2, and Teradata according to user requirements.
- Good knowledge of Data Marts, Operational Data Store (ODS), Dimensional Data Modeling with Ralph Kimball Methodology using Analysis Services.
- Strong experience architecting high-performance databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
- Experience with SQL Server and T-SQL in constructing temporary tables, table variables, triggers, user-defined functions, views, and stored procedures.
- Good Knowledge in Amazon Elastic Compute Cloud (Amazon EC2 & S3) and MS Azure.
- Excellent understanding of and working experience with industry-standard methodologies such as the System Development Life Cycle (SDLC), Rational Unified Process (RUP), and Agile.
- Hands on experience on tools like R, SQL, SAS and Tableau.
- Good knowledge of developing Informatica Mappings, Mapplets, Sessions, Workflows, and Worklets for data loads from various sources such as Oracle, flat files, DB2, and SQL Server.
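For illustration only: a minimal star-schema DDL sketch of the kind of dimensional design referenced in the summary above. The table and column names are hypothetical and not drawn from any specific engagement.

    -- Hypothetical star schema: one fact table keyed to two conformed dimensions.
    CREATE TABLE dim_customer (
        customer_key   INTEGER      NOT NULL PRIMARY KEY,
        customer_name  VARCHAR(100),
        customer_state CHAR(2)
    );

    CREATE TABLE dim_date (
        date_key       INTEGER      NOT NULL PRIMARY KEY,
        calendar_date  DATE,
        fiscal_quarter CHAR(6)
    );

    CREATE TABLE fact_sales (
        date_key     INTEGER NOT NULL REFERENCES dim_date (date_key),
        customer_key INTEGER NOT NULL REFERENCES dim_customer (customer_key),
        sales_amount DECIMAL(12,2),
        units_sold   INTEGER
    );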
TECHNICAL SKILLS:
Data Modeling Tools: Erwin 9.7/9.6, Sybase Power Designer, Oracle Designer, ER/Studio V17
Big Data tools: Hadoop 3.0, HDFS, Hive 2.3, Pig 0.17, HBase 1.2, Sqoop 1.4, Flume 1.8, AWS, EC2, S3
Database Tools: Oracle 12c/11g, Teradata 15/14, Netezza, Microsoft SQL Server and MS Access, PostgreSQL.
Reporting tools: SQL Server Reporting Services (SSRS), Tableau, Crystal Reports, Business Objects 5.1, MicroStrategy, Cognos
ETL Tools: SSIS, Pentaho 6.1, Informatica v10.
Programming Languages: Java, Base SAS, SQL, T-SQL, HTML5, JavaScript, CSS, UNIX shell scripting, and PL/SQL.
Operating Systems: Microsoft Windows 10/8/7, UNIX, and Linux
Tools & Software: TOAD 6.2, BTEQ, Teradata SQL Assistant, MS Office 2016 (Word, Excel, MS Project, and Outlook)
Web technologies: HTML 5, DHTML, XML
Project Execution Methodologies: Ralph Kimball and Bill Inmon data warehousing methodology, RUP, JAD, Agile, Waterfall, and RAD
PROFESSIONAL EXPERIENCE:
Sr. Data Architect/Data Modeler
Confidential, Dallas TX
Responsibilities:
- Responsible for technical data governance, enterprise-wide data modeling, and database design; developed multidimensional data models to support BI solutions as well as common industry data from external systems.
- Working with business partners and team members, gathered and analyzed requirements and translated them into database designs supporting transactional system data integration, reports, spreadsheets, and dashboards.
- Involved in planning, defining, and designing the database using Erwin based on business requirements, and provided documentation.
- Responsible for Big data initiatives and engagement including analysis, brainstorming, POC, and architecture.
- Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL access to Hadoop data (a minimal HiveQL sketch appears after this list).
- Created Complex SSAS cubes with multiple Fact and Measure groups, and multiple dimension hierarchies based on the OLAP reporting needs.
- Applied advanced information management and new data processing techniques in Hadoop (HDFS) to extract the value locked up in the data, processing large data sets in parallel across a Hadoop cluster using the Hadoop MapReduce framework.
- Created an SSAS tabular semantic model in DirectQuery mode with multiple partitions, KPIs, hierarchies, and calculated measures using DAX as per business requirements.
- Designed facts and dimension tables and defined relationship between facts and dimensions with Star Schema and Snowflake Schema in SSAS.
- Worked with project management, business teams and departments to assess and refine requirements to design/develop BI solutions using MS Azure.
- Researched and developed hosting solutions using Azure and other third-party hosting and software-as-a-service solutions.
- Used SQL on new AWS databases such as Redshift and Relational Database Service (RDS), and worked with various RDBMS including Oracle 11g, SQL Server, DB2 UDB, Teradata 14.1, and Netezza.
- Created SQL tables with referential integrity and developed SQL queries using SQL Server and Toad
- Worked with Azure Machine Learning, Azure Event Hubs, Azure Stream Analytics, and PivotTables on multi-table data sets of up to 140 million records in SQL (MS SQL Server, SAS PROC SQL, etc.).
- Created tabular data models and implemented Power BI for a POC in a SharePoint environment.
- Partnered directly with the Data Architect, clients, ETL developers, other technical data warehouse team members, and database administrators to design and develop high-performing databases and maintain consistent data element definitions.
- Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
- Created logical and physical data models using Erwin and reviewed these models with business team and data architecture team.
- Transformed Logical Data Model to Physical Data Model ensuring the Primary Key and Foreign key relationships in PDM, Consistency of definitions of Data Attributes and Primary Index considerations.
- Created SQL scripts to find data quality issues and to identify keys, data anomalies, and data validation issues.
- Responsible for full data loads from production to AWS Redshift staging environment and responsible for creating Hive tables, loading data and writing hive queries.
- Designed different types of star schemas for detailed data marts and plan data marts in the OLAP environment.
- Produced and enforced data standards and maintained a repository of data architecture artifacts and procedures.
- Provided architectures, patterns, tooling choices, and standards for master data and hierarchy life-cycle management.
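For illustration only: a minimal HiveQL sketch of the HDFS-to-Hive pattern referenced above. The HDFS path, table, and columns are hypothetical.

    -- HiveQL: expose raw HDFS files through an external table, then query with SQL.
    CREATE EXTERNAL TABLE IF NOT EXISTS stg_orders (
        order_id    STRING,
        customer_id STRING,
        order_total DOUBLE
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/landing/orders/';

    -- Aggregate the Hadoop data directly once the table is declared.
    SELECT customer_id, SUM(order_total) AS total_spend
    FROM stg_orders
    GROUP BY customer_id;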
Environment: Erwin 9.6, Informatica v10, Power Pivot, SQL, Microsoft Azure, MS Excel, MS Visio, Rational Rose, SSAS, Pig, Hive, CSV files, Hadoop, MongoDB, HBase, Sqoop, AWS S3, AWS EMR, AWS Redshift, Python, XML files, Linux, AWK, Aginity, Teradata SQL Assistant, Oracle 12c.
Sr. Data Modeler/Data Architect
Confidential, Mentor OH
Responsibilities:
- Responsible for data architecture design delivery, data model development, content creation, review, and approval; used Agile methodology for Data Warehouse development.
- Worked with the Big Data Hadoop ecosystem on ingestion, storage, querying, processing, and analysis of big data alongside conventional RDBMS.
- Developed and automated multiple departmental Reports using Tableau and MS Excel.
- Responsible for all metadata relating to the EDW's overall data architecture, descriptions of data objects, access methods and security requirements
- Involved in relational and dimensional data modeling to create logical and physical designs of the database and ER diagrams using data modeling tools like Erwin.
- Designed the data marts using Ralph Kimball's dimensional data mart modeling methodology in Erwin.
- Designed both OLTP and ODS databases for high performance using the Erwin modeling tool and worked on normalization and de-normalization techniques for both OLTP and OLAP systems.
- Involved in OLAP modeling based on dimensions and facts for efficient data loads, using multi-dimensional models such as star and snowflake schemas across levels of reports.
- Used Informatica to extract data from Workday into staging and then load the data into the EDW data warehouse and data marts; the EDW was built on a Netezza database.
- Established uniform Master Data Dictionary and Mapping rules for metadata, data mapping and lineage.
- Worked on SAS for data analysis and was involved in importing and cleansing high-volume data from various sources such as Teradata, Oracle, flat files, Netezza, and SQL Server.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Worked on delivery of Data & Analytics applications involving structured and unstructured data on Hadoop-based platforms on AWS EMR.
- Designed and implemented Oracle PL/SQL stored procedures, functions, and packages for data manipulation and validation (a minimal PL/SQL sketch appears after this list).
- Involved in all the steps and scope of the project reference data approach to MDM and Created Data Dictionary and Data Mapping from Sources to the Target in MDM Data Model.
- Developed Data Mapping, Data Governance, and Transformation and cleansing rules for the Master Data Management Architecture involving OLTP, ODS.
- Worked on building an Aptitude Operational Data Store (ODS) model in an Oracle Exadata database.
- Set up environments to be used for testing and defined the range of functionality to be tested as per technical specifications.
- Reviewed Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Created complex SQL queries using views, indexes, triggers, roles, stored procedures, and user-defined functions, and worked with different methods of logging in SSIS.
- Automated SSIS packages for production deployment using XML configurations.
- Developed historical and incremental SSIS packages using the SCD Type 2 concept for the star schema.
- Populated and refreshed Teradata tables using FastLoad, MultiLoad, and FastExport utilities for user acceptance testing and loaded history data into Teradata.
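For illustration only: a minimal Oracle PL/SQL sketch of the kind of validation procedure referenced above. The staging and reject tables are hypothetical.

    -- Hypothetical validation procedure: route failing rows to a reject table before the EDW load.
    CREATE OR REPLACE PROCEDURE validate_stg_orders AS
        v_bad_rows NUMBER;
    BEGIN
        SELECT COUNT(*) INTO v_bad_rows
        FROM stg_orders
        WHERE order_total < 0 OR customer_id IS NULL;

        IF v_bad_rows > 0 THEN
            INSERT INTO stg_orders_reject
            SELECT * FROM stg_orders
            WHERE order_total < 0 OR customer_id IS NULL;
            COMMIT;
        END IF;
    END validate_stg_orders;
    /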
Environment: Erwin 9.5, Oracle PL/SQL, SSIS, ODS, OLTP, Hadoop 3.2, HDFS, Oracle 12c, Sqoop 1.4, AWS, Agile, ETL & MDM, Oracle, MongoDB, HBase, SQL, Hive, Pig, MapReduce, AWS S3, AWS EMR, AWS Redshift, Sqoop, Kafka, Python, Spark and Teradata.
Sr. Data Modeler/Data Analyst
Confidential - Bethesda, MD
Responsibilities:
- Collaboratively worked with the Data modeling architects and other data modelers in the team to design the Enterprise Level Standard Data model.
- Built analytical data pipelines to port data in and out of Hadoop/HDFS from structured and unstructured sources, and designed and implemented the system architecture for an Amazon EC2-based cloud-hosted solution for the client.
- Worked with Business Analysts team in requirements gathering and in preparing functional specifications and translating them to technical specifications.
- Extensively involved in developing logical and physical data models that fit the current-state and future-state data elements and data flows, using ER/Studio.
- Documented business names and descriptions for tables and columns, relationships among datasets, data types, business rules, and domains using ER/Studio.
- Using Kettle scripting in Pentaho, processed data from one table, replaced and filtered values, and synced two data sources.
- Utilized existing Informatica, Teradata, and SQL Server to deliver work and fix production issues on time in a fast-paced environment.
- Responsible for full data loads from production to the AWS Redshift staging environment, and worked on migrating the EDW to AWS using EMR and various other technologies.
- Defined the key columns for the Dimension and Fact tables of both the Warehouse and Data Mart.
- Interacted with business users to gather data design requirements and take feedback on improvements; conducted and participated in JAD sessions with project managers and the business analysis team to analyze and document business and reporting requirements.
- Conducted design discussions and meetings to arrive at the appropriate data mart at the lowest level of grain for each of the dimensions involved.
- Created data quality scripts using SQL and Hive to validate successful data loads and the quality of the data (a minimal validation-query sketch appears after this list), and created various types of data visualizations using Python and Tableau.
- Designed a star schema for the detailed data marts and plan data marts involving shared dimensions.
- Extensively used Star Schema methodologies in building and designing the logical data model into Dimensional Models.
- Created and maintained Logical Data Model (LDM) for the project. Includes documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
- Designed normalized data up to third normal form and participated in brainstorming sessions with application developers and DBAs to discuss de-normalization, partitioning, and indexing schemes for the physical model.
- Worked with data owners/stewards to ensure awareness of data quality standards and monitoring requirements.
- Created Hive architecture used for real-time monitoring and HBase used for reporting, and worked on MapReduce and query optimization for the Hadoop Hive and HBase architecture.
- Validated and updated the appropriate LDMs to reflect process mappings, screen designs, use cases, the business object model, and the system object model as they evolved and changed.
- Conducted Design reviews with the business analysts and content developers to create a proof of concept for the reports and ensured the feasibility of the logical and physical design models.
- Collaborated with the Reporting Team to design Monthly Summary Level Cubes to support the further aggregated level of detailed reports.
- Designed Data Flow Diagrams, E/R Diagrams and enforced all referential integrity constraints.
- Worked on designing Conceptual, Logical and Physical data models and performed data design reviews with the Project team members.
- Worked on SQL Server concepts SSIS (SQL Server Integration Services), SSAS (Analysis Services) and SSRS (Reporting Services).
- Worked with Teradata utilities (BTEQ, FastLoad, FastExport, MultiLoad, and TPump) on both Windows and mainframe platforms.
- Used SQL to run ad-hoc queries and prepare reports for management, and implemented the model and communicated it to end users and stakeholders.
- Created stored procedures using PL/SQL and tuned the databases and backend processes.
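For illustration only: a minimal data-quality validation sketch of the kind referenced above, written in standard SQL; a Hive variant would look similar. Table and column names are hypothetical.

    -- Hypothetical checks: source rows missing from the fact table, and orphaned dimension keys.
    SELECT 'missing_in_fact' AS check_name, COUNT(*) AS issue_count
    FROM src_orders s
    WHERE NOT EXISTS (SELECT 1 FROM fact_orders f WHERE f.order_id = s.order_id)
    UNION ALL
    SELECT 'orphan_customer_key' AS check_name, COUNT(*) AS issue_count
    FROM fact_orders f
    LEFT JOIN dim_customer c ON c.customer_key = f.customer_key
    WHERE c.customer_key IS NULL;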
Environment: ER/Studio, OLTP, SQL, SSIS, SSAS, SSRS, PL/SQL and 3NF, Hadoop, Hive, Pig, MapReduce, MongoDB, HBase, AWS S3, AWS Redshift, Python, Big Data, Spark, XML, Tableau, Teradata, Netezza and Teradata SQL Assistant.
Data Analyst/Data Modeler
Confidential - Minneapolis, MN
Responsibilities:
- Identified and compiled common business terms for the new policy-generating system and worked on the Contract subject area.
- Maintained the stage and production conceptual, logical, and physical data models along with related documentation for a large data warehouse project.
- Assisted in migration of data models from Oracle Designer to Erwin and updating the data models to correspond to the existing database structures.
- Performed data analysis using SQL queries on source systems to identify data discrepancies and determine data quality.
- Served as a resource for analytical services utilizing SQL Server and TOAD/Oracle; created SQL queries using TOAD and SQL Navigator and created various database objects such as stored procedures, tables, and views.
- Updated or Created new data models for each release and generated database scripts from Erwin.
- Used Erwin to create report templates. Maintained and changed the report templates as needed to generate varying data dictionary formats as contract deliverables.
- Worked with the DBAs on maintaining existing and creating new database tools and technologies, and communicated to company leadership the importance and business value of data descriptions for entities, attributes, domain values, and relationships, using data modeling tools to create entity-relationship diagrams.
- Created DataStage jobs (ETL processes) to continually populate the data warehouse from different source systems.
- Analyzed and designed the business rules for data cleansing that are required by the staging and OLAP & OLTP database.
- Updated or worked with test team members to help them understand data changes and write test cases.
- Identified and documented data sources and transformation rules required to populate and maintain data warehouse content.
- Responsible for indexing the tables in the data warehouse and performed data modeling within information areas across the enterprise including data cleansing and data quality using data modeling methods and processes.
- Involved in logical and Physical Database design & development, Normalization and Data modeling using Erwin and SQL Server Enterprise manager.
- Created a high-level industry standard, generalized data model to convert it into Logical and Physical model at later stages of the project using Erwin and Visio.
- Generated XML from Erwin to be loaded into the MDR (metadata repository).
- Conducted design walk-through sessions with the Business Intelligence team to ensure that reporting requirements are met for the business.
- Wrote SQL scripts for creating tables, sequences, triggers, views, and materialized views (a minimal DDL sketch appears after this list).
- Designed Data Flow Diagrams, E/R Diagrams and enforced all referential integrity constraints.
- Developed and maintained data models, data dictionaries, data maps, and other artifacts across the organization, including the conceptual and physical models as well as the metadata repository.
- Performed extensive Data Validation, Data Verification against Data Warehouse and performed debugging of the SQL-Statements and stored procedures for business scenarios.
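For illustration only: a minimal Oracle-style DDL sketch of the tables, sequences, triggers, views, and materialized views referenced above. Object names are hypothetical.

    -- Hypothetical base table with a surrogate key populated from a sequence via a trigger.
    CREATE TABLE policy (
        policy_id   NUMBER       PRIMARY KEY,
        policy_type VARCHAR2(30),
        created_dt  DATE         DEFAULT SYSDATE
    );

    CREATE SEQUENCE policy_seq START WITH 1 INCREMENT BY 1;

    CREATE OR REPLACE TRIGGER policy_bi
    BEFORE INSERT ON policy
    FOR EACH ROW
    BEGIN
        SELECT policy_seq.NEXTVAL INTO :NEW.policy_id FROM dual;
    END;
    /

    -- A plain view for recent rows and a materialized view for pre-aggregated counts.
    CREATE OR REPLACE VIEW v_recent_policy AS
        SELECT policy_id, policy_type FROM policy WHERE created_dt > SYSDATE - 30;

    CREATE MATERIALIZED VIEW mv_policy_counts
        BUILD IMMEDIATE REFRESH COMPLETE ON DEMAND AS
        SELECT policy_type, COUNT(*) AS policy_count FROM policy GROUP BY policy_type;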
Environment: Erwin 9.0, SQL Server 2008, MS Visio, Oracle 10g, UNIX, Linux, TOAD, SQL, PL/SQL, Teradata, Netezza, XML, Informatica, Tableau, MDM, Power BI.
Data Analyst
Confidential
Responsibilities:
- Developed Data Mapping, Data Governance and transformation and cleansing rules for the Master Data Management Architecture involving OLTP, ODS.
- Created new conceptual, logical, and physical data models using Erwin and reviewed these models with the application team and modeling team.
- Performed numerous data pulling requests using SQL for analysis and created databases for OLAP Metadata catalog tables using forward engineering of models in Erwin.
- Enforced referential integrity in the OLTP data model for consistent relationship between tables and efficient database design.
- Proficient in importing/exporting large amounts of data from files to Teradata and vice versa.
- Identified and tracked the slowly changing dimensions, heterogeneous sources and determined the hierarchies in dimensions.
- Utilized ODBC connectivity to Teradata and MS Excel for automating reports and graphical representation of data to the business and operational analysts.
- Extracted data from existing data sources and developed and executed departmental reports for performance and response purposes using Oracle SQL and MS Excel.
- Extracted data from existing data sources, performed ad-hoc queries, and used BTEQ to run Teradata SQL scripts to create the physical data model (a minimal Teradata DDL sketch follows this list).
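For illustration only: a minimal Teradata DDL sketch of the kind of physical-model script run through BTEQ, as referenced above. The table and columns are hypothetical.

    -- Hypothetical physical-model table; Teradata distributes rows by the PRIMARY INDEX.
    CREATE TABLE edw.claim_fact (
        claim_id    INTEGER NOT NULL,
        member_id   INTEGER NOT NULL,
        claim_date  DATE,
        paid_amount DECIMAL(12,2)
    )
    PRIMARY INDEX (claim_id);

    -- Statistics on the index column help the optimizer plan joins.
    COLLECT STATISTICS ON edw.claim_fact COLUMN (claim_id);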
Environment: UNIX scripting, Oracle SQL Developer, SSRS, SSIS, Teradata, Windows XP, SAS data sets.
