Sr. Data Architect/Data Modeler Resume
Austin, TX
SUMMARY
- Over 10 years of IT experience as a Senior Data Architect/Modeler/Analyst, specializing in Data Analysis, Data Modeling, Data Architecture, and designing, developing, and implementing data models for enterprise-level applications and systems.
- Experience in analyzing data using the Hadoop ecosystem, including HDFS, Hive, Spark, Spark Streaming, Elasticsearch, Kibana, Kafka, HBase, ZooKeeper, Pig, Sqoop, and Flume.
- Experienced in designing Conceptual, Logical, and Physical data models and E-R diagrams, and in modeling transactional databases and data warehouses using tools such as Erwin, ER/Studio, and Sybase PowerDesigner.
- Experience writing Netezza-to-Oracle shell scripts to load tables required by QA tools from Netezza into Oracle.
- Knowledge of Amazon EC2, Amazon S3, Amazon RDS, and Elastic Load Balancing services within the AWS infrastructure.
- Experience in developing MapReduce programs using Apache Hadoop to analyze big data per requirements.
- Excellent understanding of MDM hub architecture styles: the registry, repository, and hybrid approaches.
- Exposure to the Cognos Software Development Kit (SDK) for customizing reports.
- Experienced on projects utilizing the Hadoop ecosystem and its tools, including HDFS, MapReduce, Sqoop, Hive, Pig, Flume, and Oozie.
- Knowledge of the Autosys job scheduler, DevOps/CI-CD systems, version control, and build/deploy tools.
- Strong experience architecting highly performant databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
- Experienced working as a Data Analyst performing complex Data Profiling, Data Definition, Data Mining, and data analytics, validating and analyzing data, and presenting reports.
- Extensive working experience with bug tracking tools such as HP ALM Quality Center and JIRA.
- Excellent experience in troubleshooting test scripts, SQL queries, ETL jobs, data warehouse/data mart/data store models.
- Worked with XML and Flat file sources coming from various legacy source systems and residing on Mainframe and UNIX.
- Experienced in writing UNIX shell scripts, with hands-on experience scheduling shell scripts using Control-M.
- Excellent at performing data transfer activities between SAS and various databases and data file formats such as XLS, CSV, DBF, and MDB.
- Experienced in Data Analysis and Data Profiling using complex SQL on various source systems.
- Experience in conducting Joint Application Development (JAD) sessions with SMEs, Stakeholders and other project team members for requirements gathering and analysis.
- Deep understanding of relational and dimensional databases, Business Intelligence, and Data Warehouse concepts.
- Comprehensive knowledge of Slowly Changing Dimensions (Type I, II, and III) in dimension tables; a minimal Type 2 sketch follows this list.
- Extensive experience with ETL and reporting tools such as SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS).
- Strong experience writing SQL, PL/SQL, and T-SQL (Transact-SQL) programs for stored procedures, triggers, and functions.
- Experience in integration of various relational and non-relational sources such as DB2, Teradata, Oracle, Netezza, SQL Server, and NoSQL databases.
- Proven experience completing deliverables such as data models, data flow diagrams, and data value maps with minimal supervision.
- Experience in designing Enterprise Data Warehouses, Data Marts, Reporting data stores (RDS) and Operational data stores (ODS).
- Hands-on experience with normalization and de-normalization design considerations, up to 3NF, for OLTP databases and models.
- Experience in various Teradata utilities like Fastload, Multiload, BTEQ, and Teradata SQL Assistant.
- Excellent knowledge in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
- Strong experience in Data Migration, Transformation, Integration, Data Import, and Data Export.
- Experience in designing Star schema, Snowflake schema for Data Warehouse, ODS architecture.
- Excellent communication skills, especially the ability to actively listen and draw out the true needs rather than the stated wants of any given stakeholder.
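A minimal sketch of the Slowly Changing Dimension Type 2 pattern referenced above, kept self-contained with an in-memory SQLite table; the dim_customer table, its columns, and the sample keys are illustrative assumptions rather than artifacts of any specific project.

```python
# Minimal Slowly Changing Dimension Type 2 illustration: when a tracked attribute changes,
# the current row is expired and a new versioned row is inserted.
# The dim_customer table and sample data are hypothetical; SQLite keeps the sketch self-contained.
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE dim_customer (
        customer_sk   INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id   TEXT,      -- natural/business key
        city          TEXT,      -- tracked attribute
        effective_dt  TEXT,
        expiry_dt     TEXT,
        is_current    INTEGER
    )
""")

def apply_scd2(customer_id, new_city, load_dt):
    """Expire the current row if the tracked attribute changed, then insert a new version."""
    cur.execute(
        "SELECT customer_sk, city FROM dim_customer WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    )
    current = cur.fetchone()
    if current and current[1] == new_city:
        return  # no change: nothing to version
    if current:
        cur.execute(
            "UPDATE dim_customer SET expiry_dt = ?, is_current = 0 WHERE customer_sk = ?",
            (load_dt, current[0]),
        )
    cur.execute(
        "INSERT INTO dim_customer (customer_id, city, effective_dt, expiry_dt, is_current) "
        "VALUES (?, ?, ?, '9999-12-31', 1)",
        (customer_id, new_city, load_dt),
    )
    conn.commit()

apply_scd2("C-100", "Austin", str(date(2016, 1, 1)))
apply_scd2("C-100", "Dallas", str(date(2017, 6, 1)))  # Type 2 change: old row expired, new row added
print(cur.execute("SELECT * FROM dim_customer ORDER BY customer_sk").fetchall())
```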
TECHNICAL SKILLS
Data Modeling Tools: Erwin 9.6.4/9.6/9.5, ER/Studio 9.7, Sybase PowerDesigner.
Databases: Oracle 12c/11g, Teradata R15/R14, MS SQL Server 2014/2012, MS Access, Netezza
Programming Languages: SQL, PL/SQL, UNIX shell Scripting, PERL, AWK, SED.
Big Data Technologies: Hadoop, Hive, HDFS, HBase, Sqoop, Spark, MapReduce, Pig, Impala, Flume.
BI Tools: Tableau, Tableau server, Tableau Reader, SAP Business Objects
Operating System: Windows, Unix, Sun Solaris
ETL/Data Warehouse Tools: Informatica 9.6/9.1, SAP BusinessObjects XI R3.1/XI R2, Web Intelligence, Talend, Tableau, Pentaho
Tools & Software: MS Office, BTEQ, Teradata SQL Assistant
Project Execution Methodologies: Agile, Ralph Kimball and Bill Inmon data warehousing methodologies, Rational Unified Process (RUP), Rapid Application Development (RAD), Joint Application Development (JAD)
Testing and Defect Tracking Tools: HP/Mercury (Quality Center, WinRunner, QuickTest Professional, Performance Center, Requisite), MS Visio & Visual SourceSafe
PROFESSIONAL EXPERIENCE
Confidential - Austin, TX
Sr. Data Architect/Data Modeler
Responsibilities:
- Working as a Data Modeler/Architect, generating data models using Erwin and developing relational database systems.
- Also involved in a Data Architect role, reviewing business requirements and composing source-to-target data mapping documents.
- Implemented Security with LDAP and Group level security in Cognos Connection Portal.
- Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for our clients.
- Used Teradata Administrator and Teradata Manager tools for monitoring and controlling the system.
- Designed and developed reports using Cognos 8 Report Studio, Query Studio, and Analysis Studio.
- Participated in Rapid Application Development and Agile processes to deliver new cloud platform services.
- Designed, developed data warehouses, dashboards, data pipelines and reporting tools for operational and business impact data.
- Maintained and assisted in the development of moderately complex business solutions, which included data, reporting, business intelligence/analytics.
- Loaded and transformed large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Collected large amounts of log data using Apache Flume and aggregated it using Pig/Hive in HDFS for further analysis.
- Installed patch sets and upgraded Teradata.
- Led a database marketing and analytics firm focused on enhancing a client's ROI and customer retention.
- Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources.
- Designed the Logical Data Model using Erwin 9.6.4 with the entities and attributes for each subject area.
- Created MDM base objects, Landing and Staging tables to follow the comprehensive data model in MDM.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Worked on AWS, provisioning EC2 infrastructure and deploying applications behind Elastic Load Balancing.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
- Involved in designing Logical and Physical data models for different database applications using Erwin 9.6.4.
- Created BTEQ and MLOAD scripts to load data from Hadoop into the Teradata target system, making use of the Sqoop utility.
- Responsible for data profiling and data quality checks to satisfy the reporting requirements gathered above and provide an ETL mapping; a minimal profiling sketch follows this list.
- Advised on and enforced data governance to improve the quality and integrity of data, with oversight of the collection and management of operational data.
- Used ETL methodology to support data extraction, transformation, and load processing in a complex MDM environment using Informatica.
- Worked as an administrator on the Cognos suite of products.
- Connected to Amazon Redshift through Tableau to extract live data for real time analysis
- Created SSIS Packages for import and export of data between database and Flat Files.
- Worked on the Metadata Repository (MRM) to keep definitions and mapping rules up to date.
- Designed the ODS with core tables and now working on enhancing this model for additional master data.
- Developed triggers, stored procedures, functions and packages using cursors and ref cursor concepts associated with the project using PL/SQL.
- Generated periodic reports based on the statistical analysis of the data using SQL Server Reporting Services (SSRS).
- Worked with QA team members to understand test coverage of the functionalities, identify missing test cases, and help QA build a strong test suite for the data pipelines.
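The data profiling and quality checks mentioned above can be illustrated with a small Python pass over a delimited source extract; the file name, delimiter, and key column below are hypothetical placeholders, not project specifics.

```python
# Illustrative data-profiling pass over a delimited source extract before building the ETL mapping:
# row count, per-column null and distinct counts, and duplicate business-key detection.
# The file name, delimiter, and key column are hypothetical placeholders.
import csv
from collections import Counter, defaultdict

SOURCE_FILE = "claims_extract.dat"   # placeholder source extract
DELIMITER = "|"
KEY_COLUMN = "claim_id"              # placeholder business key

null_counts = Counter()
distinct_values = defaultdict(set)
key_counts = Counter()
row_count = 0

with open(SOURCE_FILE, newline="") as fh:
    reader = csv.DictReader(fh, delimiter=DELIMITER)
    columns = reader.fieldnames or []
    for row in reader:
        row_count += 1
        key_counts[row.get(KEY_COLUMN) or ""] += 1
        for col in columns:
            val = row.get(col)
            if val is None or val.strip() == "":
                null_counts[col] += 1
            else:
                distinct_values[col].add(val)

print(f"rows read: {row_count}")
for col in columns:
    print(f"{col}: {null_counts[col]} nulls, {len(distinct_values[col])} distinct values")

duplicates = {k: c for k, c in key_counts.items() if c > 1}
print(f"duplicate {KEY_COLUMN} values: {len(duplicates)}")
```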
Environment: Erwin 9.6.4, AWS Redshift, Cognos, Analytics, MapReduce, ODS, MDM, OLAP, OLTP, Teradata R15, Hadoop, HDFS, Sqoop, Hive, NoSQL, Netezza, PL/SQL, MS Visio, T-SQL, SSIS, SQL, UNIX, Tableau.
Confidential - Newport Beach, CA
Sr. Data Architect/Data Modeler
Responsibilities:
- Led the design and modeling of tactical architectures for development, delivery, and support of projects.
- Identified security loopholes, established data quality assurance and addressed data governance.
- Developed full life cycle software, including defining requirements, prototyping, designing, coding, testing, and maintaining the software.
- Responsible for Master Data Management (MDM) and data lake design and architecture; the data lake was built using Cloudera Hadoop.
- Worked as Data Analyst to perform complex Data Profiling, Data Definition, Data Mining, data analytics, validating and analyzing data and presenting reports.
- Involved in Normalization and De-Normalization of existing tables for faster query retrieval.
- Used a data virtualization tool to connect multiple heterogeneous sources without physically moving the data.
- Worked in an Agile environment using tools like JIRA and VersionOne.
- Designed the schema, configured and deployed AWS Redshift for optimal storage and fast retrieval of data.
- Performed a POC for a Big Data solution using Cloudera Hadoop for data loading and data querying.
- Extensively involved in analyzing various data formats using industry standard tools and effectively communicating findings to business users and SMEs.
- Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
- Involved in Netezza Administration Activities like backup/restore, performance tuning, and Security configuration.
- Designed Physical Data Model (PDM) using Erwin 9.5 Data Architect data modeling tool and Oracle PL/SQL.
- Customized Cognos Connection with appropriate reports and security measures.
- Wrote Pig Scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
- Designed and developed Use Cases, Activity Diagrams, Sequence Diagrams, OOD (Object oriented Design) using UML and Visio.
- Conducted Oracle database analytics using TOAD and BI tools such as Tableau to resolve customer service issues.
- Used forward engineering to create a physical data model with DDL that best suits the requirements from the Logical Data Model.
- Used Erwin for reverse engineering to connect to existing database and ODS to create graphical representation in the form of Entity Relationships and elicit more information.
- Involved in Logical modeling using the Dimensional Modeling techniques such as Star Schema and Snow Flake Schema.
- Worked on MapReduce and query optimization for the Hadoop Hive and HBase architecture.
- Involved in creating a data lake by extracting customers' Big Data from various data sources into HDFS.
- Created SSIS Reusable Packages to extract data from Multi formatted Flat files, Excel, XML files into UL Database.
- Transferred technical knowledge specific to Cognos tools and the existing data model to other team members.
- Implemented Python scripts to import/export JSON files containing customer survey and/or asset information to/from the database; a minimal sketch follows this list.
- Involved in all steps and the full scope of the project's data approach to MDM, creating a data dictionary and mapping from sources to the target in the MDM data model.
- Responsible for full data loads from production to AWS Redshift staging environment.
- Developed Ad-Hoc Queries, Views and functions in Greenplum in order to make data accessible for Business Analyst and Managers.
- Performed data analysis, statistical analysis, generated reports, listings and graphs using SAS tools, SAS Integration Studio, SAS/Graph, SAS/SQL, SAS/Connect and SAS/Access
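A minimal sketch of the kind of Python JSON import/export utility described above; sqlite3 stands in for the project database, and the staging table, column names, and file names are assumptions for illustration only.

```python
# Minimal sketch of a JSON import/export utility of the kind described above.
# sqlite3 stands in for the project database; table, column, and file names are hypothetical.
import json
import sqlite3

def import_surveys(db_path, json_path):
    """Load customer survey records from a JSON file into a staging table."""
    with open(json_path) as fh:
        records = json.load(fh)  # expects a list of {"customer_id": ..., "score": ..., "comments": ...}
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS stg_customer_survey
                    (customer_id TEXT, score INTEGER, comments TEXT)""")
    conn.executemany(
        "INSERT INTO stg_customer_survey (customer_id, score, comments) VALUES (?, ?, ?)",
        [(r.get("customer_id"), r.get("score"), r.get("comments")) for r in records],
    )
    conn.commit()
    conn.close()

def export_surveys(db_path, json_path):
    """Dump the staging table back out as JSON for downstream consumers."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    rows = conn.execute("SELECT customer_id, score, comments FROM stg_customer_survey").fetchall()
    conn.close()
    with open(json_path, "w") as fh:
        json.dump([dict(r) for r in rows], fh, indent=2)

if __name__ == "__main__":
    import_surveys("survey.db", "survey_input.json")    # placeholder paths
    export_surveys("survey.db", "survey_output.json")
```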
Environment: Erwin 9.5, Hadoop, HBase, Cognos, HDFS, MapReduce, Python, PL/SQL, ODS, OLAP, OLTP, JSON, Greenplum, flat files, Oracle 11g, MDM, Informatica, SAS, SSIS.
Confidential - St. Louis, MO
Sr. Data Architect/Data Modeler
Responsibilities:
- Provide data architecture support to enterprise data management efforts, such as the development of the enterprise data model
- Involved in developing the Database Design Document, including Conceptual, Logical, and Physical data models, using Erwin 9.6.
- Reviewed the Logical Model with Application Developers, ETL Team, DBAs and Testing Team to provide information about the Data Model and business requirements.
- Worked in Framework Manager to import metadata from multiple data sources, create subject-oriented business models, and create and publish packages to the Cognos server.
- Used an Agile methodology for data warehouse development, managed with Kanbanize.
- Normalized the tables/relationships to arrive at effective Relational Schemas without any redundancies.
- Designed both 3NF data models for ODS, OLTP systems and dimensional data models using Star and Snow Flake Schemas
- Defined data integration needs and rules, and the data governance and data quality frameworks, for the MDM program.
- Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL access to Hadoop data.
- Applied Data Governance rules (primary qualifier, class words and valid abbreviation in Table name and Column names).
- Specified the overall data architecture for all areas and domains of the enterprise, including Data Acquisition, ODS, MDM, Data Warehouse, Data Provisioning, ETL, and BI.
- Developed the long-term data warehouse roadmap and architectures, and designed and built the data warehouse framework per the roadmap.
- Worked on Tableau 9.0 for insight reporting and data visualization
- Worked with data architects and solution architects to create the Conceptual data model by breaking the requirements into different subject areas.
- Designed and Developed Oracle PL/SQL and Shell Scripts, Data Import/Export, Data Conversions and Data Cleansing.
- Developed SSIS packages to migrate data from retiring systems to archived environment.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop; a minimal wrapper sketch follows this list.
- Developed Map Reduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Completed enhancement for MDM (Master data management) and suggested the implementation for hybrid MDM.
- Involved in debugging and Tuning the PL/SQL code, tuning queries, optimization for the Oracle and DB2 database.
- Extensively used Metadata & Data Dictionary Management, Data Profiling, Data Mapping.
- Identified Facts & Dimensions Tables and established the Grain of Fact for Dimensional Models.
- Independently coded new programs and designed tables to load and test effectively for the given POCs, using Big Data/Hadoop technologies such as Hive, HDFS, Impala, Spark, Cloudera Manager, and Cloudera Navigator to deliver complex system issues or changes.
- Extracted large volumes of data from Amazon Redshift on AWS and from the Elasticsearch engine using SQL queries to create reports.
- Used Python scripts to update content in the database and manipulate files.
- Involved in Java code that generated XML documents, which were then translated into HTML via XSLT for presentation in the GUI.
- Involved in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
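A hedged sketch of wrapping a Sqoop import (Oracle into HDFS/Hive) in a small Python driver, as referenced above; the JDBC URL, credentials file, table names, and target directory are placeholders, while the Sqoop options shown are standard Sqoop 1 flags.

```python
# Sketch of driving a Sqoop import (Oracle -> HDFS/Hive) from a Python wrapper script.
# Connection string, user, password file, and table names below are placeholders.
import subprocess

def sqoop_import_to_hive(jdbc_url, password_file, source_table, hive_table, mappers=4):
    """Run a Sqoop import of one Oracle table into HDFS and register it in Hive."""
    cmd = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", "etl_user",          # placeholder user
        "--password-file", password_file,  # HDFS path holding the password
        "--table", source_table,
        "--target-dir", f"/data/landing/{source_table.lower()}",
        "--hive-import",
        "--hive-table", hive_table,
        "--num-mappers", str(mappers),
    ]
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the import fails

if __name__ == "__main__":
    sqoop_import_to_hive(
        jdbc_url="jdbc:oracle:thin:@//dbhost:1521/ORCL",  # placeholder host/service
        password_file="/user/etl_user/.sqoop_pwd",
        source_table="CLAIMS",
        hive_table="staging.claims",
    )
```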
Environment: Erwin 9.6, Hadoop, HDFS, AWS, Sqoop, Java, Cognos, SSIS, HBase, MapReduce, Hive, Impala, MDM, Spark, OLAP, OLTP, ODS, Python, PL/SQL, Data Flux, Oracle 12c, Flat Files, DB2, Amazon Redshift
Confidential, Brentwood, TN
Sr. Data Architect/Data Modeler
Responsibilities:
- Responsible for the data architecture design delivery, data model development, review, approval and Data warehouse implementation.
- Used the Agile Scrum methodology across the phases of the software development life cycle.
- Involved in Normalization/De-normalization techniques for optimum performance in relational and dimensional database environments.
- Installed and configured other open source software such as Pig, Hive, HBase, Flume, and Sqoop.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Worked on delivery of Data & Analytics applications involving structured and unstructured data on Hadoop-based platforms on AWS EMR.
- Provided suggestions to implement multitasking for the existing Hive architecture in Hadoop, and also suggested UI customization in Hadoop.
- Developed and implemented data cleansing, data security, data profiling and data monitoring processes.
- Created DDL scripts using Erwin and source to target mappings to bring the data from source to the warehouse.
- Performed source system analysis, database design, and data modeling for the warehouse layer using MLDM concepts and for the package layer using dimensional modeling.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Coordinated all teams to centralize metadata management updates and follow the standard naming and attribute standards for data and ETL jobs.
- Worked with Teradata utilities (BTEQ, FastLoad, FastExport, MultiLoad, and TPump) on both Windows and mainframe platforms.
- Created the Hive architecture used for real-time monitoring and HBase used for reporting.
- Responsible for Metadata Management, keeping up to date centralized metadata repositories using Erwin modeling tools.
- Developed various QlikView data models by extracting and using data from various source files, including Excel, flat files, and Big Data sources; a minimal consolidation sketch follows this list.
- Produced PL/SQL statements and stored procedures in SQL for extracting as well as writing data.
- Developed a data mart for the base data in Star and Snowflake schemas and was involved in developing the data warehouse for the database.
- Used SSRS for generating reports from databases, including sub-reports, drill-down reports, drill-through reports, and parameterized reports.
- Primarily responsible for Tableau customization of statistical dashboards to monitor sales effectiveness, and also used Tableau for customer marketing data visualization.
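A minimal consolidation sketch for the multi-format source extraction feeding the QlikView data models above, using pandas; the workbook, flat-file, sheet, and column names are hypothetical placeholders.

```python
# Illustrative consolidation of multi-format source files (Excel + pipe-delimited flat file)
# into one load-ready extract of the kind fed to downstream QlikView data models.
# File names, sheet name, and column list are hypothetical placeholders.
import pandas as pd

EXCEL_SOURCE = "sales_targets.xlsx"      # placeholder workbook
FLAT_FILE_SOURCE = "sales_actuals.dat"   # placeholder pipe-delimited extract
COLUMNS = ["region", "product_id", "period", "amount"]

targets = pd.read_excel(EXCEL_SOURCE, sheet_name="Targets", usecols=COLUMNS)
actuals = pd.read_csv(FLAT_FILE_SOURCE, sep="|", usecols=COLUMNS)

targets["measure_type"] = "target"
actuals["measure_type"] = "actual"

combined = pd.concat([targets, actuals], ignore_index=True)
combined["period"] = pd.to_datetime(combined["period"])  # normalize the period column

# Write a single consolidated extract for the downstream load script to pick up.
combined.to_csv("sales_combined_extract.csv", index=False)
print(f"wrote {len(combined)} rows")
```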
Environment: Erwin 9.5, Oracle 11g, Pig, HDFS, Flume, Sqoop, Teradata 14, PL/SQL, Data Mart, UNIX, Tableau, T-SQL, Hadoop, HBase, ODS, OLAP, OLTP, Flat Files, Hive, DBA.
Confidential - Erlanger, KY
Sr. Data Analyst/Data Modeler
Responsibilities:
- Gathered and Translated business requirements into detailed, production-level technical specifications, new features, and enhancements to existing technical business functionality
- Worked with ER/Studio to create business glossaries with consistent terms and definitions, ensuring that data models and databases are aligned with organizational usage.
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and used ER/Studio to design the business process, dimensions and measured facts.
- Created data flow diagrams and process flow diagrams for various load components like FTP Load, SQL Loader Load, ETL process and various other processes that required transformation.
- Developed best practices for standard naming conventions and coding practices to ensure consistency of data models.
- Developed Data Mapping, Data Governance, and Transformation and cleansing rules for the Master Data Management Architecture
- Involved in data analysis and modeling for the OLAP and OLTP environment.
- Developed Source to Target Data Mapping, Data Profiling, Transformation and Cleansing rules for OLTP and OLAP.
- Evaluated data models and physical databases for variances and discrepancies.
- Involved in logical and physical database designs and data mappings between cross-functional applications, and used SQL queries to filter data.
- Performed as part of the team responsible for the analysis of business requirements and the design and implementation of the business solution.
- Extensively used Base SAS, SAS/Macro, SAS/SQL, and Excel to develop code and generate various analytical reports.
- Involved in administrative tasks, including creation of database objects such as database, tables, and views, using SQL, DDL, and DML requests.
- Developed and maintained a data dictionary to create metadata reports for technical and business purposes; a minimal sketch follows this list.
- Responsible for tuning ETL procedures and STAR schemas to optimize load and query Performance by using various data and index caches
- Loaded multi-format data from various sources such as flat files, Excel, and MS Access, and performed file system operations.
- Created complex stored procedures and PL/SQL blocks with optimum performance using bulk binds (BULK COLLECT and FORALL), inline views, cursors, cursor variables, dynamic SQL, varrays, external tables, nested tables, etc.
- Conducted JAD sessions with SMEs, stakeholders, and other management teams in the finalization of the User Requirement Documentation.
- Developed dimensional model for Data Warehouse/OLAP applications by identifying required facts and dimensions.
- Developed the code as per the client's requirements using SQL, PL/SQL and Data Warehousing concepts.
- Created SSIS packages to export data from text file to SQL Server Database.
- Created various types of reports such as drill-down and drill-through reports, matrix reports, sub-reports, and charts using SQL Server Reporting Services (SSRS).
- Wrote T-SQL statements for retrieval of data and Involved in performance tuning of T-SQL queries and Stored Procedures.
- Worked on database design, relational integrity constraints, OLAP, OLTP, Cubes and Normalization (3NF) & De-normalization of database.
- Conducted data modeling JAD sessions and communicated data-related standards.
- Created DDL scripts to create the database or make modifications to it, and reviewed them with the development team and DBA.
- Redefined many attributes and relationships in the reverse engineered model and cleansed unwanted tables/columns as part of data analysis responsibilities
- Used the ETL tool Informatica to populate the database and transform data from the old database to the new database using Oracle and SQL Server.
- Involved in the creation, maintenance of Data Warehouse and repositories containing Metadata.
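A minimal sketch of producing a data dictionary report from SQL Server metadata, as described above; pyodbc is assumed as the driver, and the connection string, database name, and output file are placeholders.

```python
# Sketch of a data dictionary report generated from SQL Server system metadata
# (INFORMATION_SCHEMA.COLUMNS). Connection string, database, and output file are placeholders;
# pyodbc is assumed as the database driver.
import csv
import pyodbc

CONN_STR = (
    "DRIVER={SQL Server};SERVER=dbserver;DATABASE=ReportingDW;"  # placeholder server/database
    "Trusted_Connection=yes;"
)

QUERY = """
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE,
       CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION
"""

conn = pyodbc.connect(CONN_STR)
rows = conn.cursor().execute(QUERY).fetchall()
conn.close()

with open("data_dictionary.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["schema", "table", "column", "data_type", "max_length", "nullable"])
    writer.writerows([list(r) for r in rows])

print(f"data dictionary written: {len(rows)} columns documented")
```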
Environment: ER/Studio 8.0, SQL Server 2005, Oracle 9i, MS Access, SSIS, MS PowerPoint, Metadata, T-SQL, PL/SQL, OLAP, OLTP, Informatica, Microsoft Excel.
Confidential - Northbrook, IL
Data Analyst/Data Modeler
Responsibilities:
- Facilitated development, testing and maintenance of quality guidelines and procedures along with necessary documentation.
- Worked closely in conjunction with business (SME's) and application data stewards to get business responses and source-to-target mapping clarifications.
- Developed normalized Logical and Physical database models to design OLTP system for insurance applications.
- Created dimensional model for the reporting system by identifying required dimensions and facts using Erwin r7.1.
- Used forward engineering to create a Physical Data Model with DDL that best suits the requirements from the Logical Data Model.
- Worked with Database Administrators, Business Analysts and Content Developers to conduct design reviews and validate the developed models.
- Identified, formulated and documented detailed business rules and Use Cases based on requirements analysis.
- Wrote and executed unit, system, integration, and UAT scripts in data warehouse projects.
- Generated DDL by forward engineering of the OLTP data model using Erwin.
- Exhaustively collected business and technical metadata and maintained naming standards.
- Used Informatica Designer, Workflow Manager, and Repository Manager to create source and target definitions, design mappings, create repositories, and establish users, groups, and their privileges.
- Involved in preparing business model diagrams, flowcharts, process/work-flow diagrams, data flow and relationship diagrams using MS Visio and Erwin, and data models showing process mappings, screen designs, and use case diagrams.
- Wrote T-SQL statements and stored procedures and worked extensively on SQL querying using joins, aliases, functions, triggers, views, and indexes.
- Performed Data mapping between source systems to Target systems, logical data modeling
- Created class diagrams and ER diagrams and used SQL queries to filter data.
- Involved in analysis, profiling, and cleansing of source data and understanding the business process of the data by applying different transformation rules before loading it into the Data Warehouse.
- Worked with configuring checkpoints, package logging, error logging, and event handling to redirect error rows and fix the errors in SSIS; a Python analogue of the error-row redirection follows this list.
- Involved in data extraction/transformation/cleansing/loading implemented using SQL*Loader and PL/SQL.
- Involved in documentation of data mapping and ETL specifications for development, from source to target mapping.
- Developed and generated custom reports and periodic reports for conducting a wide range of statistical analyses using SSRS.
- Created SSRS reports using complex SQL queries/stored procedures, including sub-reports, drill-down reports, and charts.
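The error-row redirection above was configured in SSIS; the following is only a Python analogue of the same pattern, with hypothetical file and column names, showing bad rows diverted to an error output with a reason while clean rows continue to the load.

```python
# Python analogue of SSIS-style error-row redirection: rows that fail validation are written
# to an error file with a reason, while good rows continue to the clean load file.
# File names, delimiter, and validated columns are hypothetical placeholders.
import csv

SOURCE_FILE = "policies.txt"         # placeholder source extract
GOOD_FILE = "policies_clean.csv"
ERROR_FILE = "policies_errors.csv"

def validate(row):
    """Return an error message for a bad row, or None if the row is loadable."""
    if not row.get("policy_id"):
        return "missing policy_id"
    try:
        float(row.get("premium", ""))
    except ValueError:
        return "premium is not numeric"
    return None

with open(SOURCE_FILE, newline="") as src, \
     open(GOOD_FILE, "w", newline="") as good, \
     open(ERROR_FILE, "w", newline="") as bad:
    reader = csv.DictReader(src, delimiter="|")
    good_writer = csv.DictWriter(good, fieldnames=reader.fieldnames)
    bad_writer = csv.DictWriter(bad, fieldnames=reader.fieldnames + ["error_reason"])
    good_writer.writeheader()
    bad_writer.writeheader()
    for row in reader:
        reason = validate(row)
        if reason is None:
            good_writer.writerow(row)
        else:
            row["error_reason"] = reason
            bad_writer.writerow(row)
```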
Environment: Erwin 7.2, Oracle 8i, SQL Server 2000, Microsoft Access, OLAP, OLTP, T-SQL, SSIS, SSRS, MS Visio, MS PowerPoint, Microsoft Excel