Sr. Data Architect / Modeler Resume
Phoenix, AZ
SUMMARY:
- Highly effective Data Architect with over 9 years of experience specializing in big data, cloud, and data and analytics platforms.
- Excellent knowledge in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
- Excellent experience with Teradata SQL queries, Teradata indexes, and utilities such as MultiLoad, TPump, FastLoad, and FastExport.
- Expert in writing SQL queries and optimizing the queries in Oracle, SQL Server 2008 and Teradata.
- Experience in architecture, design, and development of large Enterprise Data Warehouses (EDW) and Data Marts for target user-base consumption.
- Performed data analysis and data profiling using complex SQL on various source systems including Oracle and Teradata (a sample profiling query follows this summary).
- Excellent understanding of the Software Development Life Cycle (SDLC) with good working knowledge of testing methodologies, disciplines, tasks, resources, and scheduling.
- Strong experience using Excel and MS Access to extract and analyze data based on business needs.
- Expertise in data modeling, database design and implementation of Oracle and AWS Redshift databases, administration, and performance tuning.
- Experience in analyzing data using the Hadoop ecosystem, including HDFS, Hive, Spark, Spark Streaming, Elasticsearch, Kibana, Kafka, HBase, ZooKeeper, Pig, Sqoop, and Flume.
- Experienced working with Excel Pivot and VBA macros for various business scenarios.
- Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export.
- Data Transformation using Pig scripts in AWS EMR, AWS RDS.
- Experience working with data modeling tools like Erwin, Power Designer and ER Studio.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
- Experience in data analysis using Hive, Pig Latin, and Impala.
- Well versed in Normalization / De-normalization techniques for optimum performance in relational and dimensional database environments.
- Good understanding of AWS, big data concepts and Hadoop ecosystem.
- Experienced in various Teradata utilities like Fastload, Multiload, BTEQ, and Teradata SQL Assistant.
- Developed and managed SQL, Python, and R code bases for data cleansing and data analysis using Git version control.
- Extensive ETL testing experience using Informatica 8.6.1/8.1 (Power Center/ Power Mart) (Designer, Workflow Manager, Workflow Monitor and Server Manager)
- Excellent at creating various artifacts for projects, including specification documents, data mapping, and data analysis documents.
- An excellent team player and technically strong person with the ability to work with business users, project managers, team leads, architects, and peers, maintaining a healthy environment in the project.
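Sample profiling SQL, as referenced above: a minimal sketch of a column-profiling query. The table and column names (crm.customer, customer_id, email, created_dt) are illustrative assumptions, not client objects.

    -- Illustrative profiling query: volume, distinct keys, null rate, and date range
    SELECT COUNT(*)                                       AS total_rows,
           COUNT(DISTINCT customer_id)                    AS distinct_customers,
           SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) AS null_emails,
           MIN(created_dt)                                AS earliest_record,
           MAX(created_dt)                                AS latest_record
      FROM crm.customer;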
TECHNICAL SKILLS:
Analysis and Modeling Tools: IBM Infosphere, SQL Power Architect, Oracle Designer, Erwin 9.6/9.5, ER/Studio 9.7, Sybase Power Designer.
Database Tools: Oracle 12c/11g, MS Access, Microsoft SQL Server 2014/2012, Teradata 15/14, PostgreSQL, Netezza.
Big Data Technologies: Hadoop, HDFS 2, Hive, Pig, HBase, Sqoop, Flume.
Cloud Platform: AWS, EC2, S3, SQS, Azure.
OLAP Tools: Business Objects, Tableau, SAP BO, SSAS, Crystal Reports 9.
Operating System: Windows, Dos, Unix, Linux.
Reporting Tools: Business Objects, Crystal Reports.
Tools & Software: TOAD, MS Office, BTEQ, Teradata SQL Assistant.
ETL Tools: SSIS, Pentaho, Informatica Power 9.6, SAP Business Objects XIR3.1/XIR2, Web Intelligence.
Other Tools: TOAD, SQL*Plus, SQL*Loader, MS Project, MS Visio, MS Office; have also worked with C++, UNIX, PL/SQL, etc.
PROFESSIONAL EXPERIENCE:
Confidential, Phoenix, AZ
Sr. Data Architect / Modeler
Responsibilities:
- Consulted on and supported Data Architect / Data Modeler initiatives in the development of an integrated data repository transformed from a legacy system to a new operational system and data warehouse.
- Provided solutions for ingesting data into the new Hadoop big data platform by designing data models for multiple features to help analyze the data on graph databases.
- Applied business rules in modeling Data Marts and performed data profiling to model new data structures.
- Delivered scope, requirements, and design for transactional and data warehouse system which included Oracle DB, SQL server, and Salesforce database.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Designed ODS structures and data mart structures.
- Developed the long-term data warehouse roadmap and architecture, and designed and built the data warehouse framework per the roadmap.
- Designed the Logical Data Model using Erwin 9.64 with the entities and attributes for each subject area.
- Extracted data from a transactional system into a staging area that was then transformed and loaded into a star schema.
- Worked on AWS, architecting a solution to load data, create data models, and run BI on it.
- Worked on logical and physical modeling and ETL design for manufacturing data warehouse applications.
- Involved in creating Hive tables and loading and analyzing data using Hive queries; developed Hive queries to process the data and generate data cubes for visualization.
- Used Profisee as it supports real-time bi-directional integrations.
- Implemented join optimizations in Pig using Skewed and Merge joins for large datasets.
- Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive and Informatica.
- Developed and implemented different Pig UDFs to write ad-hoc and scheduled reports as required by the Business team.
- Involved in Normalization / De-normalization techniques for optimum performance in relational and dimensional database environments.
- Design of Big Data platform technology architecture. The scope includes data intake, data staging, data warehousing, and high performance analytics environment.
- Involved in loading data from the Linux file system to HDFS; imported and exported data into HDFS and Hive using Sqoop; implemented partitioning, dynamic partitions, and buckets in Hive (illustrated in the HiveQL sketch following this list).
- Used SSRS to create standard, customized, on-demand, and ad-hoc reports, and was involved in analyzing multi-dimensional reports in SSRS.
- Created dimensional data models based on hierarchical source data and implemented them on Teradata, achieving high performance without special tuning.
- Focused on architecting NoSQL databases like MongoDB, Cassandra, and Cache.
- Performed routine management operations, including configuration and performance analysis, for MongoDB, and diagnosed MongoDB performance issues.
- Involved in designing Logical and Physical data models for different database applications using Erwin.
- Performed data modeling and designed, implemented, and deployed high-performance custom applications at scale on Hadoop/Spark.
- Implemented Data Integrity and Data Quality checks in Hadoop using Hive and Linux scripts.
- Reverse engineered some of the databases using Erwin.
- Proficiency in SQL across a number of dialects, commonly writing MySQL, PostgreSQL, Redshift, SQL Server, and Oracle SQL.
- Routinely dealt with large internal and vendor data sets and performed performance tuning, query optimization, and production support for SAS and Oracle 12c.
- Worked on defining data architecture for data warehouses, data marts, and business applications.
- Specified the overall Data Architecture for all areas and domains of the enterprise, including Data Acquisition, ODS, MDM, Data Warehouse, Data Provisioning, ETL, and BI.
- Developed Data Mapping, Data Governance, Transformation, and Cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Performance tuning and stress-testing of NoSQL database environments in order to ensure acceptable database performance in production mode.
- Implemented strong referential integrity and auditing by the use of triggers and SQL Scripts.
- Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data.
- Created and maintained SQL Server scheduled jobs, executing stored procedures for the purpose of extracting data from DB2 into SQL Server.
- Experience with SQL Server Reporting Services (SSRS) to author, manage, and deliver both paper-based and interactive Web-based reports.
- Performed Hive programming for applications that were migrated to big data using Hadoop.
- Generated parameterized queries for generating tabular reports using global variables, expressions, functions, and stored procedures using SSRS.
- Extensive knowledge in data loading using PL/SQL scripts and SQL Server Integration Services (SSIS).
- Worked in a team using the ETL tool Informatica to populate the database and perform data transformation from the old database to the new database using Oracle and SQL Server.
- Ensured that data architecture tasks were executed within deadlines.
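HiveQL sketch for the partitioned and bucketed table design referenced above. This is a minimal illustration; the database objects (claims_stg, claims_raw, member_id, load_dt) are assumptions, not artifacts from the engagement.

    -- Enable dynamic partitioning for the load step
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Hypothetical staging table, partitioned by load date and bucketed by member id
    CREATE TABLE claims_stg (
      claim_id     BIGINT,
      member_id    BIGINT,
      claim_amount DECIMAL(12,2)
    )
    PARTITIONED BY (load_dt STRING)
    CLUSTERED BY (member_id) INTO 32 BUCKETS
    STORED AS ORC;

    -- Dynamic-partition insert from a raw table; partition column goes last in the SELECT
    INSERT OVERWRITE TABLE claims_stg PARTITION (load_dt)
    SELECT claim_id, member_id, claim_amount, load_dt
    FROM claims_raw;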
Environment: DB2, CA Erwin 9.6/r9.64, Oracle 12c, Salesforce, MS Office, SQL Architect, TOAD Benchmark Factory, SQL Loader, PL/SQL, SharePoint, Talend, SQL Server 2008/2012, Hive, Pig, Hadoop, Spark, AWS.
Confidential, Newark NJ
Data Architect /Modeler
Responsibilities:
- Developed and maintained the data definitions, data models, data flow diagrams, metadata management, business semantics, and metadata workflow management.
- Integrated 40 data sources in one data repository utilizing modeling tools (ER Studio) and ETL tool (PL/SQL).
- Involved in the data cleaning procedure by removing old, corrupted, or irrelevant data in consultation with the teams.
- Worked with Big Data Hadoop Ecosystem in ingestion, storage, querying, processing and analysis of big data and conventional RDBMS.
- Involved in Relational and Dimensional Data modeling for creating Logical and Physical Designs of the database and ER Diagrams, with all related entities and the relationships between them based on the rules provided by the business manager, using ER Studio.
- Worked on Normalization and De-normalization concepts and design methodologies like Ralph Kimball and Bill Inmon's Data Warehouse methodology.
- Used database design and database modeling concepts to ensure data accessibility and security.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas (a simplified star-schema sketch follows this list).
- Responsible for delivering and coordinating data-profiling, data-analysis, data-governance, data-models (conceptual, logical, physical), data-mapping, data-lineage and data management.
- Worked on DataStage admin activities like creating ODBC connections to various data sources, server start-up and shut-down, creating environment variables, and creating DataStage projects.
- Participated in all phases of project including Requirement gathering, Architecture, Analysis, Design, Coding, Testing, Documentation and warranty period.
- Worked with Data governance, Data quality, data lineage, Data architect to design various models and processes.
- Worked on SQL Server concepts SSIS (SQL Server Integration Services), SSAS (Analysis Services) and SSRS (Reporting Services).
- Generated DDL (Data Definition Language) scripts using ER Studio and assisted the DBA in physical implementation of data models.
- Extensively worked on creating the migration plan to Amazon web services (AWS).
- Extracted large volumes of data from AWS and the Elasticsearch engine using SQL queries to create reports.
- Completed enhancement for MDM (Master data management) and suggested the implementation for hybrid MDM (Master Data Management).
- Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
- Generated comprehensive analytical reports by running SQL queries against current databases to conduct Data Analysis.
- Performed Data Analysis, Data Migration, and data profiling using complex SQL on various source systems including Oracle and Teradata.
- Designed and documented Use Cases, Activity Diagrams, Sequence Diagrams, OOD (Object Oriented Design) using UML and Visio.
- Used forward engineering to generate DDL from the Physical Data Model and handed it to the DBA.
- Integrated Spotfire visualization into client's Salesforce environment.
- Involved in Normalization and De-Normalization of existing tables for faster query retrieval.
- Involved in planning, defining, and designing the database using ER Studio based on business requirements, and provided documentation.
- Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
- Developed the full life cycle of a Data Lake and Data Warehouse with big data technologies like Spark and Hadoop.
- Created data masking mappings to mask the sensitive data between production and test environment.
- Responsible for all metadata relating to the EDW's overall data architecture, descriptions of data objects, access methods and security requirements.
- Involved in data profiling and data cleansing, making sure the data was accurate and analyzed when transferred from OLTP to Data Marts and the Data Warehouse.
- Used Agile Methodology of Data Warehouse development using Kanbanize.
- Worked with DBA group to create Best-Fit Physical Data Model from the Logical Data Model using Forward Engineering.
- Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi-structured data coming from various sources.
- Development of DataStage design concepts, execution, testing, and deployment on the client server.
- Developed Linux shell scripts using the nzsql/nzload utilities to load data from flat files into the Netezza database.
- Validated the data of reports by writing SQL queries in PL/SQL Developer against ODS.
- Involved in user sessions and assisting in UAT (User Acceptance Testing).
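Simplified star-schema sketch in ANSI SQL for the dimensional models described above. The fact and dimension tables (fact_claim, dim_date, dim_member) are illustrative assumptions, not the project's actual objects.

    -- Illustrative dimension tables
    CREATE TABLE dim_date (
      date_key    INTEGER PRIMARY KEY,
      calendar_dt DATE NOT NULL,
      month_nbr   SMALLINT,
      year_nbr    SMALLINT
    );

    CREATE TABLE dim_member (
      member_key  INTEGER PRIMARY KEY,
      member_id   VARCHAR(20) NOT NULL,
      member_name VARCHAR(100)
    );

    -- Fact table referencing the dimensions through surrogate keys
    CREATE TABLE fact_claim (
      claim_key    INTEGER PRIMARY KEY,
      date_key     INTEGER NOT NULL REFERENCES dim_date (date_key),
      member_key   INTEGER NOT NULL REFERENCES dim_member (member_key),
      claim_amount DECIMAL(12,2)
    );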
Environment: ER Studio, AWS, OLTP, Teradata r15, Sqoop 1.4, Cassandra 3.11, MongoDB 3.6, HDFS, Linux, Shell scripts, NoSQL, SSIS, SSAS, HBase 1.2, MDM.
Confidential, Plano, TX
Sr. Data Analyst /Modeler
Responsibilities:
- Worked on OLAP for data warehouse and data mart development using the Ralph Kimball methodology as well as OLTP models (3NF), and interacted with all involved stakeholders and SMEs to derive the solution.
- Conducted knowledge-sharing sessions with the Architect and SMEs and designed the Data Flow Diagram.
- Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys), and physical database (capacity planning, object creation, and aggregation strategies) for Oracle and Teradata per business requirements using Erwin.
- Designed 3rd normal form target data model and mapped to logical model.
- Involved in extensive data validation using SQL queries and back-end testing.
- Generated DDL statements for the creation of new ER/Studio objects like tables, views, indexes, packages, and stored procedures.
- Designed MOLAP/ROLAP cubes on the Teradata database using SSAS.
- Used SQL for querying the database in a UNIX environment.
- Created BTEQ, FastExport, MultiLoad, TPump, and FastLoad scripts for extracting data from various production systems.
- Worked along with the ETL team to document transformation rules for data migration from OLTP to the warehouse for reporting purposes.
- Created views and extracted data from Teradata base tables, and uploaded data from Teradata tables to the Oracle staging server using the FastExport concept.
- Worked with RDS, implementing models and data on RDS.
- Developed mapping spreadsheets for (ETL) team with source to target data mapping with physical naming standards, data types, volumetric, domain definitions, and corporate meta-data definitions.
- Designed Star and Snowflake schemas on dimension and fact tables.
- Worked with the Data Vault methodology and developed normalized logical and physical database models (a simplified Data Vault sketch follows this list).
- Transformed Logical Data Model to Physical Data Model ensuring the Primary Key and Foreign key relationships in PDM, Consistency of definitions of Data Attributes and Primary Index considerations.
- Wrote and ran SQL, BI, and other reports, analyzing data and creating metrics, dashboards, pivots, etc.
- Gathered and analyzed business data requirements and modeled these needs, working closely with the users of the information, the application developers, and architects to ensure the information models were capable of meeting their needs.
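Simplified Data Vault sketch in ANSI SQL for the modeling approach noted above. The hub and satellite names (hub_customer, sat_customer_detail) and columns are hypothetical, used only to illustrate the pattern.

    -- Hub: business key plus load metadata
    CREATE TABLE hub_customer (
      customer_hk   CHAR(32) PRIMARY KEY,      -- hash of the business key
      customer_id   VARCHAR(20) NOT NULL,
      load_dts      TIMESTAMP NOT NULL,
      record_source VARCHAR(50) NOT NULL
    );

    -- Satellite: descriptive attributes tracked over time against the hub
    CREATE TABLE sat_customer_detail (
      customer_hk   CHAR(32) NOT NULL REFERENCES hub_customer (customer_hk),
      load_dts      TIMESTAMP NOT NULL,
      customer_name VARCHAR(100),
      customer_tier VARCHAR(10),
      record_source VARCHAR(50) NOT NULL,
      PRIMARY KEY (customer_hk, load_dts)
    );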
Environment: SQL Server, Erwin 9.1, Oracle, Informatica, RDS, Big Data, JDBC, NoSQL, Star Schema, Snowflake Schema, Python, MySQL, PostgreSQL.
Confidential, Redmond, WA
Sr. Data Modeler /Analyst
Responsibilities:
- Involved in the projects from requirement analysis to better understand the requirements and support the development team with a better understanding of the data.
- Developed Data Mapping, Data Governance, Transformation and Cleansing rules for the Master Data Management Architecture involving OLTP, ODS and OLAP.
- Involved in Data Architecture, Data profiling, Data analysis, data mapping and Data architecture artifacts design.
- Responsible for Relational data modeling (OLTP) using MS Visio (Logical, Physical and Conceptual).
- Analyzed the data and provided resolutions by writing analytical/complex SQL in case of data discrepancies.
- Involved in logical and physical database design and development, normalization, and data modeling using Erwin and SQL Server Enterprise Manager.
- Prepared ETL technical Mapping Documents along with test cases for each Mapping for future developments to maintain Software Development Life Cycle (SDLC).
- Designed OLTP system environment and maintained documentation of Metadata.
- Worked on Amazon Redshift and AWS, architecting a solution to load data and create data models.
- Created the dimensional model for the reporting system by identifying the required dimensions and facts.
- Used Reverse Engineering to connect to the existing database and create a graphical representation (E-R diagram).
- Used the Erwin modeling tool for publishing a data dictionary, reviewing the model and dictionary with subject matter experts, and generating data definition language.
- Coordinated with the DBA in implementing database changes and updating data models with changes implemented in development, QA, and production.
- Created and executed test scripts, cases, and scenarios to determine optimal system performance according to specifications.
- Worked extensively with the DBA and reporting team to improve report performance with the use of appropriate indexes and partitioning.
- Extensive experience in PL/SQL programming: stored procedures, functions, packages, and triggers.
- Data modeling in Erwin; design of target data models for the enterprise data warehouse (Teradata).
- Designed and developed Oracle PL/SQL procedures and Linux/UNIX shell scripts for data import/export and data conversions (a minimal PL/SQL sketch follows this list).
- Experienced with BI Reporting in Design and Development of Queries, Reports, Workbooks, Business Explorer Analyzer, Query Builder, Web Reporting.
- Generated various reports using SQL Server Reporting Services (SSRS) for business analysts and the management team.
- Automated and scheduled recurring reporting processes using UNIX shell scripting and Teradata utilities such as MultiLoad, BTEQ, and FastLoad.
- Participated in all phases including Analysis, Design, Coding, Testing and Documentation.
- Gathered and translated business requirements into detailed, production-level technical specifications, new features, and enhancements to existing technical business functionality.
- Involved in Data flow analysis, Data modeling, Physical database design, forms design and development, data conversion, performance analysis and tuning.
- Created and maintained data model standards, including master data management (MDM), and was involved in extracting the data from various sources like Oracle, SQL, Teradata, and XML.
- Worked with medical claim data in the Oracle database for Inpatient/Outpatient data validation, trend and comparative analysis.
- Used Load utilities (Fast Load & Multi Load) with the mainframe interface to load the data into Teradata.
- Optimized and updated UML Models (Visio) and Relational Data Models for various applications.
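Minimal PL/SQL sketch of the kind of conversion/load procedure described above. The procedure and tables (load_member_dim, member_stg, member_dim) are hypothetical placeholders, not objects from the engagement.

    -- Illustrative staging-to-dimension conversion using MERGE
    CREATE OR REPLACE PROCEDURE load_member_dim AS
    BEGIN
      MERGE INTO member_dim tgt
      USING (SELECT member_id,
                    member_name,
                    TRUNC(birth_dt) AS birth_dt
               FROM member_stg) src
         ON (tgt.member_id = src.member_id)
       WHEN MATCHED THEN
         UPDATE SET tgt.member_name = src.member_name,
                    tgt.birth_dt    = src.birth_dt
       WHEN NOT MATCHED THEN
         INSERT (member_id, member_name, birth_dt)
         VALUES (src.member_id, src.member_name, src.birth_dt);
      COMMIT;
    END load_member_dim;
    /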
Environment: Erwin 9.0, Oracle 11g, SQL Server 2010, Teradata 14, XML, OLTP, PL/SQL, Linux, UNIX, MLoad, BTEQ, UNIX shell scripting.
Confidential, Washington DC
Data Analyst / Modeler
Responsibilities:
- Performed Data Analysis, Data Migration, and data profiling using complex SQL on various source systems including Oracle and Teradata.
- Created logical and physical database models to design the OLTP system for applications using Erwin.
- Used forward engineering to create a physical data model with DDL that best suited the requirements from the logical data model, using Erwin for effective model management: sharing, dividing, and reusing model information.
- Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
- Translated business requirements into working logical and physical data models for Data Warehouse, Data marts and OLAP applications.
- Involved in using the ETL tool Informatica to populate the database and perform data transformation from the old database to the new database using Oracle.
- Identified the entities and relationship between the entities to develop Conceptual Model using ERWIN.
- Involved in the creation and maintenance of the Data Warehouse and repositories containing metadata.
- Wrote and executed unit, system, integration, and UAT scripts in Data Warehouse projects.
- Extensively used SQL, Transact SQL and PL/SQL to write stored procedures, functions, packages and triggers.
- Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, Data Warehouse, and data mart reporting systems in accordance with requirements (a sample reconciliation query follows this list).
- Excellent experience and knowledge of Data Warehouse concepts and dimensional data modeling using the Ralph Kimball methodology.
- Developed separate test cases for ETL process (Inbound & Outbound) and reporting.
- Designed Star and Snowflake data models for the Enterprise Data Warehouse using Erwin.
- Created and maintained the Logical Data Model (LDM) for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
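Sample reconciliation query, as referenced above: a minimal SQL sketch comparing source and warehouse row counts by load date. The schemas and tables (stg.orders, dw.fact_orders, load_dt) are illustrative assumptions.

    -- Illustrative source-to-warehouse row-count reconciliation per load date
    SELECT s.load_dt,
           s.src_rows,
           t.dw_rows,
           s.src_rows - t.dw_rows AS diff
      FROM (SELECT load_dt, COUNT(*) AS src_rows
              FROM stg.orders GROUP BY load_dt) s
      LEFT JOIN (SELECT load_dt, COUNT(*) AS dw_rows
                   FROM dw.fact_orders GROUP BY load_dt) t
        ON s.load_dt = t.load_dt
     WHERE t.dw_rows IS NULL
        OR s.src_rows <> t.dw_rows;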
Environment: Oracle, MS Visio, PL/SQL, Microsoft SQL Server 2000, Rational Rose, Data Warehouse, OLTP, OLAP, Erwin, Informatica 9.x, Windows, SQL, SQL Server, Talend Data Quality, Talend Integration Suite 4.x, Oracle 9i, Flat Files, SVN.