Sr. Data Architect/Data Modeler Resume
Tallahassee, FL
SUMMARY
- 8+ years of experience as a Sr. Data Architect/Data Modeler and Data Analyst, with emphasis on data mapping and data validation in data warehousing environments.
- Expertise in Software Development Life Cycle (SDLC) implementations, covering process planning; unit, integration, and regression testing; and system maintenance.
- Good understanding of and hands-on experience with Agile and Waterfall environments.
- Developed data pipelines using Sqoop, Pig, and MapReduce to ingest workforce data into HDFS for analysis.
- Data ingress and egress using Azure Data Factory, moving data from HDFS to relational database systems and vice versa.
- Excellent experience with AWS cloud services (Amazon Redshift and AWS Data Pipeline).
- Experience in developing Entity-Relationship (ER) diagrams and modeling transactional databases and data warehouses using tools such as Erwin, ER/Studio, and Power Designer.
- Strong database experience with Oracle, Teradata, big data platforms, and NoSQL.
- Excellent understanding of hub architecture styles for MDM hubs: the registry, repository, and hybrid approaches.
- Strong experience architecting high-performance databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
- Responsible for developing and maintaining formal descriptions of data and data structures, including data definitions, conceptual/logical/physical data models, data dictionaries, and data flow diagrams.
- Excellent experience writing SQL queries to validate data movement between layers in a data warehouse environment (a minimal validation sketch follows this summary).
- Extensive knowledge of big data, Hadoop, MapReduce, Hive, NoSQL databases, and other emerging technologies.
- Expertise in developing transactional enterprise data models that strictly meet normalization rules, as well as enterprise data warehouses using the Kimball and Inmon data warehouse methodologies.
- Experience in designing star and snowflake schemas for data warehouse and ODS architectures.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems (RDBMS).
- Experience in generating and documenting metadata while designing OLTP and OLAP system environments.
- Experience in developing MapReduce programs on Apache Hadoop to analyze big data as per requirements.
- Excellent experience developing stored procedures, triggers, functions, packages, views, and inner/outer joins using T-SQL and PL/SQL.
- Excellent knowledge of data analysis, data validation, data cleansing, data verification, and identifying data mismatches.
- Experience designing security at both the schema and access levels in conjunction with DBAs.
- Excellent experience troubleshooting test scripts, SQL queries, ETL jobs, and data warehouse/data mart/data store models.
- Excellent experience in data mining: querying and mining large datasets to discover transition patterns and examine financial data.
- Strong experience using Excel and MS Access to extract and analyze data based on business needs.
- Exceptional communication and presentation skills with an established track record of client interaction.
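The cross-layer SQL validation bullet above is illustrated with a minimal, hypothetical sketch; the ODBC DSN and table names are placeholders, not details from any engagement.

```python
# Hypothetical layer-to-layer row-count validation (DSN and table names are placeholders).
import pyodbc

CHECKS = [
    ("stg.customer", "dw.dim_customer"),  # staging layer vs. warehouse layer
    ("stg.orders", "dw.fact_orders"),
]

def row_count(cursor, table):
    # COUNT(*) keeps the check portable across the SQL engines named in this resume.
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]

conn = pyodbc.connect("DSN=warehouse_dsn")  # assumed ODBC data source
cur = conn.cursor()
for source, target in CHECKS:
    src, tgt = row_count(cur, source), row_count(cur, target)
    status = "OK" if src == tgt else "MISMATCH"
    print(f"{source} -> {target}: {src} vs {tgt} [{status}]")
conn.close()
```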
TECHNICAL SKILLS
Data Modeling Tools: Erwin 9.8/9.7, ER/Studio V17, Power Designer
Big Data Tools: Hadoop 3.0, HDFS, Hive 2.3, Kafka 1.1, Scala, Oozie 4.3, Pig 0.17, HBase 1.2, Sqoop 1.4, AWS
Cloud Services: AWS, Amazon Redshift, Azure SQL, Azure Synapse, Azure Data Lake, Azure Data Factory, and GCP
Programming Languages: SQL, T-SQL, PL/SQL, UNIX shell scripting
Project Execution Methodologies: JAD, Agile, SDLC, Waterfall, and RAD
Database Tools: Oracle 12c/11g, Teradata 15/14, MDM
Reporting Tools: SQL Server Reporting Services (SSRS), Tableau, Crystal Reports, MicroStrategy, Business Objects
ETL Tools: SSIS, Informatica v10
Operating Systems: Microsoft Windows 10/8/7, UNIX
PROFESSIONAL EXPERIENCE
Confidential - Tallahassee, FL
Sr. Data Architect/Data Modeler
Responsibilities:
- Heavily involved in the Sr. Data Architect/Data Modeler role, reviewing business requirements and composing source-to-target data mapping documents.
- Designed and developed the architecture for a data services ecosystem spanning relational, NoSQL, and big data technologies.
- Involved in all phases of the Software Development Life Cycle (SDLC) throughout the project.
- Participated in JAD sessions for design optimizations related to data structures as well as ETL processes.
- Designed and developed data collectors and parsers using Python.
- Built data pipelines using Sqoop to import customer behavioral data and historical utility data from sources such as Oracle into HDFS.
- Worked with Azure Data Factory, Azure Data Lake, Azure Databricks, Cosmos DB, Azure Synapse Analytics, Azure SQL, and Azure SQL Data Warehouse.
- Worked with DBAs to create a best-fit physical data model from the logical data model using forward engineering in Erwin.
- Designed, created and maintained QlikView applications
- Analyzed SQL scripts and designed solutions to implement them using PySpark.
- Worked with big data technologies: Hive SQL, Sqoop, Hadoop, and MapReduce.
- Developed data marts for the base data in star and snowflake schemas.
- Involved in all steps and the full scope of the project's reference data approach to MDM.
- Worked with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (SQL DW).
- Utilized Oozie workflows to run Pig and Hive jobs; extracted files from MongoDB through Sqoop and placed them in HDFS.
- Worked with data governance, data quality, data lineage, and data architecture teams to design various models and processes.
- Worked on Azure, architecting a solution to load data, create data models, and run BI on top of it.
- Implemented Spark SQL to update queries based on business requirements.
- Designed both 3NF data models for ODS/OLTP systems and dimensional data models.
- Worked as an architect and built data marts using hybrid Inmon and Kimball DW methodologies.
- Developed triggers, stored procedures, functions, and packages in PL/SQL, using cursor and ref cursor concepts for the project.
- Extracted, transformed, and loaded (ETL) data from source systems to generate CSV data files using Python and SQL queries.
- Built a real-time pipeline for streaming data using Kafka and Spark Streaming (see the sketch after this entry).
- Extensively worked on creating role-playing dimensions and fact tables.
- Created Hive tables on top of HBase using the HBase storage handler for effective OLAP analysis.
- Worked with Azure SQL Data Warehouse and scheduled 'copy data' loads of on-premises data.
- Handled importing of data from various data sources and performed transformations using Hive and MapReduce.
- Developed data pipelines and datasets in Azure Data Factory.
- Loaded data into HDFS, extracting the data from Oracle with Sqoop.
- Designed and built relational database models and defined data requirements to meet business requirements.
Environment: Erwin 9.8, Python, SQL, Oracle 12c, Azure Data Lake, Azure Databricks, Cosmos DB, Azure Synapse Analytics, Azure SQL, Big Data 3.0, NoSQL, DBA, QlikView, Oozie 4.3, MongoDB, Kafka 1.1, PySpark, Hive 2.3, Sqoop 1.4, Hadoop 3.0, MapReduce, MDM, Azure Blob, Spark SQL, BI, ETL, PL/SQL, OLAP, ODS, OLTP, HDFS.
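The Kafka/Spark Streaming bullet above is illustrated with a minimal PySpark Structured Streaming sketch; the broker address, topic, and HDFS paths are assumed placeholders rather than project values.

```python
# Minimal PySpark Structured Streaming sketch: Kafka topic -> HDFS (Parquet).
# Broker, topic, and paths are placeholders, not values from the original project.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")  # assumed broker
          .option("subscribe", "utility-events")              # assumed topic
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp"))

query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/streaming/utility")    # assumed landing path
         .option("checkpointLocation", "hdfs:///chk/utility")  # required for fault tolerance
         .start())

query.awaitTermination()
```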
Confidential - Piscataway, NJ
Data Architect/Data Modeler
Responsibilities:
- Monitored and measured data architecture processes and standards to ensure value is being driven and delivered as expected.
- Worked on reporting requirements and generated reports against the data model using Crystal Reports.
- Used an Agile methodology for data warehouse development, managed with Kanbanize.
- Built data pipelines to move on-premises data using the data management gateway and copy data activities, and handled deployment.
- Developed Python scripts to automate and provide control flow for Pig scripts (see the sketch after this entry).
- Implemented logical and physical data models for the data mart using star and snowflake techniques in Erwin.
- Installed and configured Hadoop; responsible for maintaining the cluster and managing and reviewing Hadoop log files.
- Created SQL tables with referential integrity constraints and developed queries using SQL and PL/SQL.
- Created the Hive architecture used for real-time monitoring and the HBase layer used for reporting.
- Implemented data archiving strategies to handle large data volumes by moving inactive data to an easily accessible secondary storage location.
- Worked on Google Cloud Platform (GCP) services such as Compute Engine, Cloud Load Balancing, Cloud Storage, Cloud SQL, Stackdriver Monitoring, and Cloud Deployment Manager.
- Created a data dictionary and source-to-target mappings for the MDM data model.
- Loaded real-time data from various data sources into HDFS using Kafka.
- Designed both 3NF data models for ODS/OLTP systems and dimensional data models using star and snowflake schemas.
- Designed a metadata repository to store data definitions for entities, attributes, and mappings between data warehouse and source system data elements.
- Performed reverse engineering of the legacy application from DDL scripts in Erwin.
- Created Hive queries that helped analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Created several types of data visualizations using Python and Tableau.
- Coordinated with DBAs on database builds and table normalization and denormalization.
- Processed data using HQL (a SQL-like language) on top of MapReduce.
- Worked on importing and cleansing high-volume data from various sources such as Oracle and flat files.
- Worked with data governance and data quality teams to design various models and processes.
- Used the Oozie workflow scheduler to manage Hadoop jobs with control flows.
- Worked on a POC to evaluate various cloud offerings, including Google Cloud Platform (GCP).
- Worked on Google Cloud Platform services such as the Vision API and Compute Engine instances.
- Tested messages published by the ETL tool and data loaded into various databases.
Environment: Erwin 9.8, Agile, Python, SQL, PL/SQL, Hadoop 3.0, Oozie 4.3, GCP, ETL, Tableau, Oracle 12c, ODS, OLTP, Kafka 1.1, HDFS, Pig 0.17, HBase 1.2, Hive 2.3, and MDM.
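A minimal sketch of the Python control-flow wrapper for Pig scripts mentioned above; script names and parameters are illustrative only.

```python
# Illustrative control-flow wrapper: run Pig scripts in order and stop on the first failure.
# Script names and parameters are placeholders.
import subprocess
import sys

PIG_JOBS = [
    ("parse_raw.pig", {"input_dir": "/raw/2018-01-01"}),
    ("load_staging.pig", {"run_date": "2018-01-01"}),
]

def run_pig(script, params):
    cmd = ["pig"]
    for key, value in params.items():
        cmd += ["-param", f"{key}={value}"]  # standard Pig parameter substitution
    cmd += ["-f", script]
    return subprocess.run(cmd).returncode

for script, params in PIG_JOBS:
    if run_pig(script, params) != 0:
        sys.exit(f"Pig job failed: {script}")
```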
Confidential - Juno Beach, FL
Data Modeler/Data Architect
Responsibilities:
- Prepared data models from existing archived data to support the reporting efforts. Assisted business analysts in mapping the data from source to target
- Implemented a proof of concept deploying this product in Amazon Web Services.
- Implemented logical and physical relational database designs and maintained database objects in the data model using ER/Studio.
- Designed the new Teradata data warehouse with star and snowflake schemas for the most efficient use of MicroStrategy and Teradata resources.
- Used normalization and denormalization techniques for effective performance in OLTP and OLAP systems.
- Developed SQL scripts for creating tables, sequences, triggers, views, and materialized views.
- Designed and developed PL/SQL procedures, functions, and packages to create summary tables.
- Created scripts for importing data into HDFS/Hive using Sqoop.
- Developed normalized logical and physical database models for designing an OLTP application.
- Reverse engineered the data model from database instances and scripts.
- Designed the Redshift data model and performed Redshift performance analysis and improvements.
- Created rich dashboards using Tableau Desktop and prepared user stories for compelling dashboards delivering actionable insights.
- Extracted data from AWS Redshift into HDFS using Sqoop (see the sketch after this entry).
- Performed data analysis at the source level and determined key attributes for designing fact and dimension tables in a star schema for an effective data warehouse and data mart.
- Developed Pig scripts to parse the raw data, populate staging tables and store the refined data.
- Worked with data compliance and data governance teams to maintain data models, metadata, and data dictionaries, and to define source fields and their definitions.
- Created mapping tables to find out the missing attributes for the ETL process.
- Worked closely with SSIS and SSRS developers to explain complex data transformation logic.
- Developed various T-SQL stored procedures, triggers, and views, and added/altered tables for data load, transformation, and extraction.
- Created ad-hoc reports for users in Tableau by connecting various data sources.
Environment: ER/Studio, Teradata, OLAP, OLTP, SQL, PL/SQL, HDFS, Hive, Sqoop, AWS, Pig, ETL, SSIS, SSRS, T-SQL, Tableau.
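A hedged sketch of the Redshift-to-HDFS extract noted above, expressed as a thin Python wrapper around the Sqoop CLI; the endpoint, credentials file, driver class, and table name are assumptions, not project details.

```python
# Illustrative wrapper around the Sqoop CLI for a Redshift -> HDFS extract.
# Endpoint, credentials file, driver class, and table name are placeholders.
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:redshift://example-cluster:5439/analytics",  # assumed endpoint
    "--driver", "com.amazon.redshift.jdbc.Driver",  # verify against the installed JDBC jar
    "--username", "etl_user",
    "--password-file", "/user/etl/.redshift.pwd",   # keeps the password off the command line
    "--table", "fact_sales",
    "--target-dir", "/data/redshift/fact_sales",
    "--num-mappers", "4",
]
subprocess.run(cmd, check=True)  # raises CalledProcessError if the import fails
```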
Confidential - San Jose, CA
Data Analyst/Data Modeler
Responsibilities:
- Worked on data analysis, data profiling, data modeling, data mapping.
- Involved in migrating the data model from another database platform to Teradata.
- Created physical and logical data models using Power Designer.
- Used Python scripts to update database content and manipulate files.
- Developed and implemented data cleansing, data security, data profiling, and data monitoring processes (see the sketch after this entry).
- Developed dashboards using Tableau Desktop.
- Generated DDL scripts for database modifications, views, and SET tables.
- Created SSIS packages using the Pivot transformation, Execute SQL task, Data Flow task, etc., to import data into the data warehouse.
- Processed and generated DDL and created the tables and views in the corresponding architectural layers.
- Developed triggers, stored procedures, functions, and packages in PL/SQL, using cursor and ref cursor concepts for the project.
- Created customized reports using OLAP tools such as Crystal Reports for business use.
- Worked with Teradata and its utilities (TPump, FastLoad) through Informatica.
- Developed SQL scripts for creating tables, sequences, triggers, views, and materialized views.
- Completed data quality management using Information Steward and performed extensive data profiling.
- Developed and maintained a data dictionary to create metadata reports for technical and business purposes.
- Enforced referential integrity in the OLTP data model for consistent relationship between tables and efficient database design.
- Created reports using SQL Server Reporting Services (SSRS) for customized and ad-hoc queries.
- Involved in user training sessions and assisting in UAT (User Acceptance Testing).
- Loaded multi-format data from various sources such as flat files, Excel, and MS Access, and performed file system operations.
- Performed literature searches and ad-hoc data collection based on requests.
Environment: Power Designer, Python, Teradata, SQL, PL/SQL, OLAP, OLTP, SSIS, Tableau, SSRS, MS Excel.
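The cleansing and profiling bullet above is illustrated with a small pandas sketch; the file and column names are hypothetical.

```python
# Small pandas sketch of a profiling and cleansing pass; file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("customer_extract.csv")  # assumed flat-file extract

# Profiling: null counts and duplicate keys before cleansing.
print(df.isnull().sum())
print("duplicate ids:", df["customer_id"].duplicated().sum())

# Cleansing: trim and standardize text, drop exact duplicates, remove rows missing the key.
df["customer_name"] = df["customer_name"].str.strip().str.title()
df = df.drop_duplicates()
df = df[df["customer_id"].notna()]

df.to_csv("customer_clean.csv", index=False)
```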
Confidential - Ashburn, VA
Data Analyst
Responsibilities:
- Performed data validation and data reconciliation between disparate source and target systems for various projects (see the sketch after this entry).
- Worked with data investigation, discovery, and mapping tools to scan every data record from many sources.
- Created customized reports using OLAP tools such as Crystal Reports for business use.
- Created or modified the T-SQL queries as per the business requirements.
- Generated various reports using SQL Server Report Services (SSRS) for business analysts and the management team.
- Wrote complex SQL queries to validate data against various reports generated by Business Objects.
- Created columnstore indexes on dimension and fact tables in the OLTP database to improve read performance.
- Involved in extensive data validation, writing several complex SQL queries.
- Developed regression test scripts for the application.
- Worked closely with SSIS developers to explain complex data transformation logic.
- Managed the timely flow of business intelligence information to users.
- Migrated critical reports using PL/SQL packages and UNIX scripts.
- Created and scheduled the job sequences by checking job dependencies.
- Involved in metrics gathering, analysis, and reporting to the relevant teams, and validated the test programs.
- Used advanced Microsoft Excel to create pivot tables.
- Developed reusable components in Informatica and UNIX.
- Created ad-hoc reports for users in Tableau by connecting various data sources.
Environment: OLAP, T-SQL, SSIS, SSRS, Excel, OLTP, Informatica, Tableau.
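A hedged sketch of the source-to-target reconciliation described in the first bullet of this entry; the extract files, join key, and compared measure are placeholders.

```python
# Hypothetical source-to-target reconciliation on a shared business key.
# Extract files, the join key, and the compared measure are placeholders.
import pandas as pd

source = pd.read_csv("source_orders.csv")
target = pd.read_csv("target_orders.csv")

merged = source.merge(target, on="order_id", how="outer",
                      suffixes=("_src", "_tgt"), indicator=True)

missing_in_target = merged[merged["_merge"] == "left_only"]
missing_in_source = merged[merged["_merge"] == "right_only"]
amount_mismatch = merged[(merged["_merge"] == "both") &
                         (merged["amount_src"] != merged["amount_tgt"])]

print(len(missing_in_target), "rows missing in target")
print(len(missing_in_source), "rows missing in source")
print(len(amount_mismatch), "rows with amount differences")
```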