Sr. Data Architect/Data Modeler Resume
Merrimack, NH
SUMMARY:
- Over 9 years of strong IT experience in Data Architecture, Data Modeling, and Big Data Reporting Design and Development.
- Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export
- Strong experience in using Excel and MS Access to extract data and analyze it based on business needs.
- Experience in analyzing data using Hadoop Ecosystem including HDFS, Hive, Spark, Spark Streaming, Elastic Search, Kibana, Kafka, HBase, Zookeeper, PIG, Sqoop, and Flume.
- Experience in designing Data Marts with dimensional data modeling using star and snowflake schemas.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Management Systems (RDBMS) and from RDBMS to HDFS.
- Well versed in Normalization/De-normalization techniques for optimum performance in relational and dimensional database environments.
- Experience in developing MapReduce Programs using Apache Hadoop for analyzing the big data as per the requirement.
- Extensive experience in using ER modeling tools such as Erwin and ER/Studio, along with Teradata, MLDM, and MDM.
- Responsible for detailed architectural design, data wrangling, and data profiling to ensure the quality of vendor data, as well as source-to-target mapping.
- Extensive experience in using ETL and reporting tools such as SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS).
- Proficient knowledge in metadata design, real time BI Architecture including Data Governance for greater ROI.
- Experience in designing, building, and implementing a complete Hadoop ecosystem comprising MapReduce, HDFS, Hive, Pig, Sqoop, Oozie, Cassandra, HBase, and MongoDB.
- Experience in Teradata SQL queries, Teradata indexes, and utilities such as MultiLoad (MLoad), TPump, FastLoad, and FastExport.
- Excellent knowledge in Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
- Experience in analyzing raw data from internal, external, and demographic sources and applying data mining techniques.
- Proficient knowledge in designing security at both the schema level and the accessibility level in conjunction with the DBAs
- Experience with Object Oriented Analysis and Design (OOAD) using UML, Rational Unified Process (RUP), Rational Rose and MS Visio.
- Experience in designing error and exception handling procedures to identify, record and report errors.
- Expertise in creating DDL scripts for implementing Data Modeling changes.
- Heavy use of Access queries, VLOOKUP, formulas, and Pivot Tables; working knowledge of CRM automation with Salesforce.com and SAP.
- Experience in creating data-driven communication materials for key internal/external audiences.
- Strong problem-solving and analytical skills, with an exceptional ability to learn and master new technologies efficiently.
PROFESSIONAL EXPERIENCE:
Confidential - Merrimack, NH
Sr. Data Architect/Data Modeler
Responsibilities:
- Worked in a Data Architect/Data Modeler role that involved Data Modeling, ETL Architecture, and Oracle DBA work.
- Involved in entire software development life cycle (SDLC) including analysis, design, development and testing of software applications
- Analyzed the underlying databases, identified the gaps, and conducted JAD sessions with the SMEs to meet the users' business requirements.
- Worked with multiple Microsoft SMEs to define requirements for an Azure POC, including the project plan and execution with Agile/Waterfall approaches.
- Used Big Data tools such as MapReduce, HDFS, Hive SQL, Hive PL/SQL, Impala, Pig, and Sqoop.
- Developed Teradata SQL scripts using OLAP functions such as RANK and RANK() OVER to improve query performance while pulling data from large tables.
- Created Hive queries that helped market analysts spot emerging trends by comparing incremental data with Teradata reference tables and historical metrics.
- Designed the Logical Data Model using Erwin with the entities and attributes for each subject area.
- Generated reports using Global Variables, Expressions and Functions using SSIS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Developed enhancements to the MongoDB architecture to improve performance and scalability.
- Applied Normalization/De-normalization techniques for optimum performance in relational and dimensional database environments.
- Developed a long-term data warehouse roadmap and architectures, and designed and built the data warehouse framework per the roadmap.
- Implemented join optimizations in Apache Pig using skewed and merge joins for large datasets.
- Used SSRS to create reports, customized reports, and on-demand reports, and was involved in analyzing multi-dimensional reports in SSRS.
- Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data.
- Implemented strong referential integrity and auditing by the use of triggers and SQL Scripts.
- Focused on architecting NoSQL databases such as MongoDB, Cassandra, and Cache.
- Created and maintained SAS Datasets that are extracted from an Oracle Database.
- Performed data analysis, data migration, and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Reviewed Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Worked with Architecture team to get the metadata approved for the new data elements that are added for this project.
- Performed reverse engineering of physical data models from databases and SQL scripts.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving ODS.
- Developed data marts for the base data in star and snowflake schemas, and was involved in developing the data warehouse for the database.
- Developed and implemented data cleansing, data security, data profiling and data monitoring processes.
- Established uniform Master Data Dictionary and Mapping rules for metadata, data mapping and lineage.
- Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive and Informatica.
- Developed and automated multiple departmental Reports using Tableau Software and MS Excel.
- Developed and implemented different Pig UDFs to write ad-hoc and scheduled reports as required by the Business team.
Environment: Erwin 9.8, Oracle 12c, Teradata r15, Hadoop 3.0, HDFS, Hive 2.3, Apache Pig 0.17, MDM, MapReduce, ETL, Informatica, Tableau 10.5, ODS, SQL, PL/SQL, OLTP, OLAP, Agile, Sqoop 1.4, SSIS, SSRS, MongoDB 3.6, SAS.
Confidential - Harrisburg, PA
Data Architect/Data Modeler
Responsibilities:
- Documented data architecture, data modeling, and ETL specifications for the data warehouse using Erwin.
- Implemented data models and data on AWS Redshift and RDS.
- Worked with multiple project teams of technical professionals through all phases of the SDLC using technologies including Oracle.
- Created complex stored procedures, functions, triggers, indexes, tables, views, and SQL joins for applications.
- Worked on Normalization and De-Normalization techniques for both OLTP and OLAP systems.
- Generated periodic reports based on the statistical analysis of the data using SQL Server Reporting Services (SSRS).
- Developed Apache Hive and MapReduce tools to design and manage HDFS data blocks and data distribution methods.
- Worked with the Ralph Kimball and Bill Inmon methodologies (Star Schema, Snowflake Schema).
- Analyzed functional and non-functional categorized data elements for data profiling and mapping from the source to the target data environment.
- Wrote Oracle PL/SQL stored procedures, functions, packages, and triggers to implement business rules in the application.
- Provided technical guidance for re-engineering functions of Teradata warehouse operations into Netezza.
- Worked on importing and exporting data from DB2 into HDFS and Apache Hive using Sqoop.
- Created new database objects like Procedures, Functions, Packages, Triggers, Indexes and Views using T-SQL in SQL Server.
- Used Informatica mapping parameters, variables and different tasks such as command task, decision task and timer.
- Established uniform Master Data Dictionary and Mapping rules for metadata, data mapping and lineage.
- Worked on a POC to compare the processing time of Impala with that of Apache Hive for batch applications, in order to implement the former in the project.
- Worked with the Data Vault methodology and developed normalized logical and physical database models.
- Created a list of domains in Erwin and worked on building up the data dictionary for the company.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Used external loaders such as MultiLoad, TPump, and FastLoad to load data into the Teradata database.
- Worked with Data Steward Team for designing, documenting and configuring Informatica Data Director for supporting management of MDM data.
- Loaded multi-format data from various sources such as flat files, Excel, MS Access, and Oracle using Netezza, and performed file system operations.
Environment: Erwin r9.7, Oracle 12c, Teradata r15, MDM, Netezza, HDFS, Apache Hive 2.3, AWS, OLAP, OLTP, MapReduce, SSRS, SQL, PL/SQL, Sqoop 1.4.
Confidential - Washington, DC
Data Modeler
Responsibilities:
- Gathered and translated business requirements into detailed, production-level technical specifications, new features, and enhancements to existing technical business functionality.
- Translated conceptual data models into logical data models, held JAD sessions, and communicated data-related issues and standards.
- Built a Data warehouse using Bill Inmon's approach in order to support reporting needs.
- Developed Source to Target mapping documents by analyzing data content and physical data structures.
- Created DDL scripts for static dimensions and data model for the star schema using Erwin.
- Created a Data Design document, including data specifications, for use by the development team.
- Performed extensive data validation by writing several complex SQL queries, took part in back-end testing, and worked on data quality issues.
- Worked on the reporting requirements and generated reports for the data model using Crystal Reports.
- Worked on Data governance, data quality, data lineage establishment processes.
- Worked extensively with SSIS to load data from source systems, with loads run at periodic intervals.
- Worked with data transformations in both normalized and de-normalized data environments
- Implemented a snowflake schema to minimize redundancy in the database.
- Prepared and maintained database architecture and modeling policies, procedures and standards.
- Developed the Data warehouse model (Kimball's) with several data marts and conformed dimensions for the proposed model in the Project.
- Gathered and documented the Audit trail and traceability of extracted information for data quality.
- Created 3NF business-area data models with de-normalized physical implementation, and performed data and information requirements analysis.
- Created PL/SQL packages and Database Triggers and developed user procedures and prepared user manuals for the new programs.
- Developed Data mapping, Transformation and Cleansing rules for the Data Management involving OLAP and OLTP
- Developed SQL Queries to fetch complex data from different tables in remote databases using joins, database links and bulk collects.
- Analyzed, retrieved and aggregated data from multiple datasets to perform data mapping.
- Created E/R diagrams and data flow diagrams, grouped and created the tables, and validated the data for lookup tables.
Environment: Erwin 9.6, OLAP, OLTP, SQL, SSIS, DDL, Crystal Reports 2015, PL/SQL, Triggers, 3NF, Data Mart.
Confidential - New York, NY
Data Modeler/Data Analyst
Responsibilities:
- Connected to Redshift through Tableau to extract live data for real time analysis.
- Identified and documented data sources and transformation rules required to populate and maintain data warehouse content.
- Generated DDL scripts extensively and made them available to the DBA for execution.
- Trained on the Spotfire tool and guided a couple of colleagues in creating Spotfire visualizations.
- Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.
- Developed data marts for the base data in star and snowflake schemas, and was involved in developing the data warehouse for the database.
- Analyzed the physical data model to understand the relationship between existing tables.
- Used ER/Studio extensively as the main modeling tool, along with MS Visio.
- Designed data process flows using Informatica to source data into Statements database on Oracle platform.
- Involved in Data profiling in order to detect and correct inaccurate data and maintain the data quality.
- Normalized the existing OLTP systems to 3NF to speed up DML statement execution.
- Performed data cleaning and data manipulation activities using NZSQL utility.
- Developed PL/SQL triggers, stored procedures, functions, and packages for the project using cursors and ref cursors.
- Worked on creating filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
- Designed and developed the universe and mapped it to SAP report fields.
- Designed a star schema for sales data involving shared (conformed) dimensions for other subject areas using ER/Studio.
- Used Informatica and SAS to extract, transform, and load source data from transaction systems.
- Generated various reports using SQL Server Reporting Services (SSRS) for business analysts and the management team.
- Designed and Developed Use Cases, Activity Diagrams, and Sequence Diagrams using Unified Modeling Language (UML)
- Created a Data Mapping document after each assignment and wrote the transformation rules for each field as applicable
- Worked on Unit Testing for three reports and created SQL Test Scripts for each report as required
Environment: ER/Studio 10.2, SAS, SSIS Vs 2014, SSRS, PL/SQL, UML, Informatica, Tableau 8.2, NZSQL, OLTP, 3NF, DDL.
Confidential
Data Analyst
Responsibilities:
- Designed and implemented business intelligence to support sales and operations functions to increase customer satisfaction.
- Involved in data analysis and the reduction of data discrepancies in the source and target schemas.
- Developed complex PL/SQL procedures and packages using views and SQL joins.
- Developed reports using different SSRS functionalities such as sort prompts, cascading parameters, and multi-value parameters.
- Conducted detailed analysis of data issues, mapped data from source to target, and performed design and data cleansing on the Data Warehouse.
- Involved in identifying the Data requirements and creating Data Dictionary for the functionalities
- Analyzed and built proofs of concept to convert SAS reports into Tableau or to use SAS datasets in Tableau.
- Created or modified T-SQL queries as per the business requirements.
- Developed and optimized stored procedures for use as a data window source for complex reporting purposes.
- Performed batch processing of data and designed the SQL scripts, control files, and batch files for data loading.
- Coordinated with data stewards and data owners to discuss source data quality issues and resolve them based on the findings.
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
- Worked on SQL Server concepts SSIS (SQL Server Integration Services), SSAS (Analysis Services) and SSRS (Reporting Services).
- Developed database objects including tables, Indexes, views, sequences, packages, triggers and procedures to troubleshoot any database problems
- Worked on different data formats such as Flat files, SQL files, Databases, XML schema, CSV files
- Involved in designing Parameterized Reports for generating ad-hoc reports as per the business requirements
Environment: SAS, Tableau 8.1, Ad-hoc, SQL, T-SQL, Flat Files, SSIS Vs 2013, SSRS, SSAS, SML, Business Intelligence.
TECHNICAL SKILLS:
Data Modeling: Erwin 9.7, Toad, ER/Studio 9.7, Star-Schema Modeling, Snowflake-Schema Modeling, Fact and Dimension tables, Pivot Tables
Big Data Tools: Apache Hadoop 3.0, MapReduce, Sqoop 1.4, Pig 0.15, Hive 2.3, NoSQL, Cassandra 3.11, MongoDB 3.6, Spark 2.2, HBase 1.2, and Scala 2.1
Languages: PL/SQL, T-SQL, Unix Shell scripting, XML
Database: Oracle 12c/11g, MS SQL Server 2016/2014, DB2, Teradata 14/15, Netezza, Cassandra.
Big Data: Hadoop 3.0, HDFS 2, Hive 2.3, Pig 0.17, HBase 1.2, Sqoop, Flume, Splunk
Testing Tools: WinRunner, LoadRunner, TestDirector, Mercury Quality Center, Rational ClearQuest
BI Tools: Tableau 7.0/8.2, Pentaho 6, SAP Business Objects, Crystal Reports
ETL/Data warehouse Tools: Informatica, SAP Business Objects XIR3.1/XIR2, Talend, Pentaho
Operating System: UNIX, Windows 8/7, Linux, Red Hat
Other Tools: TOAD, BTEQ, MS-Office suite (Word, Excel, Project and Outlook)
