Sr. Data Modeler/Data Analyst Resume
Cincinnati, OH
SUMMARY
- Over 9 years of experience in Data Analysis, Data Modeling, and Data Development, and in the implementation and maintenance of databases and software applications.
- Expert in writing and optimizing SQL queries in Oracle, DB2, SQL Server, and Teradata; performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Experience with the Big Data Hadoop ecosystem for ingestion, storage, querying, processing, and analysis of big data.
- Experience in designing Star and Snowflake schemas for Data Warehouse and ODS architectures.
- Experience in designing Conceptual, Logical, and Physical data models to build the Data Warehouse.
- Specialization in Data Modeling, Data Warehouse design, conceptual architecture, Data Integration, and Business Intelligence solutions.
- Excellent experience in creating cloud-based solutions and architectures using Amazon Web Services (Amazon EC2, Amazon S3, and Amazon RDS) and Microsoft Azure.
- Experience working with Agile and Waterfall methodologies and with the Ralph Kimball and Bill Inmon approaches.
- Good knowledge of Normalization (1NF, 2NF, and 3NF) and De-normalization techniques for optimum performance in relational and dimensional environments, including OLTP, OLAP, and Data Mart.
- Experienced in database design for development and production environments involving Oracle, SQL Server, Netezza, MySQL, DB2, MS Access, and Teradata.
- Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export using ETL tools such as Ab Initio and Informatica PowerCenter.
- Experience in testing and writing SQL and PL/SQL statements: Stored Procedures, Functions, Triggers, and Packages.
- Excellent experience using Teradata SQL Assistant, Teradata Administrator, PMON, and data load/export utilities such as BTEQ, FastLoad, MultiLoad, and FastExport, with exposure to TPump, in UNIX/Windows environments, including running batch processes for Teradata.
- Extensive experience in supporting Informatica applications and extracting data from heterogeneous sources using Informatica PowerCenter; experienced in automating and scheduling Informatica jobs using UNIX shell scripting and configuring Korn shell jobs for Informatica sessions.
- Experience in designing error and exception handling procedures to identify, record and report errors.
- Solid hands-on experience administering data model repositories and documenting metadata in portals using tools such as Erwin, ER/Studio, and PowerDesigner.
- Experience in conducting Joint Application Development (JAD) sessions with SMEs, Stakeholders and other project team members for requirements gathering and analysis.
- Software Development Life Cycle (SDLC) experience, including requirements gathering, specification analysis/design, and testing.
- Excellent knowledge of creating reports in SAP BusinessObjects, including Webi reports for multiple data providers; experienced with Excel Pivot tables and VBA macros for various business scenarios.
- Excellent experience in writing and executing unit, system, integration, and UAT scripts in data warehouse projects.
- Excellent experience in writing SQL queries to validate data movement between layers in a data warehouse environment (see the sketch following this list), and in troubleshooting test scripts, SQL queries, ETL jobs, and data warehouse/data mart/data store models.
- Extensive knowledge and experience in producing tables, reports, graphs and listings using various procedures and handling large databases to perform complex data manipulations.
- Experience in testing Business Intelligence reports generated by various BI Tools like Cognos and Business Objects.
- Excellent at creating project artifacts, including specification documents, data mapping documents, and data analysis documents.
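A minimal, self-contained sketch of the layer-to-layer validation queries described above, using Python's built-in sqlite3 so it runs anywhere; the staging/target table names are hypothetical, and in practice the same counts would run against Oracle or Teradata:

```python
import sqlite3

# Self-contained demo: in practice these queries would run against the
# source and target warehouse layers (e.g., Oracle staging vs. Teradata mart).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE stg_orders (order_id INTEGER, amount REAL)")
cur.execute("CREATE TABLE dw_orders  (order_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO stg_orders VALUES (?, ?)", [(1, 10.0), (2, 20.5)])
cur.executemany("INSERT INTO dw_orders  VALUES (?, ?)", [(1, 10.0), (2, 20.5)])

# Reconcile row counts between the two layers.
src_count = cur.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
tgt_count = cur.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0]
assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"

# Spot-check that no source rows are missing from the target.
missing = cur.execute("""
    SELECT COUNT(*) FROM stg_orders s
    WHERE NOT EXISTS (SELECT 1 FROM dw_orders d WHERE d.order_id = s.order_id)
""").fetchone()[0]
print(f"rows missing from target: {missing}")
conn.close()
```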
TECHNICAL SKILLS
Data Modeling Tools: Erwin Data Modeler 9.7/9.6/9.1, Erwin Model Manager, ER Studio v17, and Power Designer.
Programming Languages: SQL, PL/SQL, HTML5, XML and VBA.
Reporting Tools: SSRS, Power BI, Tableau, SSAS, MS-Excel, SAS BI Platform.
Big Data technologies: HBase, HDFS, Sqoop 1.4, Spark, Hadoop 3.0, Hive 2.3, Pig, MapReduce, NoSQL Databases.
Cloud Platforms: AWS (EC2, S3, Redshift) and MS Azure
OLAP Tools: Tableau 7, SAP BusinessObjects, SSAS, and Crystal Reports 9
Databases: Oracle 12c/11g/10g, Teradata R15/R14, MS SQL Server 2016/2014, DB2, MongoDB, HBase and Cassandra.
Operating System: Windows, Unix, Sun Solaris
ETL/Data warehouse Tools: Informatica 9.6/9.1, SAP BusinessObjects XI R3.1/XI R2, SSIS, and Pentaho.
PROFESSIONAL EXPERIENCE
Sr. Data Modeler/Data Analyst
Confidential, Cincinnati, OH
Responsibilities:
- Gathered business requirements, working closely with business users, project leaders and developers. Analyzed the business requirements and designed conceptual and logical data models.
- Assumed a leadership role across divisions of the Data Warehouse group: Business Analysis (the group that defines the data transformation rules), Database Architecture (the group that defines the logical and physical architecture), ETL (with DataStage as the platform), and Business Intelligence (Reporting).
- Used Python to develop Spark code for faster processing of data on Hive, and used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing (see the first sketch following this list).
- Worked on Dimensional and Relational Data Modeling using Star and Snowflake schemas, OLTP/OLAP systems, Fact and Dimension tables, and Conceptual, Logical, and Physical data modeling using Erwin r9.6.
- Played a key role in defining all aspects of Data Governance: data architecture, data security, master data management, data archival and purging, and metadata.
- Performed a PoC for a big data solution using Cloudera Hadoop for data loading and data querying.
- Wrote and optimized T-SQL and SQL queries in Oracle 12c, SQL Server 2014, DB2, Netezza, and Teradata.
- Created MDM and OLAP data architectures, analytical data marts, and cubes optimized for reporting, and worked on migrating the EDW to AWS using EMR and various other technologies.
- Involved in Logical modeling using Dimensional Modeling techniques such as Star Schema and Snowflake Schema, and in Normalization and De-Normalization of existing tables for faster query retrieval.
- Developed Linux shell scripts using the NZSQL/NZLOAD utilities to load data from flat files into the Netezza database.
- Developed scripts in Python (Pandas, NumPy) for data ingestion, analysis, and data cleaning (see the second sketch following this list).
- Worked with Dimensional Data Modeling concepts such as Star-Join Schema modeling, Snowflake modeling, Fact and Dimension tables, and Physical and Logical Data Modeling.
- Collected large amounts of log data using Apache Flume and aggregated it using Pig/Hive in HDFS for further analysis.
- Developed logical and physical data models using data warehouse methodologies, including Star and star-joined schemas, conformed-dimension data architecture, and early/late binding techniques, and designed and developed ETL applications using Informatica PowerCenter.
- Created data models for AWS Redshift and Hive from dimensional data models, worked on data modeling and advanced SQL with columnar databases on AWS, and drove the technical design of AWS solutions by working with customers to understand their needs.
- Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL-like access to Hadoop data.
- Worked on Teradata 15 and its utility domains, optimized queries in a Teradata database environment, and used Teradata tools such as FastLoad, MultiLoad, TPump, FastExport, Teradata Parallel Transporter (TPT), and BTEQ.
- Worked on importing and cleansing high-volume data from various sources such as Teradata 15, Oracle, flat files, and SQL Server.
- Loaded files into Hive and HDFS from Oracle, loaded data from the UNIX file system into HDFS, and was involved in OLAP unit and system testing of report functionality and the data displayed in the reports.
- Worked on the full life cycle of a Data Lake and Data Warehouse with big data technologies such as Spark, Hadoop, and Cassandra; developed enhancements to the MongoDB architecture to improve performance and scalability, and worked with MapReduce frameworks such as Hadoop and associated tools (Pig, Sqoop, etc.).
- Developed Data Mapping, Data Profiling, Data Governance, and Transformation and Cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Converted existing reports and dashboards from Tableau and QlikView to MicroStrategy.
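A minimal PySpark Structured Streaming sketch of the micro-batching described in the Spark bullet above; the socket source, port, table name, and 30-second trigger are illustrative assumptions, not the project's actual configuration:

```python
from pyspark.sql import SparkSession

# Sketch: batch an unbounded stream for Spark batch processing.
spark = (SparkSession.builder
         .appName("stream-to-hive-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Read an unbounded stream (here a socket; Kafka/Flume feeds are typical).
events = (spark.readStream
          .format("socket")
          .option("host", "localhost")
          .option("port", 9999)
          .load())

def write_batch(batch_df, batch_id):
    # Each micro-batch arrives as an ordinary DataFrame and is appended
    # to a Hive-managed table for downstream batch processing.
    batch_df.write.mode("append").saveAsTable("staging_events")

query = (events.writeStream
         .foreachBatch(write_batch)
         .trigger(processingTime="30 seconds")
         .start())
query.awaitTermination()
```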
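And a small pandas sketch of the ingestion and cleaning work mentioned above, with hypothetical file and column names:

```python
import pandas as pd

# Hypothetical raw extract with an order_id key, numeric amount,
# date column, and free-text region field.
df = pd.read_csv("raw_extract.csv")

df = df.drop_duplicates()                                     # remove exact duplicates
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")   # bad values -> NaN
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df["region"] = df["region"].str.strip().str.upper()           # standardize text
df = df.dropna(subset=["order_id"])                           # require a business key
df["amount"] = df["amount"].fillna(df["amount"].median())     # impute the rest

print(df.describe())
df.to_csv("clean_extract.csv", index=False)
```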
Environment: Erwin r9.6, Oracle 12c, Teradata 15, Netezza, PL/SQL, T-SQL, MDM, BI (Tableau), DB2, SQL Server 2014, Informatica PowerCenter, SQL, Big Data, Hadoop, Hive, MicroStrategy, MapReduce, Pig, Cassandra, MongoDB, SAS, Spark, SSRS, SSIS, SSAS, AWS, S3, Redshift, EMR, Tableau, Excel, MS Access, SAP, etc.
Sr. Data Modeler/ Data Analyst
Confidential
Responsibilities:
- Developed full life cycle software, including defining requirements, prototyping, designing, coding, testing, and maintaining the software.
- Responsible for data architecture design and delivery, data model development, review, and approval, and Data Warehouse implementation.
- Worked as a Data Modeler/Analyst to generate data models using SAP PowerDesigner and developed relational database systems.
- Worked on Software Development Life Cycle (SDLC) with good working knowledge of testing, Agile methodology, disciplines, tasks, resources and scheduling.
- Researched and developed hosting solutions using MS Azure for the service solution.
- Developed a long-term data warehouse roadmap and architectures, and designed and built the data warehouse framework per the roadmap.
- Developed logical data models and physical database design and generated database schemas using SAP PowerDesigner.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Visualized the data by creating histograms and box plots, performed Exploratory Data Analysis (EDA) in Python, and created new user features from raw data, exploring each unusual or unexplained feature using Python (pandasql) and PySpark (filter, groupBy); see the first sketch following this list.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
- Responsible for different Data mapping activities from Source systems to Teradata.
- Worked with business stakeholders to gather data quality requirements, perform data profiling, and produce metrics to identify opportunities for improved testing and reporting.
- Implemented data quality process including transliteration, parsing, analysis, standardization, and enrichment at point of entry and batch modes.
- Developed Data Mapping, Data Governance, and Transformation and Cleansing rules for the Master Data Management Architecture.
- Led the Data Governance process with external and internal data providers to ensure timely, accurate, and complete data.
- Documented data dictionaries and business requirements for key workflows and process points.
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and used PowerDesigner to design the business process, dimensions and measured facts.
- Designed ER diagrams, mapped the data into database objects, and performed Data Profiling to detect and correct inaccurate data and maintain data quality.
- Developed Data Migration and Cleansing rules for the Integration Architecture (OLTP, ODS, DW).
- Implemented Forward Engineering using DDL scripts and indexing strategies, and reverse-engineered physical data models from SQL scripts and databases.
- Worked with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables and OLAP reporting.
- Designed the data warehouse architecture for all the source systems using MS Visio and wrote test plans and test cases in compliance with organizational standards.
- Developed and maintained an Enterprise Data Model (EDM) to serve as both a strategic and a tactical planning vehicle for managing the enterprise data warehouse, working closely with the business.
- Worked with project management, business teams and departments to assess and refine requirements to design/develop BI solutions using MS Azure.
- Worked with the MDM systems team on technical aspects and report generation, and worked with the data team to profile source data and determine source and metadata characteristics.
- Wrote HiveQL and Spark SQL queries to derive insights from customer data, such as time spent on ads and CTR; the query results were stored and exported for visualization analysis in Tableau (see the second sketch following this list).
- Involved in Data Profiling, Data Analysis, and the design of data mapping artifacts, and worked on PL/SQL collections, index-by tables, arrays, BULK COLLECT, FORALL, etc.
- Performed detailed data analysis of claim process durations and created cubes with Star Schemas using facts and dimensions in SQL Server Analysis Services (SSAS).
- Generated various reports using SQL Server Reporting Services (SSRS) for business analysts and the management team.
- Designed OLTP system environment and maintained documentation of Metadata.
- Used Python, Tableau and Excel to analyze the number of products per customer and sales in a category for sales optimization.
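A brief sketch of the EDA workflow described above (histograms and box plots in Python); the dataset and column names are hypothetical:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical feature table produced from raw customer data.
df = pd.read_csv("user_features.csv")

# Distribution of each numeric feature.
df.hist(figsize=(10, 6), bins=30)
plt.tight_layout()
plt.savefig("histograms.png")

# Box plot to surface outliers in an unusual/unexplained feature.
df.boxplot(column="session_minutes", by="segment")
plt.savefig("session_minutes_by_segment.png")

print(df.describe())
```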
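And a hedged sketch of the Spark SQL insight queries; the ad_events table and its columns (impressions, clicks, dwell_seconds) are assumptions for illustration:

```python
from pyspark.sql import SparkSession

# Query a Hive-backed table of ad events and compute CTR per ad.
spark = (SparkSession.builder
         .appName("ctr-sketch")
         .enableHiveSupport()
         .getOrCreate())

ctr_by_ad = spark.sql("""
    SELECT ad_id,
           SUM(clicks) / SUM(impressions) AS ctr,
           AVG(dwell_seconds)             AS avg_time_on_ad
    FROM   ad_events
    GROUP  BY ad_id
    ORDER  BY ctr DESC
""")

# Export for visualization in Tableau (CSV is one simple hand-off format).
(ctr_by_ad.coalesce(1)
          .write.mode("overwrite")
          .option("header", True)
          .csv("/tmp/ctr_by_ad"))
```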
Environment: SAP PowerDesigner 16.6, Python, PySpark, Agile, MDM, PL/SQL, SSAS, SSRS, ETL, OLTP, SQL scripts, Big Data, NoSQL, Hive, Informatica, Teradata 14, Oracle 12c, MongoDB, Cassandra, MS Visio, MS Azure.
Sr. Data Modeler/ Data Analyst
Confidential - San Diego, CA
Responsibilities:
- Gathered and translated business requirements into detailed, production-level technical specifications, new features, and enhancements to existing technical business functionality.
- Part of the team conducting logical data analysis and data modeling JAD sessions, and communicated data-related standards.
- Worked on NoSQL databases including Cassandra; implemented a multi-data-center, multi-rack Cassandra cluster.
- Coordinated with Data Architects on provisioning AWS EC2 infrastructure and deploying applications with Elastic Load Balancing.
- Performed Reverse Engineering of the current application using Erwin, and developed Logical and Physical data models for Central Model consolidation.
- Translated logical data models into physical database models, generated DDLs for DBAs, performed Data Analysis and Data Profiling, and worked on data transformations and data quality rules.
- Involved in extensive data validation by writing several complex SQL queries, in back-end testing, and in resolving data quality issues.
- Created ETL packages from OLTP data sources (SQL Server 2008, flat files, Excel source files, Oracle) and loaded the data into target tables, performing various transformations using SSIS.
- Extensively worked on client-server application development using Oracle 10g, Teradata 14, SQL, PL/SQL, and the Oracle Import and Export utilities.
- Used SSIS to create ETL packages to validate, extract, transform, and load data into data warehouse and data mart databases, and processed SSAS cubes to store data in OLAP databases.
- Collected, analyzed, and interpreted complex data for reporting and performance trend analysis, and wrote and executed unit, system, integration, and UAT scripts in data warehouse projects.
- Extensively used ETL methodology to support data extraction, transformation, and loading in a complex DW using Informatica.
- Developed and maintained sales reporting using MS Excel queries, SQL in Teradata, and MS Access.
- Involved in writing T-SQL and working on SSIS, SSRS, SSAS, Data Cleansing, Data Scrubbing, and Data Migration; wrote SQL scripts to test the mappings and developed a traceability matrix of business requirements.
- Redefined many attributes and relationships in the reverse engineered model and cleansed unwanted tables/columns as part of Data Analysis responsibilities.
- Designed the data marts using Ralph Kimball's Dimensional Data Mart modeling methodology with Erwin.
- Worked on importing and cleansing high-volume data from various sources such as Teradata, Oracle, and flat files.
- Created jobs and alerts to run SSIS and SSRS packages periodically, and automated activities such as database backups and sequential SSIS/SSRS package runs using SQL Server Agent jobs and Windows Scheduler.
- Created SQL tables with referential integrity and constraints, developed queries using SQL, SQL*Plus, and PL/SQL, and performed GAP analysis of the current state against the desired state, documenting requirements to control the identified gaps (see the sketch following this list).
- Developed the batch program in PL/SQL for OLTP processing and used UNIX shell scripts to run it via crontab.
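A self-contained sketch of tables created with referential integrity and constraints, as in the bullet above; it uses Python's sqlite3 for portability, while the actual work targeted Oracle/SQL Server, and all names are hypothetical:

```python
import sqlite3

# Demo of referential integrity: a parent table and a child table
# whose foreign key and CHECK constraint the engine enforces.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL CHECK (amount >= 0)
    )""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

try:
    # Violates the foreign key: customer 99 does not exist.
    conn.execute("INSERT INTO orders VALUES (101, 99, 5.0)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```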
Environment: Erwin 9.5, Teradata 14, Oracle 10g, PL/SQL, MDM, Tableau, SQL Server 2008, ETL, Netezza, DB2, SSIS, SSRS, SAS, SPSS, DataStage, Informatica, SQL, T-SQL, UNIX, MySQL, Aginity, MicroStrategy, SQL Assistant, etc.
Sr. Data Modeler/ Data Analyst
Confidential
Responsibilities:
- Worked with Business Analysts team in requirements gathering and in preparing functional specifications and translating them to technical specifications.
- Worked with Business users during requirements gathering and prepared Conceptual, Logical and Physical Data Models.
- Planned and defined system requirements using Use Cases, Use Case Scenarios, and Use Case Narratives per UML (Unified Modeling Language) methodologies.
- Gathered analysis report prototypes from business analysts in different business units, and participated in JAD sessions discussing various reporting needs.
- Reverse-engineered the existing data marts and identified the Data Elements (in the source systems), Dimensions, Facts, and Measures required for reports.
- Conducted design discussions and meetings to arrive at the appropriate Data Warehouse design at the lowest level of grain for each of the Dimensions involved.
- Created Entity Relationship Diagrams (ERD), Functional diagrams, Data flow diagrams and enforced referential integrity constraints.
- Designed a Star schema for sales data involving shared (conformed) dimensions for other subject areas using Erwin Data Modeler (see the sketch following this list).
- Created and maintained the Logical Data Model (LDM) for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
- Validated and updated the appropriate LDMs against process mappings, screen designs, use cases, the business object model, and the system object model as they evolved and changed.
- Defined facts and dimensions, and designed the data marts using Ralph Kimball's Dimensional Data Mart modeling methodology with ER/Studio.
- Developed Data Mapping, Data Governance, and Transformation and Cleansing rules for Data Management involving OLTP, ODS, and OLAP.
- Normalized the database based on the newly developed model to bring the data warehouse tables into 3NF.
- Used SQL tools such as Teradata SQL Assistant and TOAD to run SQL queries and validate the data in the warehouse.
- Created an SSIS package for daily email subscriptions to alert on Tableau subscription failures using the ODBC driver and a PostgreSQL database.
- Designed logical and physical data models and performed Reverse Engineering and Complete Compare for Oracle and SQL Server objects using Erwin.
- Involved in designing and developing SQL Server objects such as Tables, Views, Indexes (Clustered and Non-Clustered), Stored Procedures, and Functions in Transact-SQL.
- Developed scripts that automated the DDL and DML statements used in the creation of databases, tables, constraints, and updates.
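An illustrative star-schema DDL sketch (a sales fact with conformed dimensions), executed through Python's sqlite3 so the example is runnable; the table and column names are hypothetical, and the real models were maintained in Erwin against the production RDBMS:

```python
import sqlite3

# Minimal star schema: one fact table keyed to two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (
        date_key     INTEGER PRIMARY KEY,   -- e.g. 20240115
        calendar_dt  TEXT NOT NULL,
        fiscal_year  INTEGER NOT NULL
    );
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
        product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
        units_sold   INTEGER NOT NULL,
        sales_amount REAL NOT NULL
    );
    -- Index the foreign keys the way a physical model would for joins.
    CREATE INDEX ix_fact_sales_date    ON fact_sales(date_key);
    CREATE INDEX ix_fact_sales_product ON fact_sales(product_key);
""")
print("star schema created")
```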
Environment: ER/Studio, Teradata 13.1, SSIS, SAS, Excel, T-SQL, SSRS, Tableau, SQL Server, Cognos, Pivot tables, Graphs, MDM, PL/SQL, ETL, DB2, Oracle 11g/10g, SQL, MySQL, Power BI, UNIX shell scripting, MicroStrategy, XML, SQL*Loader, SQL*Plus, Informatica PowerCenter, etc.
Data Analyst/Data Modeler
Confidential
Responsibilities:
- Involved in Data mapping specifications to create and execute detailed system test plans. The data mapping specifies what data will be extracted from an internal data warehouse, transformed and sent to an external entity.
- Analyzed business requirements, system requirements, data mapping requirement specifications, and responsible for documenting functional requirements and supplementary requirements in Quality Center.
- Set up the environments to be used for testing and the range of functionalities to be tested, as per technical specifications.
- Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Wrote and executed unit, system, integration, and UAT scripts in data warehouse projects.
- Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, data warehouse, and data mart reporting systems in accordance with requirements.
- Troubleshot test scripts, SQL queries, ETL jobs, and data warehouse/data mart/data store models.
- Excellent experience and knowledge of data warehouse concepts and dimensional data modeling using the Ralph Kimball methodology.
- Responsible for different data mapping activities from source systems to Teradata.
- Created the test environment for Staging area, loading the Staging area with data from multiple sources.
- Used CA Erwin Data Modeler (Erwin) for data modeling (data requirements analysis, database design etc.) of custom developed information systems, including databases of transactional systems and data marts.
- Responsible for analyzing various heterogeneous data sources, including flat files, ASCII data, EBCDIC data, and relational data (Oracle, DB2 UDB, MS SQL Server).
- Delivered files in various formats (e.g., Excel files, tab-delimited text, comma-separated text, and pipe-delimited text); see the sketch following this list.
- Performed ad hoc analyses as needed.
- Involved in Teradata SQL development, unit testing, and performance tuning, ensuring testing issues were resolved using defect reports.
- Tested the ETL process both before and after data validation, and tested the messages published by the ETL tool and the data loaded into various databases.
- Created UNIX scripts for file transfer and file manipulation, and supported the client in assessing how many virtual user licenses would be needed for performance testing.
- Ensured the onsite-to-offshore transition, QA processes, and closure of problems and issues, and tested the database for field size validation, check constraints, and stored procedures, cross-verifying the field sizes defined in the application against the metadata.
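A short sketch of delivering one extract in the several delimited formats listed above; the input and output file names are hypothetical (the Excel write requires the openpyxl package):

```python
import pandas as pd

# One cleaned extract, written out in each delivery format.
df = pd.read_csv("final_extract.csv")

df.to_csv("delivery_tab.txt", sep="\t", index=False)   # tab-delimited
df.to_csv("delivery_comma.csv", index=False)           # comma-separated
df.to_csv("delivery_pipe.txt", sep="|", index=False)   # pipe-delimited
df.to_excel("delivery.xlsx", index=False)              # Excel (needs openpyxl)
```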
Environment: Erwin, MDM, ETL, Informatica, PL/SQL, Oracle 10g, UNIX shell scripting, Linux shell scripting, SlickEdit, TOAD, MS SQL Server 2008, XML, SQL*Loader, SQL*Plus, DB2, SSIS, IBM, etc.