Senior Data Modeler Resume
Chicago, IL
SUMMARY:
- Over 10 years of strong experience in Business and Data Modeling/Data Analysis, Data Architecture, Data Profiling, Data Migration, Data Conversion, Data Quality, Data Governance, Data Integration, MDM, NoSQL, Metadata Management Services and Configuration Management.
- Experienced in data architecture, including data ingestion pipeline design, Hadoop information architecture, data modeling, data mining, machine learning and advanced data processing.
- Experienced in Dimensional and Relational Data Modeling using ER/Studio, Erwin and Sybase PowerDesigner, including Star Schema/Snowflake modeling, Fact and Dimension tables, and Conceptual, Logical and Physical data modeling.
- Experienced in writing Pig Latin scripts, MapReduce jobs and HiveQL.
- Expertise in Data Analysis, Design, Development, Implementation and Testing using Data Conversions, Extraction, Transformation and Loading (ETL) against SQL Server, Oracle and other relational and non-relational databases.
- Extensive knowledge of big data, Hadoop, Map-Reduce, Hive, NoSQL Databases and other emerging technologies.
- Extensively worked with Erwin and ER/Studio features such as Reverse Engineering, Forward Engineering, Subject Areas, Domains and Naming Standards documents.
- Strong expertise in Amazon AWS services including EC2, DynamoDB, S3 and Redshift, with hands-on Hadoop/Big Data experience in storage, querying, processing and analysis of data.
- Experienced in importing and exporting data between HDFS and relational database systems/mainframes using Sqoop.
- Strong experience using Excel and MS Access to load and analyze data based on business needs.
- Experience in using Business Intelligence tools (SSIS, SSAS, SSRS) in MS SQL Server 2008 & 2012, IBM Cognos and Tableau.
- Expertise in Normalization to 3NF/Denormalization techniques for optimum performance in relational and dimensional database environments.
- Experienced in carrying out Software Development Life Cycle (SDLC) in relational and object methodologies.
- Gained strong expertise in Conceptual Data Modeling, Logical Data Modeling, Physical Data Modeling, Enterprise Data Warehouse Design, Datamart Design, Metadata, Data Quality, Master Data Management and Data Governance.
- Expert Level knowledge of MS Excel, VBA, macros, MS project and other tools.
- Extensive experience in ER Modeling, Dimensional Modeling (Star Schema, Snowflake Schema), Data Warehousing and OLAP tools.
- Expertise in database programming (SQL, PL/SQL) against MS Access, Oracle 12c/11g/10g/9i, XML, DB2, Informix and Teradata, as well as database tuning and query optimization.
- Very good experience with R, Shiny (R), advanced Excel, PowerPoint, etc.
- Expertise in performing data analysis and data profiling using complex SQL on various source systems including Oracle and Teradata.
- Experienced in logical/physical database design and review sessions to determine and describe dataflow and data mapping from source to target databases coordinating with End Users, Business Analysts, DBAs and Application Architects.
- Expertise in Visio, Process Flow Diagrams, Activity Diagrams, Cross Functional Diagram, Swim Lane Diagrams, Use Case Diagrams.
- Expertise in scheduling JAD (Joint Application Development) sessions with End Users, Stakeholders, Subject Matter Experts, Developers and Testers.
- Expertise in Data modeling (Dimensional & Relational) concepts like Star-Schema Modeling, Snowflake Schema Modeling, Fact and Dimension tables.
- Expertise in writing Stored Procedures, Functions, Nested Functions, building Packages and developing Public and Private Sub-Programs using PL/SQL and providing Documentation.
- Expertise in loading data by using the Teradata loader connection, writing Teradata utilities scripts (Fastload, Multiload) and working with loader logs.
- Experienced in developing T-SQL scripts and stored procedures to perform various tasks and multiple DDL, DML and DCL activities to carry out business requirements.
- Strong RDBMS concepts and solid experience creating database tables, views, sequences, triggers and joins with performance and reusability in mind.
- Very good knowledge of SDLC methodologies such as Agile and Waterfall, with experience across the full project life cycle.
- Efficient in Extraction, Transformation and Loading (ETL) of data from spreadsheets and database tables using Microsoft Data Transformation Services (DTS).
- Extensive knowledge in software testing methodology and developing Test Plans, Test Procedures, Test Case Design and Execution, Modification Requests.
- Strong in conceptualizing and communicating enterprise data architecture frameworks for global enterprises for interoperation of enterprise data warehouses, middleware, and web applications.
TECHNICAL SKILLS:
Data analysis: Process/Production Model analysis, Data Normalization, Cleansing, Profiling, System Design, Data Architecture, internal standards development, Metadata, Reports, Source and Target System Analysis
Data modeling tools: CA Erwin, Sybase PowerDesigner 16.5, ER/Studio Data Architect, MySQL Workbench, MS Visio
Reporting Environment: SQL Server Management Studio, Microsoft SSIS, SSRS, Power BI, Tableau, HDFS, hands-on Sqoop & MapReduce (Big Data), R for statistical analysis (developing expertise)
Database Systems: SQL Server 2008/2012, Oracle 11g/10g, Teradata, Hive (Hadoop), MS Access.
MS Office Suite: MS Word, MS PowerPoint, MS Excel, MS Visio, MS Project
Quality Analysis: Business and Software Process and Procedure Definition, Quality Models, Data Standards, Measures and Metrics, Project Reviews, Audits and Assessments; assistance with production issues
PROFESSIONAL EXPERIENCE:
Confidential, Chicago, IL
Senior Data Modeler
Responsibilities:
- Designed databases, data models, ETL processes, data warehouse applications and business intelligence (BI) reports using best practices and tools, including SQL, SSIS, SSRS and OLAP.
- Transformed Logical Data Model to Physical Data Model ensuring the Primary Key and Foreign Key relationships in PDM, Consistency of definitions of Data Attributes and Primary Index Considerations.
- Validated the data of reports by writing SQL queries in PL/SQL Developer against ODS.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Designed the Redshift data model and performed Redshift performance analysis and improvements.
- Implemented solutions for ingesting data from various sources and processing the data at rest utilizing Big Data technologies such as Hadoop, MapReduce frameworks, HBase and Hive.
- Involved in data analysis, primarily identifying data sets, source data, source metadata, data definitions and data formats.
- Developed Star and Snowflake schema-based dimensional models for the data warehouse.
- Used Big Data tools such as MapReduce, Hive SQL, Hive PL/SQL, Impala, Pig, Spark Core, YARN and Sqoop.
- Created ad-hoc reports for users in Tableau by connecting to various data sources.
- Performed Data modeling for existing Databases using Toad Data Modeler and Erwin.
- Developed Data Mapping, Data Governance, Transformation and cleansing rules for the Master Data Management Architecture involving OLTP and ODS.
- Created Hive queries that helped analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Designed the ER diagrams, logical model (relationships, cardinality, attributes and candidate keys) and physical database (capacity planning, object creation and aggregation strategies) for Oracle and Teradata per business requirements using Erwin.
- Developed and automated multiple departmental reports using Tableau and MS Excel.
- Involved with all the phases of Software Development Life Cycle (SDLC) methodologies throughout the project life cycle.
- Developed and initiated more efficient data collection and translation for analytics in Tableau.
- Extensively involved in the Physical/logical modeling and development of Reporting Data Warehousing System.
- Performed reverse engineering of physical data models from databases and SQL scripts.
- Extracted the data from MySQL, AWS RedShift into HDFS using Sqoop.
- Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration and the most purchased product on the website (a representative query is sketched after this section).
- Utilized Oozie workflows to run Pig and Hive jobs; extracted files from MongoDB through Sqoop, placed them in HDFS and processed them.
- Used Normalization (1NF, 2NF&3NF) and De-normalization techniques for effective performance in OLTP and OLAP systems.
- Created fully fledged Source to Target Mapping (S2T) documents and documented business and transformation rules.
- Managed and reviewed Hadoop log files.
- Created complex SQL queries using views, indexes, triggers, roles, stored procedures and user-defined functions; worked with different methods of logging in SSIS.
- Monitored scheduled batch jobs for the execution of the loading process in MDM.
- Installed and configured a multi-node cluster in the cloud using Amazon Web Services (AWS) EC2.
- Worked on importing and cleansing high-volume data from various sources such as DB2, Oracle and flat files onto SQL Server.
- Involved in migrating the data model to a Teradata database and prepared a Teradata staging model.
- Implemented a proof of concept deploying the product in Amazon Web Services (AWS).
- Involved in migration of data from existing RDBMS (Oracle and SQL Server) to Hadoop using Sqoop for processing data.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports by our BI team.
- Collected, aggregated, matched, consolidated, quality-assured, persisted and distributed data throughout the organization to ensure consistency through MDM.
- Involved in data modeling and designing various objects and worked with various extraction mechanisms within SAP R/3 (OLTP) to serve reports at the OLAP layer.
- Responsible for creating Hive tables, loading data and writing hive queries.
- Worked on different data formats such as Flat files, SQL files, Databases, XML schema, CSV files.
Environment: Erwin, Teradata V14, Teradata SQL Assistant, R, MDM, Informatica, Oracle 11g, Netezza, SQL Server 2008, Mainframes, SQL, Hive, MapReduce, Pig, MongoDB, AWS, Redshift, S3, EMR, PL/SQL, SAP, XML, Shiny, TOAD, Sqoop, Hadoop, HBase, Cognos 10.
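Illustrative HiveQL for the web-log analysis described above, included as a minimal sketch; the table and column names (web_logs, purchases, visitor_id, visit_date, product_id) are hypothetical placeholders, not the actual project schema.

    -- Unique visitors and page views per day (hypothetical web_logs table)
    SELECT visit_date,
           COUNT(DISTINCT visitor_id) AS unique_visitors,
           COUNT(*)                   AS page_views
    FROM web_logs
    GROUP BY visit_date;

    -- Most purchased product on the website (hypothetical purchases table)
    SELECT product_id, COUNT(*) AS purchase_count
    FROM purchases
    GROUP BY product_id
    ORDER BY purchase_count DESC
    LIMIT 1;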
Confidential, Mountain View, CA
Senior Data Modeler
Responsibilities:
- Suggested various changes in the physical model to support the business requirements.
- Developed server jobs to load the data from flat files, text files, tag text files and MS SQL.
- Utilized shared containers for code reusability and for implementing the predefined business logic.
- Created and scheduled the job sequences by checking job dependencies.
- Wrote complex SQL queries using joins, subqueries and correlated subqueries.
- Formulated procedures for integration of R programming plans with data sources and delivery systems.
- Wrote PL/SQL stored procedures, functions, packages and triggers to implement business rules in the application (a brief sketch follows this section).
- Developed shell scripts to invoke back-end SQL and PL/SQL programs.
- Performed unit testing to check the validity of the data at each stage.
- Used DataStage Director to debug jobs and to view the error log to check for errors.
- Designed and implemented an Enterprise Data Warehouse that supports multiple languages.
- Implemented best practices in the development environment (code standards, code migration).
- Used Informatica features to implement Type 1 and Type 2 changes in slowly changing dimension tables.
- Created and ran workflows and Worklets using Workflow Manager to load the data into the target database.
- Performed performance tuning of SQL queries, sources, targets and sessions.
Environment: Erwin, Oracle, MS SQL Server, Hive, NoSQL, DynamoDB, Teradata, Netezza, R, PL/SQL, MS Visio, Informatica, T-SQL, SQL, IBM Cognos 10.2, Tableau, Shiny, Sqoop, Crystal Reports 2008, Java, Spark, HDFS, Pig, MapReduce, Hadoop ecosystem, AWS, S3, EMR, MongoDB, HBase.
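A minimal PL/SQL sketch of the kind of stored procedure and trigger noted above for enforcing business rules; the objects (orders, order_audit, apply_discount, trg_order_audit) are hypothetical examples rather than the actual application code.

    -- Hypothetical procedure applying a simple business rule to an order
    CREATE OR REPLACE PROCEDURE apply_discount (p_order_id IN NUMBER) IS
        v_total NUMBER;
    BEGIN
        SELECT total_amount INTO v_total FROM orders WHERE order_id = p_order_id;
        IF v_total > 1000 THEN
            UPDATE orders SET discount_pct = 5 WHERE order_id = p_order_id;
        END IF;
    END apply_discount;
    /

    -- Hypothetical trigger auditing order updates
    CREATE OR REPLACE TRIGGER trg_order_audit
    AFTER UPDATE ON orders
    FOR EACH ROW
    BEGIN
        INSERT INTO order_audit (order_id, changed_on)
        VALUES (:OLD.order_id, SYSDATE);
    END;
    /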
Confidential, Jacksonville, FL
Data Modeler
Responsibilities:
- Worked with the Business Analyst team on requirements gathering, preparing functional specifications and translating them into technical specifications.
- Interacted with business users to gather design requirements and take feedback on improvements.
- Involved in understanding and creating Logical and Physical Data model using Erwin Tool.
- Guided the full lifecycle of a Hadoop solution, including requirements analysis, platform selection, technical architecture design, application design and development, testing, and deployment.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Created logical and physical data models using best practices to ensure high data quality and reduced redundancy.
- Managed the timely flow of business intelligence information to users.
- Involved in making screen designs, Use Cases and ER diagrams for the project using ERWIN and Visio.
- Extracted the data from MySQL, AWS RedShift into HDFS using Sqoop.
- Defined the Big Data strategy, including designing multi-phased implementation roadmaps.
- Extracted data from IBM Cognos to create automated visualization reports and dashboards in Tableau.
- Analyzed the business information requirements and researched the OLTP source systems to identify the measures, dimensions and facts required for the reports.
- Performed data mapping from source systems to target systems, performed logical data modeling, created class diagrams and ER diagrams and used SQL queries to filter data.
- Implemented an enterprise-grade platform (MarkLogic) for ETL from mainframe to NoSQL (Cassandra).
- Led the design of high-level conceptual and logical models that facilitate a cross-system/cross-functional view of data requirements.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and Pig to pre-process the data.
- Worked with AWS to implement client-side encryption since DynamoDB did not support encryption at rest at the time.
- Maintained conceptual, logical and physical data models along with corresponding metadata.
- Performed data migration from an RDBMS to a NoSQL database and provided a complete picture of data deployed across various data systems.
- Designed, developed and maintained the data dictionary and metadata of the models.
- Processed the data using HiveQL (SQL-like) on top of MapReduce.
- Involved in Data Warehouse support, using Star Schema and Dimensional modeling to help design data marts and the Enterprise Data Warehouse.
- Coordinated with the DBA on database builds and table normalizations and de-normalizations.
- Developed triggers, stored procedures, functions and packages using cursor and ref cursor concepts associated with the project using PL/SQL.
- Prepared documentation for all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules and the glossary as they evolved and changed during the project.
- Participated in the Master Data Management (MDM) effort; provided technical advice and support toward the development of strategic and tactical plans for the client's master data management strategy, data inventories, data governance, data management, storage, and distribution alternatives in support of the client's MDM strategy.
- Explored Spark to improve the performance and optimization of the existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, Pair RDDs and Spark on YARN.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Troubleshot test scripts, SQL queries, ETL jobs, and Enterprise data warehouse/data mart/data store models.
- Responsible for different data mapping activities from source systems to Teradata.
- Assisted Business Objects and Tableau report developers in developing reports based on the requirements.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Performed database performance tuning using the EXPLAIN PLAN and TKPROF utilities and by debugging the SQL code (an illustrative example follows this section).
Environment: Erwin, Informatica PowerCenter 8.1/9.1, MDM, R, PowerConnect/PowerExchange, Oracle 11g, Mainframes, Tableau, SAP, DB2, MS SQL Server 2008, Shiny, TOAD, SQL, PL/SQL, XML, Windows NT 4.0, Unix Shell Scripting.
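A brief Oracle SQL sketch of the EXPLAIN PLAN based tuning step mentioned above; the sales_fact table and its columns are hypothetical illustrations, not the project's actual objects.

    -- Generate the execution plan for a candidate query, then display it
    EXPLAIN PLAN FOR
    SELECT customer_id, SUM(sale_amount) AS total_sales
    FROM sales_fact
    WHERE sale_date >= DATE '2013-01-01'
    GROUP BY customer_id;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);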
Confidential, Fort Worth, TX
Data Modeler
Responsibilities:
- Created and maintained Logical and Physical models for the data mart. Created partitions and indexes for the tables in the data mart.
- Worked on multiple phased projects to develop the EDW (Enterprise Data Warehouse) and Data Marts to support Business Intelligence needs per the requirements of the client.
- Performed data profiling and analysis, applied various data cleansing rules, designed data standards and architecture, and designed the relational models.
- Maintained metadata (data definitions of table structures) and version controlling for the data model.
- Gathered and reviewed business requirements and analyzed data sources from Excel/SQL for the design, development, testing and production rollout of reporting and analysis projects within Tableau Desktop.
- Prepared scripts to ensure proper data access, manipulation and reporting functions with the R programming language.
- Developed SQL scripts for creating tables, sequences, triggers, views and materialized views.
- Worked on query optimization and performance tuning using SQL Profiler and performance monitoring.
- Assessed and determined governance, stewardship, and frameworks for managing data across the organization.
- Utilized Erwin's forward/reverse engineering tools and target database schema conversion process.
- Created custom complex SQL queries to generate data from the database and used them to build innovative, powerful dashboards for easy data visualization in Tableau.
- Used R, SAS and SPSS to create calculated/derived variables for clinical and claims data.
- Worked on creating an enterprise-wide model (EDM) for products and services in the Teradata environment based on data from the PDM; conceived, designed, developed and implemented this model from scratch.
- Wrote SQL scripts to test the mappings and developed a Traceability Matrix of Business Requirements mapped to Test Scripts to ensure any change control in requirements leads to test case updates.
- Involved in extensive data validation by writing several complex SQL queries, in back-end testing and in working through data quality issues.
- Created SQL scripts to find data quality issues and to identify keys, data anomalies, and data validation issues (representative checks are sketched after this section).
- Designed and implemented Enterprise Metadata Repository used as enterprise standard.
- Developed and executed SQL scripts, procedures, functions, cursors and packages to implement the data migration in Siebel and Informatica MDM.
- Used graphical Entity-Relationship diagramming to create the new database design via an easy-to-use graphical interface.
- Scheduled data loads to BW from SAP R/3, monitored data loads running on an hourly, daily, weekly and monthly basis and resolved errors when they occurred.
- Worked on load, replicate, stop, suspend and resume operations while replicating data into SAP HANA using HANA data provisioning.
- Used data analysis techniques to validate business rules and identified low-quality and missing data in the existing Enterprise Data Warehouse (EDW).
- Prepared data visualization reports for management using R (statistical analysis tool).
- Filtered/sorted data based on column entries using Excel/VBA.
- Synthesized and translated Business data needs into creative visualizations in Tableau
- Designed different types of Star schemas for detailed data marts and plan data marts in the OLAP environment.
Environment: Oracle 9i, MS SQL Server, PL/SQL, Toad, Tableau, UNIX Shell Scripting.
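Representative data-quality checks of the kind described above, shown as a minimal sketch against a hypothetical customer_dim table; these are not the actual project scripts.

    -- Duplicate natural keys
    SELECT customer_nbr, COUNT(*) AS dup_count
    FROM customer_dim
    GROUP BY customer_nbr
    HAVING COUNT(*) > 1;

    -- Null-rate profiling for a critical column
    SELECT COUNT(*) AS total_rows,
           SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) AS null_emails
    FROM customer_dim;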
Confidential, Chicago, IL
Data Analyst
Responsibilities:
- Worked with business users to gather requirements and create data flow, process flows and functional specification documents.
- Designed and created test cases based on the business requirements (also referencing the Source to Target detailed mapping document and the Transformation Rules document).
- Created enterprise level conceptual data model, logical data model and dimensional model (star schema) to support first phase of new target architecture (credit swaps)
- Developed, enhanced and maintained Snowflake Schemas within the Enterprise data warehouse and data marts with conceptual data models.
- Involved in extensive data validation using SQL queries and back-end testing (a sample reconciliation query is sketched after this section).
- Performed rule-based profiling on the client's business data to understand the depth of accuracy and duplicates, which is needed for the MDM implementation.
- Developed an interactive web application using Shiny under RStudio.
- Used Excel sheets, flat files and CSV files to generate Tableau ad-hoc reports.
- Used SQL for querying the database in a UNIX environment.
- Involved in data analysis and creating data mapping documents to capture source to target transformation rules.
- Used Erwin and Visio to create 3NF and dimensional data models and published to the business users and ETL / BI teams.
- Involved in Data mapping specifications to create and execute detailed system test plans. The data mapping specifies what data will be extracted from an internal data warehouse, transformed and sent to an external entity.
- Created or modified T-SQL queries as per the business requirements.
- Created and maintained Logical and Physical models for the data mart, which supports the Credit, Fraud and Risk Retail reporting for credit card portfolio.
- Used the Erwin modeling tool to publish a data dictionary, review the model and dictionary with subject matter experts and generate data definition language.
- Maintained reconciliations to verify that the data in SAP R/3 and SAP BW match after the data loads.
- Generated Tableau dashboards for sales with forecast and reference lines.
- Managed full SDLC processes involving requirements management, workflow analysis, source data analysis, data mapping, metadata management, data quality, testing strategy and maintenance of the model.
- Wrote complex SQL queries for validating the data against different kinds of reports generated by Business Objects.
- Developed auto-generated mail features using UNIX scripting to notify the MDM Production Support group of any batch job failure incidents.
- Analysis of functional and non-functional categorized data elements for data profiling and mapping from source to target data environment. Developed working documents to support findings and assign specific tasks.
- Performed multidimensional visualization and analysis of the results using tools like Excel, PowerPoint and Tableau.
- Involved in fixing invalid mappings, testing of Stored Procedures and Functions, Unit and Integrating testing of Informatica Sessions, Batches and the Target Data.
- Developed multiple regression analyses and built a predictive model for identification of a given gene (as functional or non-functional) using SAS, R, STATISTICA, MATLAB, MINITAB, etc.
- Created SQL tables with referential integrity and developed SQL queries using SQL Server and Toad.
- Involved in the validation of the OLAP Unit testing and System Testing of the OLAP Report Functionality and data displayed in the reports.
Environment: MS SQL Server 2005, DB2, Oracle SQL Developer, R, MDM, PL/SQL, Business Objects, Erwin 7.0.x, Tableau, MS office suite, SAP, Windows XP, TOAD, Shiny, SQL*PLUS, SQL*LOADER
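A sample source-to-target reconciliation query of the kind used for the data validation described above, written in Oracle-style SQL as a minimal sketch; stg_transactions and dw_transactions are hypothetical placeholders for the staging and warehouse tables.

    -- Row-count reconciliation between source staging and target warehouse tables
    SELECT (SELECT COUNT(*) FROM stg_transactions) AS source_rows,
           (SELECT COUNT(*) FROM dw_transactions)  AS target_rows
    FROM dual;

    -- Rows present in the source but missing from the target
    SELECT transaction_id FROM stg_transactions
    MINUS
    SELECT transaction_id FROM dw_transactions;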