
Sr. Data Modeler/Analyst Resume


Chicago, IL

SUMMARY

  • 7+ years of experience leading Data Analytics and the design of Data Processing, Data Warehousing, Data Quality, Data Analytics & Business Intelligence development projects through the complete end-to-end SDLC process.
  • Experience in Design and Development of large Enterprise Data Warehouse (EDW) and Data-marts for target user-base consumption.
  • Experienced in designing key system architectures and integrating multiple modules and systems, including Big Data Hadoop systems and AWS, covering hardware sizing, estimates, benchmarking and data architecture.
  • Expert in writing and optimizing SQL queries in Oracle 10g/11g/12c, DB2, Netezza, SQL Server 2008/2012/2016 and Teradata 13/14.
  • Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata, and worked with Teradata SQL queries, Teradata indexes, and utilities such as MultiLoad, TPump, FastLoad and FastExport (a profiling sketch in Python follows this summary).
  • Good experience in Data Profiling, Data Mapping, Data Cleansing, Data Integration, Data Analysis, Data Quality, Data Architecture, Data Modeling, Data Governance, Metadata Management & Master Data Management.
  • Expertise in Data Modeling; created various Conceptual, Logical and Physical Data Models for DWH projects, including a first-of-its-kind Data Model for an Intelligence domain.
  • Excellent knowledge of Perl & UNIX, with expertise in Data Modeling and in the design, implementation, administration and performance tuning of Oracle and AWS Redshift databases.
  • Experienced in analyzing data using Hadoop Ecosystem including HDFS, Hive, Spark, Spark Streaming, Elastic Search, Kibana, Kafka, HBase, Zookeeper, PIG, Sqoop, and Flume.
  • Experienced working with Excel Pivot tables and VBA macros for various business scenarios, and involved in data transformation using Pig scripts in AWS EMR and AWS RDS.
  • Experienced in designing & implementing many projects using a varied set of ETL/BI tools and the latest features & product trends, with experience in technologies such as Big Data, Cloud Computing (AWS) & in-memory applications.
  • Experience working with data modeling tools like Erwin, Power Designer and ER Studio.
  • Experienced in importing and exporting data with Sqoop between HDFS and Relational Database Management Systems (RDBMS), and in data analysis using Hive, Pig Latin, and Impala.
  • Experienced in continuous performance tuning, system optimization & improvements for BI/OLAP systems & traditional databases such as Oracle, SQL Server, DB2 and other high-performance databases.
  • Well versed in Normalization / De-normalization techniques for optimum performance in relational and dimensional database environments and implemented various data warehouse projects in Agile Scrum/Waterfall methodologies.
  • Successfully performed installations, version upgrades & migration projects, and worked on support & maintenance of application systems, ensuring all deliveries were met on time.
  • Expertise in writing and optimizing SQL queries in Oracle, SQL Server 2008/12/16 and Teradata, and involved in developing and managing SQL and Python code bases for data cleansing and data analysis under Git version control.
  • Excellent Software Development Life Cycle (SDLC) experience with good working knowledge of testing methodologies, disciplines, tasks, resources and scheduling.
  • Extensive ETL testing experience using Informatica (PowerCenter/PowerMart) components (Designer, Workflow Manager, Workflow Monitor and Server Manager).
  • Good exposure to working in an offshore/onsite model, with the ability to understand and/or create functional requirements with the client, and good experience in requirements analysis and generating test artifacts from requirements documents.
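
Illustrative sketch (not taken from any engagement above) of the kind of SQL-based data profiling mentioned in this summary: per-column null and distinct counts driven from Python via pyodbc. The DSN, table and column names are hypothetical placeholders.

    # Minimal data-profiling sketch: per-column null and distinct counts.
    # "EDW_DSN", the table and the columns are hypothetical placeholders.
    import pyodbc

    TABLE = "CUSTOMER_DIM"
    COLUMNS = ["CUST_ID", "EMAIL", "STATE_CD"]

    conn = pyodbc.connect("DSN=EDW_DSN")
    cur = conn.cursor()

    for col in COLUMNS:
        cur.execute(
            f"SELECT COUNT(*), "
            f"       SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END), "
            f"       COUNT(DISTINCT {col}) "
            f"FROM {TABLE}"
        )
        total, nulls, distinct = cur.fetchone()
        print(f"{col}: rows={total}, nulls={nulls}, distinct={distinct}")

    conn.close()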

TECHNICAL SKILLS

Analysis and Modeling Tools: Erwin 9.6/9.5/9.1, Oracle Designer, ER/Studio.

Languages: SQL, Python, T-SQL

Database Tools: Microsoft SQL Server 2016/2014/2012, Teradata 15/14, Oracle 12c/11g/10g, MS Access, PostgreSQL, Netezza, DB2, HBase, MongoDB and Cassandra

ETL Tools: SSIS, Informatica PowerCenter 9.6/9.5 and SAP BusinessObjects.

Cloud: AWS, MS Azure, AWS S3, AWS EC2, AWS EMR, AWS RDS and AWS Glue.

Operating Systems: Windows, DOS and UNIX.

Reporting Tools: SSRS, Business Objects, Crystal Reports.

Tools & Software: TOAD, MS Office, BTEQ, Teradata SQL Assistant and Netezza Aginity.

Big Data Technologies: Hadoop, HDFS, Hive, MapReduce, Pig, HBase, Sqoop, Flume, Oozie and NoSQL databases

PROFESSIONAL EXPERIENCE

Sr. Data Modeler/Analyst

Confidential, Chicago, IL

Responsibilities:

  • Performed System Analysis & Requirements Gathering related to Architecture, ETL, Data Quality, MDM, Dashboards and Reports. Captured enhancements from various entities and provided Impact analysis.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts, and ingested data into Hadoop (HDFS)/Hive from different data sources.
  • Responsible for Big data initiatives and engagement including analysis, brainstorming, POC, and architecture.
  • Designed a real-time stream processing application using Spark, Kafka, Scala and Hive to perform streaming ETL and apply Machine Learning (a streaming sketch follows this list).
  • Designed the Logical Data Model using ERWIN 9.64 with the entities and attributes for each subject area.
  • Involved in dimensional modeling (Star Schema) of the Data Warehouse, used Erwin to design the business processes, dimensions and measured facts, and developed and maintained an Enterprise Data Model (EDM) to serve as both a strategic and a tactical planning vehicle for managing the enterprise data warehouse.
  • Delivered enhancements to the traditional Star-schema data warehouse, updated data models, performed Data Analytics and Reporting using Tableau, and extracted data from MySQL and AWS into HDFS using Sqoop.
  • Developed a long-term data warehouse roadmap and architecture, and designed and built the data warehouse framework per the roadmap.
  • Worked on AWS, architecting a solution to load data, create data models and run BI on it, and developed Shell, Perl and Python scripts to automate and provide control flow to Pig scripts.
  • Involved in building the database model, APIs and views utilizing Python in order to build an interactive web-based solution.
  • Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile and network devices, and pushed it to HDFS.
  • Worked on logical and physical modeling and ETL design for manufacturing data warehouse applications.
  • Responsible for querying Hive and Impala databases on the Hue platform to analyze data, create tables, define schemas and improve query performance by implementing partitioning on tables in the Metastore.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries; developed Hive queries to process the data and generate data cubes for visualization.
  • Selected the appropriate AWS services based on data, compute, database, or security requirements, and defined and deployed monitoring, metrics, and logging systems on AWS.
  • Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive and Informatica.
  • Designed both 3NF data models for ODS/OLTP systems and dimensional data models using Star and Snowflake schemas.
  • Implemented join optimizations in Pig using Skewed and Merge joins for large datasets, and developed and implemented Pig UDFs for ad-hoc and scheduled reports as required by the business team.
  • Created dimensional data models based on hierarchical source data and implemented them on Teradata, achieving high performance without special tuning.
  • Involved in designing Logical and Physical data models for different database applications using Erwin, and in modeling, designing, implementing, and deploying high-performance custom applications at scale on Hadoop/Spark.
  • Applied data analysis, data mining and data engineering to present data clearly and reverse engineered some of the databases using Erwin.
  • Developed automated data pipelines in Python from various external data sources (web pages, APIs, etc.) to the internal Data Warehouse (SQL Server, AWS), then exported to reporting tools such as Datorama.
  • Worked with AWS services such as EMR, S3 and CloudWatch to run and monitor jobs on AWS.
  • Involved in loading data from the Linux file system to HDFS, importing and exporting data into HDFS and Hive using Sqoop, and implementing partitioning, dynamic partitions, and buckets in Hive (see the partitioning sketch after this list).
  • Worked on defining data architecture for data warehouses, data marts and business applications.
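
Illustrative sketch of the streaming ETL pattern described in the Spark/Kafka bullet above, written in PySpark rather than the Scala used on the project; the broker, topic, schema and paths are hypothetical placeholders, not details from the engagement.

    # Hedged sketch: read claim events from Kafka, parse JSON, filter, and
    # append to an HDFS path with Spark Structured Streaming.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = (SparkSession.builder
             .appName("claims-stream-etl")
             .enableHiveSupport()
             .getOrCreate())

    schema = StructType([
        StructField("claim_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("status", StringType()),
    ])

    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")   # hypothetical broker
              .option("subscribe", "claims_events")                # hypothetical topic
              .load())

    parsed = (events.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), schema).alias("c"))
              .select("c.*")
              .filter(col("status") == "APPROVED"))

    query = (parsed.writeStream
             .outputMode("append")
             .format("parquet")
             .option("path", "/data/claims/approved")        # hypothetical HDFS path
             .option("checkpointLocation", "/chk/claims")
             .start())
    query.awaitTermination()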
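Illustrative sketch of the Hive dynamic-partitioning pattern referenced above (the bucketing mentioned in the same bullet is omitted here), issued through PySpark's Hive support; the database, table and column names are hypothetical placeholders.

    # Hedged sketch: create a partitioned Hive table and load it with
    # dynamic partitions from a staging table.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    spark.sql("""
        CREATE TABLE IF NOT EXISTS edw.claims_part (
            claim_id STRING,
            amount   DOUBLE
        )
        PARTITIONED BY (claim_year INT, claim_month INT)
        STORED AS ORC
    """)

    # Dynamic partition columns must come last in the SELECT list.
    spark.sql("""
        INSERT OVERWRITE TABLE edw.claims_part
        PARTITION (claim_year, claim_month)
        SELECT claim_id, amount, claim_year, claim_month
        FROM edw.claims_staging
    """)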

Technology: Erwin 9.64, Python, Oracle 12c, Sqoop, Kafka, Hive, HBase, PySpark, Scala, Greenplum, Teradata, MS SQL, Apache Cassandra, Impala, Cloudera, AWS, AWS EMR, Redshift, Flume, Apache Hadoop, Informatica Data Quality, Informatica Metadata Manager, MapReduce, Zookeeper, MySQL, DynamoDB, SAS, SAP BO, Tableau and PL/SQL.

Sr. Data Modeler / Analyst

Confidential, Bloomington, IN

Responsibilities:

  • Responsible for defining Data Architecture standards on Teradata/Hadoop platforms, defining process-specific zones for standardizing data and loading it into target tables, and defining audit columns for each zone in the Hadoop environment.
  • Performed Data Analysis, Data Modeling, Data Migration and data profiling using complex SQL on various source systems, including Oracle and Teradata.
  • Managed Logical and Physical Data Models in the ER Studio Repository based on the different subject area requests for the integrated model, and developed data mapping, data governance, and transformation and cleansing rules involving OLTP and ODS.
  • Wrote Python scripts and mappers to run on the Hadoop Distributed File System (HDFS) (a mapper sketch follows this list), and troubleshot, fixed and deployed many Python bug fixes for the two main applications that were a primary source of data for both customers and the internal customer service team.
  • Implemented solutions for ingesting data from various sources and processing the Data-at-Rest utilizing Big Data technologies such as Hadoop, Map Reduce Frameworks, HBase, and Hive.
  • Worked with project management, business teams and departments to assess and refine requirements to design/develop BI solutions using MS Azure.
  • Enforced referential integrity in the OLTP data model for consistent relationship between tables and efficient database design.
  • Worked on the full life cycle of a Data Lake and Data Warehouse with Big Data technologies such as Spark and Hadoop, and designed and deployed scalable, highly available, and fault-tolerant systems on Azure.
  • Established and developed new database maintenance scripts to automate Netezza database management and monitoring.
  • Integrated NoSQL databases such as HBase with MapReduce to move bulk data into HBase, and was involved in loading and transforming large data sets and analyzing them by running Hive queries.
  • Designed Star and Snowflake data models for the Enterprise Data Warehouse using ER Studio, and developed Conceptual/Logical/Physical data models for the ODS & dimensional delivery layer in Azure SQL Data Warehouse.
  • Utilized Informatica toolset (Informatica Data Explorer, and Informatica Data Quality) to analyze legacy data for data profiling.
  • Performed database health checks and tuned the databases using Teradata Manager, and performed MapReduce and Big Data work on Hadoop and other NoSQL platforms.
  • Developed, managed and validated existing data models including logical and physical models of the Data Warehouse and source systems utilizing a 3NF model.
  • Implemented logical and physical relational databases, maintained database objects in the data model using ER Studio, and used Star schema and Snowflake schema methodologies to build the Logical Data Model into Dimensional Models.
  • Involved in debugging and tuning PL/SQL code, tuning queries, and optimization for Oracle and DB2 databases.
  • Worked with data investigation, discovery and mapping tools to scan every single data record from many sources and involved in setting up and maintaining NoSQL Databases like Cassandra and HBase.
  • Developed several behavioral reports and data points creating complex SQL queries and stored procedures using SSRS and Excel.
  • Generated periodic reports based on statistical analysis of the data using SQL Server Reporting Services (SSRS), including reports using Global Variables, Expressions and Functions.
  • Developed different kind of reports such as Drill down, Drill through, Sub Reports, Charts, Matrix reports, Parameterized reports and Linked reports using SSRS.
  • Validated report data by writing SQL queries in PL/SQL Developer against the ODS, and was involved in user training sessions and assisting with UAT (User Acceptance Testing).
  • Involved in developing advanced ANSI SQL queries to extract, manipulate, and/or calculate information to fulfill data and reporting requirements, including identifying the tables and columns from which data is extracted.
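
Illustrative sketch of a Hadoop Streaming mapper of the kind mentioned in the Python scripts/mappers bullet above; the pipe-delimited field layout is a hypothetical example, not the project's actual record format.

    #!/usr/bin/env python
    # Hedged sketch of a Hadoop Streaming mapper: reads delimited records
    # from stdin and emits member_id<TAB>amount pairs for a downstream reducer.
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            continue                      # skip malformed records
        member_id, amount = fields[0], fields[2]
        try:
            float(amount)                 # keep only numeric amounts
        except ValueError:
            continue
        print(f"{member_id}\t{amount}")

Such a script would typically be submitted with the standard hadoop-streaming jar, passing it via the -mapper option alongside -input and -output paths.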

Technology: ER Studio 16.5, Oracle 12c, Python, MS Azure, Shell Scripting, UNIX, Hadoop, Hive, PIG, MongoDB, Cassandra, MapReduce, Windows 7, SQL, PL/SQL, T-SQL, DataStage, Agile, SSAS, Informatica, MDM, Teradata, MS Excel, MS Access, Metadata, SAS, SQL Server, Tableau, Netezza, ERP, SSRS, Teradata SQL Assistant, DB2, Netezza Aginity.

Data Modeler/ Data Analyst

Confidential - Chicago, IL

Responsibilities:

  • Gathered and analyzed business data requirements and modeled these needs, working closely with the users of the information, application developers and architects to ensure the information models were capable of meeting their needs.
  • Coordinated with Data Architects on provisioning AWS EC2 infrastructure and deploying applications behind Elastic Load Balancing.
  • Performed business area analysis and logical and physical data modeling for a Data Warehouse utilizing the Bill Inmon methodology, and designed a Data Mart application utilizing the Ralph Kimball Star Schema dimensional methodology.
  • Worked with AWS services such as EMR, S3 and CloudWatch to run and monitor jobs on AWS.
  • Designed and developed logical & physical data models and metadata to support the requirements using Erwin.
  • Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys) and physical database (capacity planning, object creation and aggregation strategies) for Oracle and Teradata as per business requirements using Erwin.
  • Worked on multiple Data Marts in an Enterprise Data Warehouse (EDW) project and was involved in designing OLAP data models that extensively used slowly changing dimensions (SCD).
  • Designed a 3rd normal form target data model, mapped it to the logical model, and was involved in extensive data validation using ANSI SQL queries and back-end testing.
  • Generated DDL statements for the creation of new Erwin objects such as tables, views, indexes, packages and stored procedures.
  • Designed MOLAP/ROLAP cubes on the Teradata database using SSAS, used SQL for querying the database in a UNIX environment, and created BTEQ, FastExport, MultiLoad, TPump and FastLoad scripts for extracting data from various production systems.
  • Developed automated procedures to produce data files using SQL Server Integration Services (SSIS), and performed data analysis and data profiling using complex SQL on various source systems including Oracle and Netezza.
  • Worked with AWS RDS, implementing models and data on RDS.
  • Developed mapping spreadsheets for the ETL team with source-to-target data mapping, including physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.
  • Used CA Erwin Data Modeler (Erwin) for Data Modeling (data requirements analysis, database design etc.) of custom developed information systems, including databases of transactional systems and data marts.
  • Identified and tracked slowly changing dimensions (SCD Type I, II, III & Hybrid/Type 6) and determined the hierarchies in dimensions (see the SCD Type 2 sketch after this list).
  • Worked on data integration and workflow application on SSIS platform and responsible for testing all new and existing ETL data warehouse components.
  • Designed Star schema and Snowflake schema on dimensions and fact tables, worked with the Data Vault methodology, and developed normalized Logical and Physical database models.
  • Transformed Logical Data Model to Physical Data Model ensuring the Primary Key and Foreign key relationships in PDM, Consistency of definitions of Data Attributes and Primary Index considerations.
  • Generated various reports using SQL Server Reporting Services (SSRS) for business analysts and the management team, and wrote and ran SQL, BI and other reports, analyzing data and creating metrics, dashboards, pivots, etc.
  • Worked with the ETL team to document transformation rules for data migration from OLTP to the warehouse for reporting purposes.
  • Involved in writing T-SQL and working on SSIS, SSRS, SSAS, data cleansing, data scrubbing and data migration.
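
Illustrative sketch of the SCD Type 2 expire-and-insert pattern referenced in the slowly-changing-dimension bullet above, using pyodbc against SQL Server; the DSN, table and column names are hypothetical placeholders.

    # Hedged sketch: expire changed rows in a customer dimension, then
    # insert new current versions from a staging table (SCD Type 2).
    import pyodbc

    conn = pyodbc.connect("DSN=EDW_DSN")      # hypothetical DSN
    cur = conn.cursor()

    # 1. Expire current rows whose tracked attributes changed in staging.
    cur.execute("""
        UPDATE d
        SET    d.row_end_dt   = GETDATE(),
               d.current_flag = 0
        FROM   dim_customer d
        JOIN   stg_customer s ON s.cust_id = d.cust_id
        WHERE  d.current_flag = 1
          AND  (d.state_cd <> s.state_cd OR d.segment <> s.segment)
    """)

    # 2. Insert a new current version for new or changed customers.
    cur.execute("""
        INSERT INTO dim_customer (cust_id, state_cd, segment,
                                  row_start_dt, row_end_dt, current_flag)
        SELECT s.cust_id, s.state_cd, s.segment, GETDATE(), NULL, 1
        FROM   stg_customer s
        LEFT   JOIN dim_customer d
               ON d.cust_id = s.cust_id AND d.current_flag = 1
        WHERE  d.cust_id IS NULL
    """)

    conn.commit()
    conn.close()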

Technology: SQL Server 2012, Erwin 9.1, Oracle, AWS EC2, AWS RDS, Informatica, JDBC, NoSQL, Spark, Scala, Python, MySQL, PostgreSQL, Teradata, SSRS, SSIS, SQL, DB2, Shell Scripting, Tableau, Excel, MDM, Agile.

Data Analyst

Confidential, India

Responsibilities:

  • Attended and participated in information and requirements gathering sessions and translated business requirements into working logical and physical data models for Data Warehouse, Data marts and OLAP applications.
  • Performed extensive Data Analysis and Data Validation on Teradata and designed Star and Snowflake Data Models for Enterprise Data Warehouse using ERWIN.
  • Created and maintained the Logical Data Model (LDM) for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
  • Integrated data from various data sources such as MS SQL Server, DB2, Oracle, Netezza and Teradata using Informatica to perform Extraction, Transformation and Loading (ETL), and worked on ETL development and data migration using SSIS, SQL*Loader and PL/SQL.
  • Created Entity/Relationship Diagrams, grouped and created the tables, validated the data, identified PKs for lookup tables.
  • Involved in designing and developing logical & physical data models and metadata to support the requirements using Erwin.
  • Involved in using the ETL tool Informatica to populate the database and transform data from the old database to the new Oracle database.
  • Involved in modeling (Star Schema methodologies) to build and design the logical data model into Dimensional Models, along with query performance tuning and index maintenance.
  • Involved in the creation and maintenance of the Data Warehouse and repositories containing metadata.
  • Wrote and executed unit, system, integration and UAT scripts in Data Warehouse projects.
  • Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, Data Warehouse, and data mart reporting systems in accordance with requirements (a reconciliation sketch follows this list).
  • Responsible for Creating and Modifying T-SQL stored procedures/triggers for validating the integrity of the data.
  • Worked on Data Warehouse concepts and dimensional data modelling using Ralph Kimball methodology.
  • Created a number of standard and complex reports to analyze data using Slice & Dice, Drill Down and Drill Through in SSRS.
  • Developed separate test cases for ETL process (Inbound & Outbound) and reporting.
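
Illustrative sketch of a source-to-target reconciliation check of the kind described in the data-movement verification bullet above; the DSNs, tables and measures are hypothetical placeholders.

    # Hedged sketch: compare simple aggregates between a source (OLTP) and
    # target (warehouse) connection to confirm data moved as expected.
    import pyodbc

    src = pyodbc.connect("DSN=OLTP_DSN").cursor()   # hypothetical source DSN
    tgt = pyodbc.connect("DSN=DWH_DSN").cursor()    # hypothetical target DSN

    checks = [
        # (label, source query, target query)
        ("row count",
         "SELECT COUNT(*) FROM orders",
         "SELECT COUNT(*) FROM fact_orders"),
        ("total amount",
         "SELECT SUM(order_amt) FROM orders",
         "SELECT SUM(order_amt) FROM fact_orders"),
    ]

    for label, src_sql, tgt_sql in checks:
        src_val = src.execute(src_sql).fetchone()[0]
        tgt_val = tgt.execute(tgt_sql).fetchone()[0]
        status = "OK" if src_val == tgt_val else "MISMATCH"
        print(f"{label}: source={src_val} target={tgt_val} -> {status}")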

Technology: Oracle 9i/10g, MS Visio, Microsoft SQL Server, SSRS, T-SQL, Rational Rose, Data Warehouse, OLTP, OLAP, Erwin, Informatica 9.x, Windows, SQL, PL/SQL, Talend Data Quality, Flat Files, SVN.
