Sr. Data Architect/Data Modeler Resume
Princeton, NJ
SUMMARY:
- Over 11 years of experience in Information Technology, with expertise in Data Architecture and Data Modeling/Data Analysis for Data Warehouse/Data Mart development, Online Transaction Processing (OLTP), Data Warehousing (OLAP), and Business Intelligence (BI) applications.
- Experience in Agile/Scrum and Waterfall methodologies across complete data warehouse life-cycle projects.
- Experience in importing and exporting data between HDFS and Relational Database Management Systems (RDBMS) using Sqoop.
- Expertise in developing solutions around NoSQL databases such as MongoDB and HBase.
- Extensive expertise in Data Warehousing on different databases, as well as in logical and physical data modeling using tools such as Erwin, PowerDesigner, and ER/Studio.
- Skilled in scientific application design and development and related data-processing technologies: VBA, Pipeline Pilot, R, Python, and web service integrations.
- Deep experience in SSIS, SSRS, and SSAS deployments and in L1/L3 support maintenance, including the required services, proxies, and granular permissions for regional security levels.
- Experience with application development languages and tools such as Java, HTML, and CSS.
- Experienced on projects utilizing the Hadoop ecosystem and its tools: HDFS, MapReduce, Sqoop, Hive, Pig, Flume, and Oozie.
- Extensive experience in developing and driving the strategic direction of Master Data Management (MDM).
- Experience in AWS Redshift database design and development and in AWS S3 development.
- Experience in the Big Data Hadoop ecosystem: ingestion, storage, querying, processing, and analysis of big data.
- Experienced in big data analysis and in developing data models using Hive, Pig, MapReduce, and SQL, with strong data architecture skills for designing data-centric solutions.
- Practical understanding of dimensional and relational data modeling concepts such as star-schema modeling, snowflake-schema modeling, and fact and dimension tables.
- Experienced in integrating various relational and non-relational sources such as DB2, Teradata, Oracle, SQL Server, XML, and flat files into a Netezza database.
- Experienced using MapReduce and big data tooling on Hadoop and other NoSQL platforms.
- Expertise in designing the architecture of the Extract, Transform, Load (ETL) environment and in developing transformation standards and processes using ETL best practices.
- Experienced in Teradata utilities such as FastLoad, MultiLoad, BTEQ, and Teradata SQL Assistant.
- Experience in BI/DW solutions (ETL, OLAP, data marts), Informatica, and BI reporting tools such as Tableau and QlikView; also experienced leading teams of application, ETL, and BI developers and testers.
- Extensive ETL testing experience using Informatica 9.x/8.x, Talend, and Pentaho.
- Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export.
- Strong experience architecting high-performance databases using PostgreSQL, PostGIS, MySQL, and Cassandra.
- Experience in designing Enterprise Data Warehouses, Data Marts, Reporting Data Stores (RDS), and Operational Data Stores (ODS).
- Designed architecture to meet data governance processes (master data, metadata, data quality, and data security) and to ensure data standards and policies are met.
- Excellent grasp of the Software Development Life Cycle (SDLC), with good working knowledge of testing methodologies, disciplines, tasks, resources, and scheduling.
- Worked on background processes in the Oracle architecture, drilling down to the lowest levels of systems design and construction.
- Experienced in batch processes, import, export, backup, database monitoring tools, and application support.
- Developed and managed SQL, Python, and R code bases for data cleansing and data analysis using Git version control.
- Excellent experience writing SQL queries to validate data movement between different layers in a data warehouse environment (a minimal sketch follows this list).
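Below is a minimal sketch of the kind of layer-to-layer validation query described in the last bullet, comparing row counts and a summed measure between a staging table and its warehouse counterpart. The table and column names (stg_orders, dw_orders, load_dt, order_amt) are hypothetical placeholders, not from any specific engagement.

```sql
-- Reconcile a staging layer against the warehouse layer per load date:
-- any load where counts or totals diverge is reported for investigation.
SELECT COALESCE(s.load_dt, w.load_dt) AS load_dt,
       s.row_cnt                      AS stg_rows,
       w.row_cnt                      AS dw_rows,
       s.total_amt - w.total_amt      AS amt_diff
FROM  (SELECT load_dt, COUNT(*) AS row_cnt, SUM(order_amt) AS total_amt
       FROM stg_orders GROUP BY load_dt) s
FULL OUTER JOIN
      (SELECT load_dt, COUNT(*) AS row_cnt, SUM(order_amt) AS total_amt
       FROM dw_orders GROUP BY load_dt) w
  ON  s.load_dt = w.load_dt
WHERE s.load_dt IS NULL               -- load missing in staging
   OR w.load_dt IS NULL               -- load missing in the warehouse
   OR s.row_cnt <> w.row_cnt
   OR s.total_amt <> w.total_amt;
```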
TECHNICAL SKILLS:
Big Data technologies: MapReduce, HBase, HDFS, Sqoop, Spark, Hadoop, Hive, PIG, Impala.
Data Modeling Tools: ER/Studio 9.7/9.0, Erwin 9.6/9.5, Sybase PowerDesigner
Cloud Architecture: AWS (Redshift, EMR, S3)
OLAP Tools: Tableau, SAP BO, SSAS, Business Objects, and Crystal Reports 9/7
Programming Languages: SQL, PL/SQL, UNIX shell scripting, Perl, AWK, sed
Databases: Oracle 12c/11g, Teradata R15/R14, MS SQL Server 2016/2014, DB2.
Testing and defect tracking Tools: HP/Mercury (Quality Center, WinRunner, QuickTest Professional, Performance Center), RequisitePro, MS Visio, Visual SourceSafe
Operating System: Windows, Unix, Sun Solaris
ETL/Data warehouse Tools: Informatica 9.6/9.1, SAP Business Objects XIR3.1/XIR2, Talend, Tableau, Pentaho.
Methodologies: Agile, RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Ralph Kimball and Bill Inmon methodologies, Waterfall Model.
PROFESSIONAL EXPERIENCE:
Confidential - Princeton, NJ
Sr. Data Architect/Data Modeler
Responsibilities:
- Designed architecture collaboratively to develop methods of synchronizing data coming in from multiple source systems
- Researched, evaluated, architected, and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for our clients.
- Implemented Agile Methodology for building Integrated Data Warehouse, involved in multiple sprints for various tracks throughout the project lifecycle.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Worked with normalization and de-normalization concepts and design methodologies such as the Ralph Kimball and Bill Inmon data warehouse approaches.
- Developed prototype solutions to verify capabilities for new systems development, enhancement, and maintenance of MDM
- Reviewed the Conceptual EDW (Enterprise Data Warehouse) Data Model with Business Users, App Dev. and Information Architects to make sure all the requirements are fully covered.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Involved in several facets of MDM implementations including Data Profiling, Metadata acquisition and data migration.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
- Involved in Normalization/De-normalization techniques for optimum performance in relational and dimensional database environments.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop (a Hive sketch of this pattern follows this role's environment line).
- Developed ETL processes that extracted data daily and loaded it into an SSIS-based Decision Support Warehouse.
- Responsible for Metadata Management, keeping up to date centralized metadata repositories using Erwin modeling tools.
- Drove the technical design of AWS solutions by working with customers to understand their needs.
- Conducted numerous POCs (Proof of Concepts) to efficiently import large data sets into the database from AWS S3 Bucket.
- Worked on analyzing source systems and their connectivity, discovery, data profiling and data mapping.
- Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from Teradata database.
- Collected large amounts of log data using Apache Flume and aggregated it in HDFS using Pig for further analysis.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Designed and architected AWS Cloud solutions for data and analytical workloads such as warehouses, Big Data, data lakes, real-time streams, and advanced analytics.
- Interacted with End-users for gathering Business Requirements and Strategizing the Data Warehouse processes
- Wrote complex Netezza views to improve performance and push the load down to the database rather than doing it in the ETL tool.
- Involved in data model reviews with internal data architect, business analysts, and business users with explanation of the data model to make sure it is in-line with business requirements.
- Created DDL scripts using ER Studio and source to target mappings to bring the data from source to the warehouse.
- Worked with MapReduce frameworks such as Hadoop and associated tools (Pig, Sqoop, etc.).
- Used ETL methodology to support data extraction, transformation, and load processing in a complex MDM environment using Informatica.
- Generated the framework model from IBM Data Architect for the Cognos reporting team.
Environment: ER/Studio 17, Netezza, SQL Server 2016, Teradata 15, OLAP, OLTP, UNIX, MDM, Hadoop, Hive, Pig, HBase, HDFS, SAP, AWS, Redshift, EMR, S3, Apache Flume, Kimball and Inmon methodologies, PL/SQL, BTEQ, Python.
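As a companion to the Sqoop/Hive work above, here is a hedged HiveQL sketch of landing Sqoop-extracted files in HDFS behind an external table and aggregating them into a partitioned ORC table. All paths, table names, and columns (raw_txn, txn_daily, /data/landing/oracle/txn) are invented for illustration.

```sql
-- Hive settings needed for the dynamic-partition insert below.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- External table over comma-delimited files landed by Sqoop.
CREATE EXTERNAL TABLE IF NOT EXISTS raw_txn (
  txn_id     BIGINT,
  account_id BIGINT,
  txn_amt    DECIMAL(18,2),
  txn_dt     STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/landing/oracle/txn';

-- Curated, partitioned ORC table for downstream SQL access.
CREATE TABLE IF NOT EXISTS txn_daily (
  account_id BIGINT,
  total_amt  DECIMAL(18,2),
  txn_cnt    BIGINT
)
PARTITIONED BY (txn_dt STRING)
STORED AS ORC;

-- Aggregate raw transactions into one row per account per day;
-- the dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE txn_daily PARTITION (txn_dt)
SELECT account_id, SUM(txn_amt), COUNT(*), txn_dt
FROM raw_txn
GROUP BY account_id, txn_dt;
```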
Confidential - Broadway, NY
Sr. Data Architect/Data Modeler
Responsibilities:
- Heavily involved in the Data Architect role, reviewing business requirements and composing source-to-target data mapping documents.
- Involved in relational and dimensional data modeling, creating logical and physical designs of the database and ER diagrams using data modeling tools such as Erwin.
- Created the data model for the Subject Area in the Enterprise Data Warehouse (EDW).
- Applied Data Governance rules (primary qualifier, class words and valid abbreviation in Table name and Column names).
- Worked on AWS Redshift and RDS, implementing data models and data on both platforms.
- Involved in several facets of MDM implementations including Data Profiling, Metadata acquisition and data migration.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra; implemented a multi-data-center, multi-rack Cassandra cluster.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture.
- Worked on Normalization and De-Normalization techniques for both OLTP and OLAP systems.
- Worked closely with the Product Owner and Solution Architects to find the optimal database design solution from an ETL and data governance perspective.
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Performed Data mapping between source systems to Target systems, logical data modeling, created class diagrams and ER diagrams and used SQL queries to filter data
- Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL access to Hadoop data.
- Automated SSIS packages for production deployment with XML configurations.
- Developed historical/incremental SSIS packages using the SCD Type 2 concept of the star schema (an SCD Type 2 sketch in SQL follows this role's environment line).
- Worked on the OLAP model, based on dimensions and facts, for efficient loads of data at all reporting levels, using multi-dimensional models such as star and snowflake schemas.
- Worked with Teradata utilities (BTEQ, FastLoad, FastExport, MultiLoad, and TPump) on both Windows and mainframe platforms.
- Designed both 3NF Data models for DB2 systems and dimensional Data models using Star and Snowflake Schemas.
- Extensively used Metadata & Data Dictionary Management, Data Profiling and Data Mapping.
- Performed performance tuning of SSIS packages using row (non-blocking), semi-blocking, and fully blocking transformations.
- Designed both 3NF data models for ODS and OLTP systems and dimensional data models using star and snowflake schemas.
- Wrote and executed SQL queries to verify that data has been moved from transactional system to DSS, Data warehouse, data mart reporting system in accordance with requirements.
- Conducted design walk through sessions with Business Intelligence team to ensure that reporting requirements are met for the business.
- Performed Extract, Transform, and Load (ETL) of data from flat files and MS SQL Server using SSIS.
- Involved in debugging and tuning PL/SQL code and optimizing queries for the SQL database.
- Led data migration from legacy systems into modern data integration frameworks from conception to completion.
- Worked on Tableau for insight reporting and data visualization
- Extracted data from IBM Cognos to create automated visualization reports and dashboards on Tableau.
- Developed and implemented data cleansing, data security, data profiling and data monitoring processes.
- Generated DDL and created the tables and views in the corresponding architectural layers.
- Facilitated in developing testing procedures, test cases and User Acceptance Testing (UAT).
- Created PL/SQL tables, collections, records, and partitions, and used dynamic SQL and triggers for faster data access and to incorporate business logic.
- Worked extensively with developers and the ETL team to enhance the models, and coordinated with the DBA to implement those changes in applications and databases.
- Conducted meetings with the BA to answer questions raised by the offshore team, as well as questions that arose during the data architecture work.
Environment: Erwin r9.5, Cognos, SQL Server 2016, DB2, SSIS, OLAP, OLTP, Linux, MDM, Hadoop, Hive, Pig, HBase, SAP, AWS, Redshift, PL/SQL, ETL.
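A hedged T-SQL sketch of the SCD Type 2 load pattern referenced above: expire the current dimension row when attributes change, then insert a fresh current version. dim_customer, stg_customer, and the tracked attributes (city, segment) are hypothetical, not the actual project schema.

```sql
-- Step 1: close out current rows whose attributes changed in staging.
UPDATE d
SET    d.current_flag = 'N',
       d.end_dt       = CAST(GETDATE() AS DATE)
FROM   dim_customer AS d
JOIN   stg_customer AS s
       ON s.customer_id = d.customer_id
WHERE  d.current_flag = 'Y'
  AND (d.city <> s.city OR d.segment <> s.segment);

-- Step 2: insert a new current version for customers with no open row
-- (brand-new customers, plus those just expired in step 1).
INSERT INTO dim_customer
       (customer_id, city, segment, start_dt, end_dt, current_flag)
SELECT  s.customer_id, s.city, s.segment,
        CAST(GETDATE() AS DATE), NULL, 'Y'
FROM    stg_customer AS s
LEFT JOIN dim_customer AS d
       ON d.customer_id = s.customer_id
      AND d.current_flag = 'Y'
WHERE   d.customer_id IS NULL;
```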
Confidential - McLean, VA
Sr. Data Modeler /Data Analyst
Responsibilities:
- Designed the procedures for getting the data from all systems to Data Warehousing system. The data was standardized to store various Business Units in tables.
- Handled data from many sources using Excel and SQL queries
- Managed internal and external reporting requests while helping automate processes using data from multiple sources to an internal data warehouse
- Assisted in data review of service failure investigations, providing this information to the Quality Manager for processing.
- Improved SQL query performance using explain plans, hints, and indexes for tuning (a tuning sketch follows this role's environment line); created DDL scripts for the database, along with PL/SQL procedures and triggers.
- Performed data mining on the data using very complex SQL queries and discovered patterns.
- Used SQL to query the database in a UNIX environment.
- Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to target tables.
- Wrote and executed unit, system, integration, and UAT scripts in data warehouse projects.
- Responsible for different Data mapping activities from Source systems to Teradata
- Wrote and executed SQL queries to verify that data has been moved from transactional system to DSS, Data warehouse, data mart reporting system in accordance with requirements.
- Involved with Teradata utilities such as FastLoad, BTEQ, MultiLoad, and Teradata SQL Assistant.
- Created a high-level, industry-standard, generalized data model and converted it into logical and physical models at later stages of the project using Erwin.
- Designed and developed Use Cases, Activity Diagrams, Sequence Diagrams, OOD (Object oriented Design) using UML and Visio.
- Created a logical design and physical design in ER Studio.
- Created DDL scripts using ER Studio and source to target mappings to bring the data from source to the warehouse.
- Performed data analysis and data profiling using complex SQL on various source systems, including Oracle 10g and Teradata, to ensure accuracy of the data between the warehouse and source systems.
- Helped the testing team in creating the test plans and test scripts. Assisted the users in UAT testing by providing test scenarios and test data.
Environment: ER/Studio 8.0, OOD, OLAP, OLTP, Teradata 13, MS Excel, Oracle 10g, SQL, PL/SQL, MS Visio, SQL*Loader.
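A hedged sketch of the explain-plan/hint/index tuning loop mentioned above, in Oracle SQL. The orders table, the ix_orders_cust_dt index, and the bind variable are invented for illustration.

```sql
-- Capture and display the optimizer's plan for a slow query.
EXPLAIN PLAN FOR
SELECT o.order_id, o.order_amt
FROM   orders o
WHERE  o.customer_id = :cust_id
AND    o.order_dt >= DATE '2013-01-01';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- A composite index matching both predicates lets the optimizer
-- avoid a full table scan.
CREATE INDEX ix_orders_cust_dt ON orders (customer_id, order_dt);

-- If the optimizer still chooses a scan, force the index to compare plans.
SELECT /*+ INDEX(o ix_orders_cust_dt) */ o.order_id, o.order_amt
FROM   orders o
WHERE  o.customer_id = :cust_id
AND    o.order_dt >= DATE '2013-01-01';
```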
Confidential - Milwaukee, WI
Sr. Data Analyst/Data Modeler
Responsibilities:
- Developed logical data models and physical database design and generated database schemas using Erwin 8.0.
- Performed data analysis and data profiling using complex SQL on various sources systems including Oracle and MS SQL Server.
- Documented all data mapping and transformation processes in the functional design documents based on the business requirements.
- Prepared high-level logical data models using Erwin, and later translated them into physical models using the forward engineering technique.
- Generated DDL (Data Definition Language) scripts using Erwin and assisted the DBA in the physical implementation of the data models.
- Translated business requirements into working logical and physical data models for OLTP & OLAP systems.
- Generated SQL scripts and implemented the relevant databases with the related properties: keys, constraints, indexes, and sequences.
- Used Reverse Engineering to connect to existing database and developed process methodology for the Reverse Engineering phase of the project.
- Developed the batch program in PL/SQL for the OLTP processing and used UNIX shell scripts to run it via crontab.
- Performed extensive data profiling and data analysis to detect and correct inaccurate data in the databases and to track data quality.
- Provided guidance and solution concepts for multiple projects focused on data governance and master data management.
- Created DDL scripts using Erwin and source to target mappings to bring the data from source to the warehouse.
- Designed and developed SAS macros, applications and other utilities to expedite SAS Programming activities.
- Involved in writing T-SQL and in working on SSIS, SSRS, and SSAS, including data cleansing, data scrubbing, and data migration (a de-duplication sketch follows this role's environment line).
- Analyzed and gathered requirements from business stakeholders and management, using the business requirements document to prioritize their needs.
- Responsible for backing up the data, writing stored procedures, and writing ad-hoc queries for data mining.
- Developed and maintained data dictionary to create metadata reports for technical and business purpose.
- Created and monitored workflows using Workflow Designer and Workflow Monitor.
- Performed extensive data validation by writing several complex SQL queries, and was involved in back-end testing and in working through data quality issues.
- Used SSRS to generate reports from databases, including sub-reports, drill-down, drill-through, and parameterized reports.
- Developed PL/SQL scripts to validate and load data into interface tables, and maintained data integrity between the Oracle and SQL Server databases.
- Worked heavily on SQL query optimization, tuning queries and reviewing their performance metrics.
- Performed data mapping and data design (data modeling) to integrate data across multiple databases into the EDW.
- Collaborated with the Relationship Management and Operations teams to develop and present KPIs to top-tier clients.
Environment: Erwin 8.0, Oracle 9i, MS SQL Server 2008, SSRS, OLAP, OLTP, MS Excel, Flat Files, PL/SQL, SQL, IBM Cognos, Tableau.
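A hedged T-SQL sketch of the data-scrubbing step referenced above: removing duplicate rows while keeping the most recently updated record. The stg_customer table and its columns are hypothetical.

```sql
-- Rank duplicates within each customer_id, newest first, then delete
-- everything but the top-ranked row.
;WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id      -- the duplicate key
               ORDER BY     updated_at DESC  -- keep the newest row
           ) AS rn
    FROM   stg_customer
)
DELETE FROM ranked
WHERE  rn > 1;
```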
Confidential
Data Analyst/Data Modeler
Responsibilities:
- Part of the team responsible for the analysis, design and implementation of the business solution.
- Coordinated with DBA on database build and table normalizations and de-normalizations.
- Maintained the quality of data analysis, researched output and reporting, and ensured that all deliverables met specified requirements
- Used Data Vault as both a data loading technique and a methodology that accommodates historical data, auditing, and tracking of data.
- Handled the functional and practical implementation of data governance, and was responsible for designing common data governance frameworks.
- Worked extensively with the Erwin Data Modeler tool to design the data models.
- Designed ODS, and Data Vault with expertise in Loan and all types of Cards.
- Worked on enhancing the existing Teradata processes running on the Data Warehouse.
- Improved performance of the existing Data Warehouse applications to increase the efficiency of the system.
- Worked extensively on data quality (running data profiling and examining profile outcomes) and metadata management.
- Performed Data Analysis, Data Validation and Data verification using Informatica DVO (Data Validation Option) from raw data to user acceptance.
- Created data masking mappings to mask sensitive data between the production and test environments (a masking sketch follows this role's environment line).
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Applied data analysis, data mining and data engineering to present data clearly.
- Involved in administrative tasks, including creation of database objects such as database, tables, and views, using SQL, DDL, and DML requests.
- Involved in Teradata SQL development, unit testing, and performance tuning, ensuring testing issues were resolved using defect reports.
- Tested the ETL process both before and after the data validation process, including the messages published by the ETL tool and the data loaded into various databases.
- Worked on multiple issues raised by different users and consumers of the Data Warehouse, aiding them in analyzing and modifying the queries used to pull reports.
- Worked heavily on SQL query optimization, tuning queries and reviewing their performance metrics.
- Created mapping specification spreadsheets to document the transformation logic
- Worked with Model Manager and multiple data marts, handling multiple subject areas, domains, and naming standards simultaneously.
- Developed and maintained the central repository, populated it with the metadata.
Environment: Erwin 7.3, Netezza, Oracle 9i, SQL, PL/SQL, Teradata 12, T-SQL, Metadata, SQL Server, SSIS, SSRS, MS Access, Excel, Flat Files, ODS, OLAP, OLTP, Crystal Reports.
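A hedged SQL sketch of the production-to-test masking mappings mentioned above, using deterministic scrambling so masked values stay consistent across refreshes. All schema, table, and column names (prod_db.customer, test_db.customer, ssn, email) are invented for illustration.

```sql
-- Copy customers into the test environment with sensitive columns
-- replaced by derived, non-identifying values.
INSERT INTO test_db.customer (customer_id, full_name, ssn, email)
SELECT customer_id,
       'Customer_' || CAST(customer_id AS VARCHAR(12)),              -- scrambled name
       'XXX-XX-' || SUBSTR(ssn, 8, 4),                               -- keep last 4 digits only
       'user' || CAST(customer_id AS VARCHAR(12)) || '@example.com'  -- synthetic email
FROM   prod_db.customer;
```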