Sr. Data Architect/ Data Modeler Resume
San Antonio, TX
SUMMARY
- 9+ years of IT industry experience in application design, development, and data management, including Data Governance, Data Architecture, Data Modeling, Data Warehousing and BI, Data Integration, Metadata, Reference Data, and MDM.
- Experience architecting UML models and leveraging advanced executable code generators to target different domains.
- 3+ years of IBM DataStage development experience.
- 2+ years of unit and functional testing and debugging.
- Strong experience with the Big Data Hadoop ecosystem for ingestion, storage, querying, processing, and analysis of big data.
- Experience in Dimensional Data Modeling, Star/Snowflake schema, FACT & Dimension tables.
- Experience in designing, developing, documenting, and testing ETL jobs and mappings (Server and Parallel jobs) in DataStage to populate tables in Data Warehouses and Data Marts.
- Expertise in UNIX shell scripting (Korn shell) for process automation and DataStage job scheduling.
- Knowledge of programming languages such as Java and Python.
- Experience with emerging technologies such as Big Data, Hadoop, and NoSQL.
- Strong experience in analyzing and transforming large data sets by writing Pig scripts and Hive queries on AWS EMR and AWS RDS. Extensive knowledge of Hadoop stack components such as Apache Hive and Pig.
- Experience in analyzing data using the Hadoop ecosystem, including HDFS, Hive, Spark, Spark Streaming, Elasticsearch, Kibana, Kafka, HBase, ZooKeeper, Pig, Sqoop, and Flume.
- Hands-on experience with Normalization (1NF, 2NF, 3NF, and BCNF) and Denormalization techniques for effective and optimum performance in OLTP and OLAP environments.
- Experience in cloud development architecture on Amazon AWS (EC2, S3, Elasticsearch, Redshift) and basic experience on Azure.
- Experience in BI/DW solutions (ETL, OLAP, Data Marts), Informatica, and BI reporting tools such as Tableau and QlikView; experienced leading teams of application, ETL, and BI developers as well as testing teams.
- Experience with Agile Extreme Programming (XP) and Scrum lifecycle practices, including pair programming, test-driven development, continuous integration, iterative delivery, and retrospectives.
- Good experience working with different ETL tool environments such as SSIS and Informatica, and reporting tool environments such as SQL Server Reporting Services (SSRS), Cognos, and Business Objects.
- Knowledge of data transformation using data mapping and data processing in Apache Beam.
- Proficient in UML modeling, including Use Case, Activity, and Sequence Diagrams, with Rational Rose and MS Visio.
- Experienced in technical consulting and end-to-end delivery spanning architecture, data modeling, data governance, and the design, development, and implementation of solutions.
- Solid knowledge of Data Marts, Operational Data Stores (ODS), OLAP, and Dimensional Data Modeling with the Ralph Kimball methodology (Star Schema and Snowflake modeling for FACT and Dimension tables) using Analysis Services; a minimal star-schema sketch follows this summary.
- Solid hands-on experience in creating and implementing Conceptual, Logical, and Physical models for Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP). Efficient in developing Logical and Physical data models and organizing data per business requirements using Erwin and ER Studio in both OLTP and OLAP applications.
- Worked on data modeling using ERWIN tool to build logical and physical models.
- Skilled in data analysis using SQL on Oracle, MS SQL Server, DB2, and Teradata.
- Extensive experience in development of T-SQL, Oracle PL/SQL Scripts, Stored Procedures and Triggers for business logic implementation.
- Decoded Teradata and SQL queries to identify all data attributes involved and documented them for development purposes.
- Excellent understanding of hub architecture styles for MDM hubs: registry, repository, and hybrid approaches.
- Mapped risk data elements to authoritative data sources and documented schema, database, and table details for data modeling purposes.
- Good exposure to the usage of NoSQL databases.
- Experienced in analyzing ETL framework metadata to understand the current-state ETL implementation.
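Illustrative sketch referenced above (hypothetical, not drawn from any specific engagement): a minimal star schema with one FACT table joined to two dimension tables through surrogate keys, expressed in Python with SQLite so it runs self-contained. The table and column names (dim_date, dim_customer, fact_sales) are assumptions for illustration only.

```python
import sqlite3

# Minimal star-schema sketch: a FACT table keyed to two dimension tables via
# surrogate keys. All table/column names are hypothetical placeholders.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,  -- surrogate key
    calendar_date TEXT NOT NULL,
    month         INTEGER,
    year          INTEGER
);

CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,  -- surrogate key
    customer_id   TEXT NOT NULL,        -- natural/business key
    customer_name TEXT
);

CREATE TABLE fact_sales (
    date_key      INTEGER NOT NULL REFERENCES dim_date(date_key),
    customer_key  INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    quantity      INTEGER,
    sales_amount  REAL
);
""")

# Typical analytical query: aggregate the fact by joining out to the dimensions.
rows = con.execute("""
    SELECT d.year, c.customer_name, SUM(f.sales_amount) AS total_sales
    FROM fact_sales f
    JOIN dim_date d     ON d.date_key = f.date_key
    JOIN dim_customer c ON c.customer_key = f.customer_key
    GROUP BY d.year, c.customer_name
""").fetchall()
print(rows)
```

A snowflake variant would simply normalize a dimension further, for example splitting geography out of dim_customer into its own table.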
TECHNICAL SKILLS
Data Modeling Tools: Erwin R6/R9, Rational System Architect, IBM InfoSphere Data Architect, ER Studio, and Oracle Designer.
Database Tools: Microsoft SQL Server 12.0, Teradata 15.0, Oracle 12c/11g/9i, and MS Access
BI Tools: Tableau 7.0/8.2, Tableau Server 8.2, Tableau Reader 8.1, SAP Business Objects (BO 4.1), Crystal Reports
Big Data: PIG, Hive, HBase, Spark, Sqoop, Flume.
Cloud Platforms: AWS EMR, AWS RDS, EC2, S3, Azure.
Packages: Microsoft Office 2010, Microsoft Project 2010, SAP, Microsoft Visio, and SharePoint.
Operating Systems: Windows, CentOS, Sun Solaris, UNIX, Ubuntu Linux
Version Control Tools: VSS, SVN, CVS
Tools & Utilities: TOAD 9.6, Microsoft Visio 2010.
Methodologies: RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Waterfall Model.
PROFESSIONAL EXPERIENCE
Confidential, San Antonio, TX
Sr. Data Architect/ Data Modeler
Responsibilities:
- Owned and managed all changes to the data models. Created data models, solution designs and data architecture documentation for complex information systems.
- Architected, researched, evaluated and deployed new tools, frameworks, and patterns to build sustainable Big Data platforms for our clients.
- Designed the Logical Data Model using Erwin r9.64, with entities and attributes for each subject area.
- Used Tableau for BI Reporting and Data Analysis.
- Worked as a Sr. Data Architect/Modeler to generate data models using Erwin r9.64 and developed relational database systems.
- Architected solutions using MS Azure PaaS services such as SQL Server, HDInsight, and Service Bus.
- Implemented a CI/CD-based application development methodology using tools such as Jenkins, TFS, and PowerShell.
- Provided technical oversight and guidance during client engagement execution.
- Provided Cloud/Azure thought leadership through regular publications and speaking engagements.
- Developed and maintained data architecture, including master data and data quality, using Toad Data Modeler, Microsoft Master Data Services (MDS), and Oracle Data Integrator.
- Configured Hunk to read customer transaction data from Hadoop ecosystem components such as HDFS and Hive.
- Used DataStage as an ETL tool to extract data from source systems and load it into the Oracle database.
- Designed and developed DataStage jobs to extract data from heterogeneous sources, applied transformation logic to the extracted data, and loaded it into Data Warehouse databases.
- Created DataStage jobs using stages such as Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Sample, Surrogate Key, Column Generator, and Row Generator.
- Designed fact and dimension tables and defined the relationships between them using Star and Snowflake schemas in SSAS.
- Used Flume extensively in gathering and moving log data files from Application Servers to a central location in Hadoop Distributed File System (HDFS) for data science.
- Involved in Normalization / Denormalization techniques for optimum performance in relational and dimensional database environments.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Worked with project management, business teams, and departments to assess and refine requirements and to design/develop BI solutions using Azure.
- Created user-friendly, dynamically rendered custom dashboards to visualize output data using Pentaho CDE and CDF.
- Created various chart reports in Pentaho Business Analytics, including Pie, 3D Pie, Line, Bar, Stacked Bar, and Percentage Bar charts.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Guided teams (onsite and offshore) in creating unit and integration tests. Set up deployment plans and dependency management using Maven. Set up Jenkins Continuous Integration jobs for automated deployments to integration servers.
- Collected large amounts of log data using Apache Flume and aggregated it using Pig/Hive in HDFS for further analysis.
- Created Logical and Physical Data Model using IBM Data Architect tool.
- Specified the overall Data Architecture for all areas and domains of the enterprise, including Data Acquisition, ODS, MDM, Data Warehouse, Data Provisioning, ETL, and BI.
- Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL-like access to Hadoop data.
- Designed star-schema denormalized tables on Azure.
- Designed the Big Data platform technology architecture, with scope including data intake, data staging, data warehousing, and a high-performance analytics environment.
- Utilized U-SQL for data analytics and ingestion of raw data in Azure and Blob Storage.
- Developed and implemented data cleansing, data security, data profiling and data monitoring processes.
- Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis.
- Responsible for Dimensional Data Modeling and Modeling Diagrams using ERWIN.
- Applied data analysis, data mining and data engineering to present data clearly.
- Converted existing Hive queries to Spark SQL queries to reduce execution time (a minimal sketch follows this job entry).
- Demonstrated expertise with ETL tools, including SQL Server Integration Services (SSIS), Data Transformation Services (DTS), and DataStage, as well as ETL package design and RDBMS platforms such as SQL Server, Oracle, and DB2.
- Reviewed and patched Netezza and Oracle environments, including DB2, OS, and server firmware.
- Extensively used SAP Crystal Reports 14.2 for data reporting.
- Gathered and analyzed existing physical data models for in scope applications and proposed the changes to the data models according to the requirements.
- Used Teradata Administrator and Teradata Manager tools to monitor and control the system.
- Developed and configured the Informatica MDM Hub to support the Master Data Management (MDM), Business Intelligence (BI), and Data Warehousing platforms to meet business needs.
- Developed PL/SQL scripts to validate and load data into interface tables.
- Participated in maintaining data integrity between Oracle and SQL databases.
- Participated in building an OLAP model based on dimensions and FACTs for efficient data loads, using a Star Schema structure and multi-dimensional models such as Star and Snowflake schemas for report levels.
Environment: Oracle 12c, MS Office, SQL Architect, Spark, TOAD Benchmark Factory, Teradata v15, Hadoop, SQL Loader, SharePoint, Erwin r9.64, DB2, SQL Server 2008/2012, Azure, HBase, Hive.
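Illustrative sketch for the Hive-to-Spark SQL conversion noted above (hypothetical): the same aggregation a HiveQL query would express, submitted through PySpark's SQL engine instead. A configured Hive metastore is assumed, and the table name default.transactions and its columns are placeholders.

```python
from pyspark.sql import SparkSession

# Hypothetical example: the Hive table (default.transactions) and its columns
# (account_id, amount) are placeholders; a configured Hive metastore is assumed.
spark = (
    SparkSession.builder
    .appName("hive-to-spark-sql")
    .enableHiveSupport()   # lets Spark SQL read existing Hive tables
    .getOrCreate()
)

# The same HiveQL aggregation, now executed by the Spark SQL engine rather
# than Hive's MapReduce runtime, which is where the execution-time savings come from.
totals = spark.sql("""
    SELECT account_id, SUM(amount) AS total_amount
    FROM default.transactions
    GROUP BY account_id
""")

totals.show(10)
```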
Confidential, Hunt Valley, MD
Sr. Data Modeler
Responsibilities:
- Involved in a Data Architect role to review business requirements and compose source-to-target data mapping documents.
- Designed and built relational database models and defined data requirements to meet business needs.
- Worked with Data Steward Team for designing, documenting and configuring Informatica Data Director for supporting management of MDM data.
- Actively involved in the Design and development of the Star schema data model.
- Implemented slowly changing and rapidly changing dimension methodologies; created aggregate fact tables for the creation of ad-hoc reports.
- Created and maintained surrogate keys on the master tables to handle SCD Type 2 changes effectively (a minimal sketch follows this job entry).
- Developed Star and Snowflake schemas based dimensional model to develop the data warehouse.
- Set up automated code review, testing, and deployment pipelines using Jenkins and Bitbucket for continuous integration and deployment (CI/CD).
- Worked with the developers in deciding the application architecture.
- Designed and implemented a Data Lake to consolidate data from multiple sources, using Hadoop stack technologies such as Sqoop and Hive/HQL.
- Wrote complex SQL queries to validate data against various reports generated by Business Objects XI R2.
- Designed Logical Data Models and Physical Data Models using ER Studio.
- Designed semantic layer data model. Conducted performance optimization for BI infrastructure.
- Used DataStage Designer to develop processes for extracting, cleansing, transforming, integrating, and loading data into staging tables.
- Worked with metadata definitions and the import and export of DataStage jobs using DataStage Manager.
- Created source table definitions in the DataStage Repository.
- Involved in the creation and maintenance of the Data Warehouse and repositories containing metadata.
- Performed Hive programming for applications that were migrated to big data using Hadoop.
- As an Architect, implemented an MDM hub to provide clean, consistent data for a SOA implementation.
- Installed and configured a 3-node cluster on AWS EC2 Linux servers.
- Designed different types of STAR schemas, such as detailed data marts, plan data marts, and monthly summary data marts, using ER Studio with various dimensions (Time, Services, Customers) and FACT tables.
- Developed and maintained a data dictionary to create metadata reports for technical and business purposes.
- Performed ETL processing using Pig and Hive on AWS EMR and S3.
- Implemented the Data Vault modeling concept, which addresses change in the environment by separating business keys and the associations between them from the descriptive attributes of those keys, using Hub, Link, and Satellite tables.
- Performed extensive data validation by writing complex SQL queries, and was involved in back-end testing and resolving data quality issues.
- Performed data profiling, mapping, and integration from multiple sources to AWS S3.
- Created single value as well as multi-value drop down and list type of parameters with cascading prompt in the reports.
- Integrated Kettle (ETL) with Hadoop, Pig, Hive, Spark, Storm, HBase, Kafka, other Big Data components, and the various NoSQL data stores available in the Pentaho Big Data Plugin.
- Designed and developed ETL routines to extract data from heterogeneous sources and load it into the Actuarial Data Warehouse.
- Participated in preparing Logical Data Models/Physical Data Models.
- Identified source systems, their connectivity, and related tables and fields, and ensured data suitability for mapping.
- Designed a Data Vault for Deal transactions for POC using Snowflake.
- Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
- Worked on the HL7 2.x file format (ADT and clinical messages) on MEDIFAX, with a thorough understanding of how interface development projects work.
- Developed company-wide data standards, data policies and data warehouse/business intelligence architectures.
- Designed and documented Use Cases, Activity Diagrams, Sequence Diagrams, OOD (Object Oriented Design) using UML and Visio.
- Performed data cleaning and data manipulation activities using a NoSQL utility.
- Designed and developed Oracle PL/SQL procedures and UNIX shell scripts for data import/export and data conversions.
- Developed Conceptual and Logical Data Models and transformed them into schemas using ER Studio.
Environment: DB2, ER Studio, Oracle 11g, MS Office, SQL Architect, Hadoop, Hive, Pig, TOAD Benchmark Factory, Sqoop, SQL Loader, AWS S3, PL/SQL, SharePoint, SQL Server 2014
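Illustrative sketch for the SCD Type 2 handling noted above (hypothetical, in plain Python so it runs standalone): changed rows are expired and re-inserted as new versions with fresh surrogate keys. The field names (customer_id, address, effective_date, end_date, is_current) are assumptions; a real implementation would run as ETL against the warehouse tables.

```python
from datetime import date

def apply_scd2(dimension, incoming, next_key, as_of=None):
    """Expire changed rows and insert new versions with fresh surrogate keys."""
    as_of = as_of or date.today().isoformat()
    current = {r["customer_id"]: r for r in dimension if r["is_current"]}

    for src in incoming:
        existing = current.get(src["customer_id"])
        if existing and existing["address"] == src["address"]:
            continue                      # no change: keep the current row
        if existing:                      # change detected: close out the old version
            existing["end_date"] = as_of
            existing["is_current"] = False
        dimension.append({                # insert the new current version
            "surrogate_key": next_key,
            "customer_id": src["customer_id"],
            "address": src["address"],
            "effective_date": as_of,
            "end_date": None,
            "is_current": True,
        })
        next_key += 1
    return next_key

dim = [{"surrogate_key": 1, "customer_id": "C100", "address": "Old St",
        "effective_date": "2016-01-01", "end_date": None, "is_current": True}]
apply_scd2(dim, [{"customer_id": "C100", "address": "New Ave"}], next_key=2)
print(dim)   # C100 now carries an expired row plus a new current row
```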
Confidential, Atlanta, GA
Data Modeler/Data Analyst
Responsibilities:
- Analyzed the business requirements by dividing them into subject areas and understood the data flow within the organization.
- Performed database design (Conceptual, Logical, and Physical) for OLTP and OLAP systems.
- Created and developed Slowly Changing Dimension tables (SCD2, SCD3) to facilitate maintenance of history.
- Created documents for technical & business user requirements during requirements gathering sessions.
- Tuned SQL queries to make use of database indexes and analyzed database objects (a minimal sketch follows this job entry).
- Created Logical and Physical EDW models and data marts.
- Experienced in data migration and cleansing rules for the integrated architecture (OLTP, ODS, DW).
- Managed all indexing, debugging and query optimization techniques for performance tuning using T-SQL.
- Developed the logical and physical models from the conceptual model in Erwin by understanding and analyzing business requirements.
- Handled data loading operations from flat files to tables using NZLOAD utility.
- Experienced in data cleansing for accurate reporting. Thoroughly analyzed the data and integrated different data sources to process matching functions.
- Provided Azure technical expertise, including strategic design and architectural mentorship, assessments, and POCs, in support of the overall sales lifecycle and consulting engagement process.
- Implemented solutions using Azure PaaS features such as WebJobs, Cloud Services, Azure SQL Server, Service Bus, and Notification Hubs; configured large database solutions in Azure using SQL Server or Oracle.
- Applied data naming standards, created the data dictionary, documented data model translation decisions, and maintained DW metadata.
- Created DDL scripts to implement data modeling changes. Created ERWIN reports in HTML or RTF format depending on the requirement, published the data model to the model mart, created naming convention files, and coordinated with DBAs to apply data model changes.
- Extensively used Normalization techniques (up to 3NF).
- Wrote complex queries using Teradata SQL.
- Worked with the ETL team to document the transformation rules for data migration from source to target systems.
- Developed source to target mapping documents to support ETL design.
Environment: ERWIN r7.2, PL/SQL, MS SQL, MS Visio, Business Objects, Windows NT, Linux, Sybase PowerDesigner, Oracle 8i, SQL Server, Windows, MS Excel, Informatica.
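Illustrative sketch for the index-driven query tuning noted above (hypothetical): SQLite's EXPLAIN QUERY PLAN shows the plan switching from a full table scan to an index search once an index is added on the filter column. The table and column names (orders, customer_id) are placeholders.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id TEXT, amount REAL)")
con.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                [(f"C{i % 50}", float(i)) for i in range(1000)])

query = "SELECT SUM(amount) FROM orders WHERE customer_id = 'C7'"

# Before: no usable index, so the optimizer scans the whole table.
print(con.execute("EXPLAIN QUERY PLAN " + query).fetchall())

# Add an index on the predicate column, then re-check the plan.
con.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
print(con.execute("EXPLAIN QUERY PLAN " + query).fetchall())

print(con.execute(query).fetchone())
```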