Sr. Data Architect/Analyst/Modeler Resume
Reston, VA
SUMMARY:
- Senior Data Architect/Modeler/Analyst with over 9 years of IT experience in Data Analysis, Data Modeling, Data Architecture, and designing, developing, and implementing data models for enterprise-level applications and systems.
- Experienced in integrating various relational and non-relational sources such as DB2, Teradata, Oracle, Netezza, SQL Server, NoSQL, COBOL, XML, and flat files into a Netezza database.
- Experience in providing solutions within the Hadoop environment using technologies such as HDFS, MapReduce, Pig, Hive, HBase, ZooKeeper, Storm, and other Big Data technologies
- Experienced in designing Star and Snowflake schemas for data warehouses using tools such as Erwin Data Modeler, PowerDesigner, and Embarcadero ER/Studio.
- Experienced in big data analysis and developing data models using Hive, Pig, MapReduce, and SQL, with strong data architecture skills for designing data-centric solutions.
- Experienced in data modeling for Data Mart/Data Warehouse development, including conceptual, logical, and physical model design, developing Entity Relationship Diagrams (ERD), and reverse/forward engineering with CA ERwin Data Modeler.
- Experienced with Netezza tools and utilities such as nzload, nzsql, NZPLSQL, SQL toolkits, and analytical functions.
- Extensive experience in Relational and Dimensional Data modeling for creating Logical and Physical Design of Database and ER Diagrams using multiple data modeling tools like Erwin, ER Studio.
- Very good experience with and knowledge of Amazon Web Services, including AWS Redshift, AWS S3, and AWS EMR.
- Experienced in importing and exporting the data using Sqoop from HDFS to Relational Database systems/mainframe and vice-versa.
- Experienced in development and support of Oracle SQL, PL/SQL, and T-SQL queries.
- Experienced in building Logical Data Models (LDM) and Physical Data Models (PDM) using Erwin, ER Studio, and PowerDesigner data modeling tools.
- Experienced in migrating data from Excel, flat files, and Oracle to MS SQL Server using SQL Server Integration Services (SSIS).
- Strong experience with normalization (1NF, 2NF, 3NF, and BCNF) and de-normalization techniques for optimum performance in OLTP and OLAP environments, as well as with the Kimball methodology and Data Vault modeling.
- Experienced in ETL design, development, and maintenance using Oracle SQL, PL/SQL, TOAD, SQL*Loader, and relational database management systems (RDBMS).
- Experienced in designing and developing data models for OLTP databases, Operational Data Stores (ODS), data warehouses (OLAP), and federated databases to support the client's Enterprise Information Management strategy; excellent knowledge of the Ralph Kimball and Bill Inmon approaches to data warehousing.
- Expertise in SQL Server Analysis Services (SSAS), SSIS, and SQL Server Reporting Services (SSRS).
- Experienced in extracting, transforming, and loading data from heterogeneous data sources into SQL Server using SSIS packages.
- Good knowledge of and experience in developing Informatica mappings, mapplets, sessions, workflows, and worklets for data loads from various sources such as Oracle, flat files, DB2, and SQL Server.
- Excellent understanding of and working experience with industry-standard methodologies such as the System Development Life Cycle (SDLC), the Rational Unified Process (RUP), and Agile methodologies.
TECHNICAL SKILLS:
Data Modeling Tools: Erwin R6/R9, Rational System Architect, IBM Infosphere Data Architect, ER Studio and Oracle Designer.
ETL/Data warehouse Tools: Informatica 9.6/9.1/8.6.1/8.1, SAP Business Objects XIR3.1/XIR2, Web Intelligence, Talend, Tableau, Pentaho
Database Tools: Microsoft SQL Server 12.0, Teradata 15.0, Oracle 12c/11g/10g and MS Access.
Big Data Technologies: Pig, Hive, Spark, Scala, Sqoop, MongoDB, Cassandra, HBase, Kafka.
BI Tools: Tableau 7.0/8.2/10.x, Tableau Server 8.2, Tableau Reader 8.1, SAP Business Objects, Crystal Reports
Packages: Microsoft Office 2010, Microsoft Project 2010, SAP, Microsoft Visio, SharePoint Portal Server
Cloud Platforms: AWS, Azure, AWS RDS, AWS S3 and AWS EMR
Tools: OBIEE 10g/11g/12c, SAP ECC6 EHP5, GoToMeeting, DocuSign, Insidesales.com, SharePoint, MATLAB.
Operating System: Windows, Unix, Sun Solaris
RDBMS: Microsoft SQL Server 14.0, Teradata 15.0, Oracle 12c/11g/10g/9i, and MS Access
Version Tool: GIT, SVN
Project Execution Methodologies: Agile, Ralph Kimball and Bill Inmon data warehousing methodologies, Rational Unified Process (RUP), Rapid Application Development (RAD), Joint Application Development (JAD)
PROFESSIONAL EXPERIENCE:
Sr. Data Architect/Analyst/Modeler
Confidential, Reston VA
Responsibilities:
- Involved in the Data Architecture and in developing the overall ETL architecture, including solution architecture as needed.
- Worked on the Regulatory Compliance IT team in a Data Architect role that involved data profiling, data modeling, ETL architecture, and Oracle DBA work.
- Responsible for big data initiatives and engagements including analysis, brainstorming, POCs, and architecture; worked with big data on premises and in the cloud, Master Data Management, and Data Governance.
- Designed the Logical Data Model using Erwin 9.64, defining the entities and attributes for each subject area.
- Developed the long-term data warehouse roadmap and architecture, and designed and built the data warehouse framework per that roadmap.
- Worked on cloud computing using Microsoft Azure with various BI technologies and explored NoSQL options for the current back end using Azure Cosmos DB (SQL API).
- Involved in creating Hive tables and loading and analyzing data using Hive queries; developed Hive queries to process the data and generate data cubes for visualization.
- Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive and Informatica.
- Developed and implemented various Pig UDFs to produce ad-hoc and scheduled reports as required by the business team, and implemented join optimizations in Pig using skewed and merge joins for large datasets.
- Designed the big data platform technology architecture; the scope included data intake, data staging, data warehousing, and a high-performance analytics environment.
- Used PolyBase for ETL/ELT processes with Azure SQL Data Warehouse to keep data in Blob Storage with almost no limitation on data volume (see the PolyBase sketch after this list).
- Created a template SSIS package that was replicated across about 200 processes to load the data using Azure SQL.
- Performed data modeling and designed, implemented, and deployed high-performance custom applications at scale on Hadoop/Spark; implemented data integrity and data quality checks in Hadoop using Hive and Linux scripts.
- Loaded data from the Linux file system to HDFS, imported and exported data between HDFS and Hive using Sqoop, and implemented partitioning, dynamic partitions, and buckets in Hive (see the Hive DDL sketch after this list).
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Specified the overall data architecture for all areas and domains of the enterprise, including data acquisition, ODS, MDM, data warehouse, data provisioning, ETL, and BI.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS systems.
- Applied normalization/de-normalization techniques for optimum performance in relational and dimensional database environments.
- Implemented a Spark solution to enable real-time reports from Cassandra data, and enforced strong referential integrity and auditing through triggers and SQL scripts.
- Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data, and developed SQL stored procedures to query dimension and fact tables in the data warehouse.
- Created and maintained SQL Server scheduled jobs, executing stored procedures for the purpose of extracting data from DB2 into SQL Server.
- Used SQL Server Reporting Services (SSRS) to author, manage, and deliver both paper-based and interactive web-based reports.
- Performed migration and merging of RPDs in OBIEE, performed Hive programming for applications migrated to big data on Hadoop, and focused on architecting NoSQL databases such as MongoDB, Cassandra, and Caché.
- Deployed SSRS reports to Report Manager, created linked reports, snapshots, and subscriptions for the reports, and worked on report scheduling.
- Performed ad hoc data mining to fulfill management requests and to modify existing reports.
- Created external and managed tables in Hive and used them appropriately in the different Pig scripts required for reporting.
- Developed and scheduled a variety of reports, such as cross-tab, parameterized, drill-through, and subreports, with SSRS.
- Performed routine management operations for MongoDB, including configuration, performance analysis, and diagnosing performance issues.
- Managed multiple ETL development teams for business intelligence and Master data management initiatives.
- Performed point-in-time backup and recovery in MongoDB using MMS, and modeled data moving from RDBMS to MongoDB for optimal reads and writes.
- Involved in designing logical and physical data models for different database applications using Erwin.
- Reverse engineered several of the databases using Erwin; proficient in SQL across a number of dialects, including MySQL, PostgreSQL, SQL Server, and Oracle.
- Coordinated with clients and business analysts to understand requirements and develop OBIEE reports.
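The following is a minimal HiveQL sketch of the partitioned, bucketed table pattern described above; the table, columns, paths, and staging source are illustrative assumptions, not actual project objects.
  -- Enable dynamic partitioning for the insert (older Hive versions may also need hive.enforce.bucketing = true)
  SET hive.exec.dynamic.partition = true;
  SET hive.exec.dynamic.partition.mode = nonstrict;

  -- Hypothetical raw table, partitioned by load date and bucketed to speed up joins
  CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
      claim_id     BIGINT,
      member_id    BIGINT,
      claim_amount DECIMAL(12,2)
  )
  PARTITIONED BY (load_dt STRING)
  CLUSTERED BY (member_id) INTO 32 BUCKETS
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
  STORED AS TEXTFILE
  LOCATION '/data/raw/claims';

  -- Dynamic-partition insert from an assumed staging table; the partition column comes last in the SELECT
  INSERT OVERWRITE TABLE claims_raw PARTITION (load_dt)
  SELECT claim_id, member_id, claim_amount, load_dt
  FROM claims_staging;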
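A hedged T-SQL sketch of the PolyBase/Blob Storage ELT pattern mentioned above for Azure SQL Data Warehouse; the storage account, credential, file layout, and table names are assumptions, and a database master key is assumed to already exist.
  -- Credential for the Blob Storage account (placeholder identity and key)
  CREATE DATABASE SCOPED CREDENTIAL BlobCred
  WITH IDENTITY = 'storageuser', SECRET = '<storage-account-key>';

  CREATE EXTERNAL DATA SOURCE ClaimsBlob
  WITH (TYPE = HADOOP,
        LOCATION = 'wasbs://claims@examplestorage.blob.core.windows.net',
        CREDENTIAL = BlobCred);

  CREATE EXTERNAL FILE FORMAT PipeDelimited
  WITH (FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));

  -- External table over the raw files; data stays in Blob Storage until it is queried
  CREATE EXTERNAL TABLE dbo.ClaimsRaw_ext (
      ClaimId     BIGINT,
      MemberId    BIGINT,
      ClaimAmount DECIMAL(12,2)
  )
  WITH (LOCATION = '/raw/claims/',
        DATA_SOURCE = ClaimsBlob,
        FILE_FORMAT = PipeDelimited);

  -- ELT load into a distributed warehouse table via CTAS
  CREATE TABLE dbo.Claims
  WITH (DISTRIBUTION = HASH(ClaimId))
  AS SELECT * FROM dbo.ClaimsRaw_ext;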
Environment: DB2, CA ERwin r9.64, Oracle 12c, MS Office, SQL Architect, TOAD Benchmark Factory, Data Mining, SQL*Loader, PL/SQL, SharePoint, Talend, Redshift, SQL Server 2008/2012, Hive, Pig, Hadoop, Spark, Azure.
Sr. Data Architect/Modeler
Confidential, Minneapolis MN
Responsibilities:
- Served as solutions architect transforming business problems into big data and data science solutions, and defined the big data strategy and roadmap.
- Gathered business requirements, working closely with business users, project leaders and developers. Analyzed the business requirements and designed conceptual and logical data models.
- Led the strategy, architecture, and process improvements for data architecture and data management, balancing the long- and short-term needs of the business.
- Performed exploratory data analysis (EDA) using Python and integrated Python with Hadoop MapReduce and Spark.
- Designed and developed the architecture for a data services ecosystem spanning relational, NoSQL, and big data technologies, and built relationships and trust with key stakeholders to support program delivery and adoption of the enterprise architecture.
- Provided technical leadership and mentoring throughout the project life cycle, developing the vision, strategy, architecture, and overall design for the assigned domain and solutions.
- Implemented end-to-end systems for Data Analytics, Data Automation and integrated with custom visualization tools using R, Hadoop and MongoDB, Cassandra.
- Involved in creating a Data Lake by extracting the customer's big data from various data sources into Hadoop HDFS; this included data from Excel, flat files, Oracle, SQL Server, MongoDB, Cassandra, HBase, Teradata, and Netezza, as well as log data from servers.
- Developed automated data pipelines in Python from various external data sources (web pages, APIs, etc.) to the internal data warehouse (SQL Server), then exported data to reporting tools.
- Designed and implemented system architecture for Amazon EC2 based cloud-hosted solution for client.
- Worked on Dimensional and Relational Data Modeling using Star and Snowflake Schemas, OLTP/OLAP system, Fact and Dimension tables, Conceptual, Logical and Physical data modeling using Erwin r9.6.
- Proficient in SQL across a number of dialects, including MySQL, PostgreSQL, Redshift, SQL Server, and Oracle.
- Designed the schema and configured and deployed AWS Redshift for optimal storage and fast retrieval of data (see the Redshift DDL sketch after this list); performed a PoC for a big data solution using Cloudera Hadoop for data loading and querying.
- Wrote and optimized queries in Oracle 12c, SQL Server 2014 (T-SQL), DB2, Netezza, and Teradata, and normalized/de-normalized existing tables for faster query retrieval.
- Created MDM and OLAP data architectures, analytical data marts, and cubes optimized for reporting; performed logical modeling using dimensional modeling techniques such as star and snowflake schemas.
- Worked on the Hadoop ecosystem, Hive queries, MongoDB, Cassandra, Pig, and Apache Storm; wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
- Developed Linux shell scripts using the NZSQL/NZLOAD utilities to load data from flat files into the Netezza database.
- Created SSIS Reusable Packages to extract data from Multi formatted Flat files, Excel, XML files into UL Database and DB2 Billing Systems.
- Performed data analysis, statistical analysis, generated reports, listings and graphs using SAS tools, SAS Integration Studio, SAS/Graph, SAS/SQL, SAS/Connect and SAS/Access.
- Worked on importing and cleansing data from various sources such as Teradata 15, Oracle, flat files, and SQL Server with high data volumes.
- Created Informatica mappings to populate staging and data warehouse tables from various sources, including flat files, DB2, Netezza, and Oracle.
- Worked on the full life cycle of the Data Lake and Data Warehouse with big data technologies such as Spark, Hadoop, and Cassandra.
- Developed data mapping, data profiling, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS systems.
- Tested Complex ETL Mappings and Sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables to Confidential tables.
- Extensively used Sqoop to import/export data between RDBMS and Hive tables, including incremental imports, and created Sqoop jobs based on the last saved value.
- Managed the definition and execution of data mapping, conversion, and reconciliation processes for data originating from a variety of enterprise and SAP sources, feeding into the ongoing data governance organization design.
- Developed Tableau visualizations and dashboards using Tableau Desktop, building workbooks from multiple data sources with data blending.
- Performed data analysis and data profiling using complex SQL queries on various source systems, including Oracle, Teradata, and Netezza.
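A small SQL sketch of the kind of Redshift schema design referenced above, showing distribution and sort keys plus an S3 COPY load; the table, bucket, and IAM role are placeholders, not actual project objects.
  -- Hypothetical fact table; DISTKEY co-locates join rows, SORTKEY speeds date-range scans
  CREATE TABLE fact_sales (
      sale_id     BIGINT        NOT NULL,
      customer_id BIGINT        NOT NULL,
      sale_date   DATE          NOT NULL,
      amount      DECIMAL(12,2)
  )
  DISTSTYLE KEY
  DISTKEY (customer_id)
  SORTKEY (sale_date);

  -- Bulk load from S3 (bucket, prefix, and role ARN are placeholders)
  COPY fact_sales
  FROM 's3://example-bucket/sales/'
  IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
  DELIMITER '|' GZIP;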
Environment: Erwin r9.6, Oracle 12c, Teradata 15, Netezza, PL/SQL, T-SQL, MDM, Python, DataStage, DB2, SQL Server 2014, Informatica PowerCenter, SQL, Hadoop, Hive, MongoDB, SAS, Spark, SSRS, SSIS, SSAS, Tableau, Excel, MS Access, SAP, AWS, R, AWS Redshift, Sqoop, HBase.
Sr. Data Architect/Modeler
Confidential, Pittsburgh PA
Responsibilities:
- Participated in the design, development, and support of the corporate operational data store and enterprise data warehouse database environment.
- Documented a whole process of working with Tableau Desktop, installing Tableau Server and evaluating Business Requirements.
- Participated in big data architecture for both batch and real-time analytics, and mapped data using a scoring system over large datasets on HDFS.
- Implemented dimension model (logical and physical data modeling) in the existing architecture using ER/Studio.
- Worked with databases including Oracle, XML, DB2, Teradata 14.1, Netezza, and SQL Server, as well as big data and NoSQL stores such as MongoDB and Cassandra.
- Developed, managed, and validated existing data models, including logical and physical models of the data warehouse and source systems, utilizing a 3NF model.
- Worked on predictive and what-if analysis using Python on HDFS data; successfully loaded files to HDFS from Teradata and from HDFS into Hive.
- Gathered business requirements from customers and created data models for different branches using MS Access and ER/Studio.
- Designed and deployed scalable, highly available, and fault tolerant systems on AWS.
- Worked on NoSQL databases including HBase, MongoDB, and Cassandra; implemented a multi-datacenter, multi-rack Cassandra cluster.
- Designed source-to-Confidential mappings, primarily from flat files, SQL Server, Oracle 11g, and Netezza, using Informatica PowerCenter 9.6.
- Cleansed the data by eliminating duplicate and inaccurate records in Python, and used Python scripts to update database content and manipulate files.
- Executed ad-hoc data analysis for customer insights using SQL on an Amazon AWS Hadoop cluster; worked on normalization and de-normalization techniques for both OLTP and OLAP systems.
- Designed logical data models and physical and conceptual data documents between source systems and the Confidential data warehouse.
- Worked with the Hadoop ecosystem, covering HDFS, HBase, YARN, and MapReduce.
- Extensively used ER/Studio for developing data model using star schema methodologies.
- Worked on data modeling using dimensional data modeling, star/snowflake schemas, fact and dimension tables, and physical and logical data modeling (see the star-schema DDL sketch after this list).
- Used external loaders such as MultiLoad, TPump, and FastLoad to load data into the Teradata database across analysis, development, testing, implementation, and deployment.
- Built database models, views, and APIs using Python for interactive web-based solutions.
- Developed requirements and performed data collection, cleansing, transformation, and loading to populate facts and dimensions for the data warehouse.
- Created MapReduce jobs running over HDFS for data mining and analysis using R, and loaded and stored data with Pig scripts and R for MapReduce operations.
- Designed the schema, configured and deployed AWS Redshift for optimal storage and fast retrieval of data.
- Created, managed, and modified logical and physical data models using a variety of data modeling philosophies and techniques, including Inmon and Kimball.
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, the HBase database, and Sqoop.
- Managed the Master Data Governance queue including assessment of downstream impacts to avoid failures
- Worked in the capacity of ETL Developer (Oracle Data Integrator (ODI) / PL/SQL) to migrate data from different sources in to Confidential Oracle Data Warehouse.
- Created and configured workflows, worklets, and sessions to transport the data to Confidential warehouse Netezza tables using Informatica Workflow Manager.
- Cleaned large data sets using Python, created named sets and calculated members, and designed scopes in SSAS, along with SSIS and SSRS development.
- Worked on Teradata SQL queries and Teradata indexes, and used utilities such as MultiLoad, TPump, FastLoad, and FastExport.
- Migrated SQL Server 2008 to SQL Server 2014 on Microsoft Windows Server 2003 and troubleshot high-availability scenarios involving clustering, database mirroring, log shipping, and replication.
- Performed extensive data validation by writing several complex SQL queries, performed back-end testing, and worked on data quality issues.
- Involved in Troubleshooting and quality control of data transformations and loading during migration from Oracle systems into Netezza EDW.
- Created SSIS packages to load data from different sources such as Excel, flat files, and DB2 into the SQL Server data warehouse and the SQL Server/PL-SQL transactional databases.
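A minimal SQL Server-style sketch of the star-schema pattern referenced above, with a surrogate-keyed dimension and a fact table; all object names are hypothetical assumptions.
  -- Dimension with a surrogate key and the natural/business key carried alongside
  CREATE TABLE dim_customer (
      customer_key  INT IDENTITY(1,1) PRIMARY KEY,
      customer_id   VARCHAR(20) NOT NULL,
      customer_name VARCHAR(100),
      state_code    CHAR(2)
  );

  -- Fact table referencing the dimension by surrogate key
  CREATE TABLE fact_claim (
      claim_key    BIGINT IDENTITY(1,1) PRIMARY KEY,
      customer_key INT NOT NULL REFERENCES dim_customer (customer_key),
      date_key     INT NOT NULL,          -- would reference an assumed date dimension
      claim_amount DECIMAL(12,2)
  );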
Environment: ER/Studio, SSIS, SSRS, SAS, Netezza, Excel, MDM, PL/SQL, ETL, Python, Tableau, Hadoop, Hive, Pig, MongoDB, Aginity, Teradata SQL Assistant, Cassandra, T-SQL, Cognos, DB2, Oracle 11g, SQL, Teradata 14.1, Informatica PowerCenter 9.6, AWS Redshift, HBase.
Sr. Data Modeler/Data Analyst
Confidential, Dewitt-NY
Responsibilities:
- Created conceptual, logical, and physical relational models for the integration and base layers, and created logical and physical dimensional models for the presentation and dimension layers of a dimensional data warehouse in PowerDesigner.
- Involved in reviewing business requirements and analyzing data sources from Excel, Oracle, and SQL Server for the design, development, testing, and production rollout of reporting and analysis projects.
- Analyzed, designed, developed, implemented, and maintained ETL jobs using IBM InfoSphere DataStage and Netezza.
- Extensively worked in Client-Server application development using Oracle 10g, Teradata 14, SQL, PL/SQL, Oracle Import and Export Utilities.
- Coordinated with DB2 on database build and table normalizations and de-normalizations.
- Conducted brainstorming sessions with application developers and DBAs to discuss various de-normalization, partitioning, and indexing schemes for the physical model.
- Involved in several facets of MDM implementations including Data Profiling, metadata acquisition and data migration.
- Extensively used SQL*Loader to load data from legacy systems into Oracle databases using control files, and used the Oracle external tables feature to read data from flat files into Oracle staging tables (see the external-table sketch after this list).
- Performed extensive data validation by writing several complex SQL queries, performed back-end testing, and worked on data quality issues.
- Used SSIS to create ETL packages to validate, extract, transform, and load data into data warehouse and data mart databases, and processed SSAS cubes to store data in OLAP databases.
- Strong understanding of Data Modeling (Relational, dimensional, Star and Snowflake Schema), Data analysis, implementations of Data warehousing using Windows and UNIX.
- Extensively worked with the Netezza database to implement data cleanup and performance tuning techniques.
- Created ETL packages using OLTP data sources (SQL Server 2008, Flat files, Excel source files, Oracle) and loaded the data into Confidential tables by performing different kinds of transformations using SSIS.
- Migrated SQL server 2008 to SQL Server 2008 R2 in Microsoft Windows Server 2008 R2 Enterprise Edition.
- Developed reusable objects such as PL/SQL program units and libraries, database procedures, functions, and triggers to be used by the team while satisfying business rules.
- Performed data validation on the flat files that were generated in UNIX environment using UNIX commands as necessary.
- Worked with NZLOAD to load flat-file data into Netezza and DB2, and with Architect to identify proper distribution keys for Netezza tables.
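An illustrative Oracle SQL sketch of the external-table approach to reading flat files into staging mentioned above; the directory object, file name, columns, and target staging table are assumptions.
  -- Directory object mapped to the flat-file landing area (assumed path)
  CREATE OR REPLACE DIRECTORY stage_dir AS '/data/stage';

  CREATE TABLE stg_customer_ext (
      customer_id   NUMBER,
      customer_name VARCHAR2(100),
      state_code    VARCHAR2(2)
  )
  ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY stage_dir
      ACCESS PARAMETERS (
          RECORDS DELIMITED BY NEWLINE
          FIELDS TERMINATED BY '|'
      )
      LOCATION ('customers.dat')
  )
  REJECT LIMIT UNLIMITED;

  -- Read the file through the external table into an assumed staging table
  INSERT INTO stg_customer SELECT * FROM stg_customer_ext;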
Environment: PowerDesigner, Teradata 14, Oracle 10g, PL/SQL, MDM, SQL Server 2008, ETL, Netezza, DB2, SSIS, SSRS, SAS, SPSS, DataStage, Informatica, SQL, T-SQL, UNIX, Aginity, SQL Assistant, etc.
Sr. Data Analyst/Modeler
Confidential
Responsibilities:
- Analyzed the physical data model to understand the relationship between existing tables. Cleansed the unwanted tables and columns as per the requirements as part of the duty being a Data Analyst.
- Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.
- Designed Star and Snowflake Data Models for Enterprise Data Warehouse using ER Studio.
- Worked on the Metadata Repository (MRM) to keep definitions and mapping rules up to the mark.
- Trained a couple of colleagues on the Spotfire tool and provided guidance on creating Spotfire visualizations.
- Created DDL scripts to implement data modeling changes (see the DDL sketch after this list); created Erwin reports in HTML or RTF format depending on the requirement, published data models to the model mart, created naming convention files, and coordinated with DBAs to apply the data model changes.
- Developed Contracting Business Process Model Workflows (current / future state) using Bizagi Process Modeler software.
- Developed a data mart for the base data in star and snowflake schemas and was involved in developing the data warehouse for the database.
- Worked on unit testing for three reports and created SQL test scripts for each report as required.
- Extensively used ER Studio as the main modeling tool, along with Visio.
- Configured and developed triggers, workflows, and validation rules, and was hands-on with the deployment process from one sandbox to another.
- Managed Logical and Physical Data Models in ER Studio Repository based on the different subject area requests for integrated model.
- Created automatic field updates via workflows and triggers to satisfy internal compliance requirement of stamping certain data on a call during submission.
- Developed enhancements to the MongoDB architecture to improve performance and scalability.
- Forward engineered data models, reverse engineered existing data models, and updated data models as needed.
- Performed data cleaning and data manipulation using the NZSQL utility, and analyzed the architectural design of the project step by step along with the data flow.
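A brief sketch of the kind of DDL change script generated from the model, as mentioned above; the table, column, and constraint names are hypothetical and follow a typical naming convention rather than the project's actual one.
  -- Add a new attribute introduced in the logical model
  ALTER TABLE cust_account ADD acct_status_cd CHAR(1) DEFAULT 'A' NOT NULL;

  -- Add the new relationship as a foreign key, named per convention fk_<child>_<parent>
  ALTER TABLE cust_account
      ADD CONSTRAINT fk_cust_account_customer
      FOREIGN KEY (cust_id) REFERENCES customer (cust_id);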
Environment: Oracle SQL Developer, Oracle Data Modeler, Teradata 14, SSIS, Business Objects, SQL Server 2008, ER/Studio, Windows, MS Excel.
Data Analyst
Confidential
- Designed and created web applications to receive query string input from customers and facilitate entering the data into SQL Server databases.
- Performed thorough data analysis for the purpose of overhauling the database using SQL Server.
- Designed and implemented business intelligence to support sales and operations functions to increase customer satisfaction.
- Converted physical database models from logical models, to build/generate DDL scripts.
- Maintained warehouse metadata, naming standards and warehouse standards for future application development.
- Extensively used ETL to load data from DB2, Oracle databases.
- Involved with data profiling for multiple sources and answered complex business questions by providing data to business users.
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Worked with Business users during requirements gathering and prepared Conceptual, Logical and Physical Data Models.
- Wrote PL/SQL statements, stored procedures, and triggers in DB2 for extracting as well as writing data.
- Optimized existing procedures and SQL statements for better performance, using EXPLAIN PLAN, hints, and SQL trace to tune SQL queries (see the EXPLAIN PLAN sketch after this list).
- Developed interfaces able to connect to multiple databases such as SQL Server and Oracle.
- Assisted Kronos project team in SQL Server Reporting Services installation and developed SQL Server database to replace existing Access databases.
- Attended and participated in information and requirements gathering sessions
- Translated business requirements into working logical and physical data models for Data warehouse, Data marts and OLAP applications.
- Worked on physical, logical, and conceptual data models.
- Designed 3NF data models for ODS and OLTP systems as well as dimensional data models using star and snowflake schemas.
- Wrote and executed unit, system, integration, and UAT scripts for data warehouse projects.
- Extensively used ETL methodology for supporting data extraction, transformations and loading processing, in a complex EDW using Informatica.
- Worked extensively with star schemas, DB2, and IMS DB.
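An illustrative Oracle SQL sketch of the EXPLAIN PLAN and hint-based tuning workflow mentioned above; the table, index, and bind variable names are assumptions.
  -- Capture the optimizer's plan for a candidate query, forcing an assumed index via a hint
  EXPLAIN PLAN FOR
  SELECT /*+ INDEX(o idx_orders_cust) */ o.order_id, o.order_dt
  FROM   orders o
  WHERE  o.customer_id = :cust_id;

  -- Review the plan that was just captured
  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);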
Environment: Oracle, PL/SQL, DB2, ERWIN, UNIX, Teradata SQL Assistant, Informatica, OLTP, OLAP, Data Marts, DQ analyzer.