Sr. Data Architect / Modeler Resume
NYC, NY
SUMMARY:
- Experienced Data Architect, Data Analysis, and Data Modeling professional specializing in implementing end-to-end Business Intelligence and Data Warehousing solutions, with over 9 years of hands-on experience.
- Strong experience using Excel and MS Access to load and analyze data based on business needs.
- Excellent knowledge of Perl and UNIX.
- Expertise in data modeling, database design, and implementation of Oracle and AWS Redshift databases, including administration and performance tuning.
- Experience in analyzing data using the Hadoop ecosystem, including HDFS, Hive, Spark, Spark Streaming, Elasticsearch, Kibana, Kafka, HBase, ZooKeeper, Pig, Sqoop, and Flume.
- Experienced working with Excel pivot tables and VBA macros for various business scenarios.
- Data transformation using Pig scripts on AWS EMR and AWS RDS.
- Experience working with data modeling tools like Erwin, PowerDesigner, and ER/Studio.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
- Experience in Architecture, Design and Development of large Enterprise Data Warehouse (EDW) and Data-marts for target user-base consumption.
- Experience providing solutions for online transactional and data warehousing systems in Oracle projects using Oracle PL/SQL, UNIX shell scripting, ETL tools, and batch design.
- Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export.
- Experience in data analysis using Hive, Pig Latin, and Impala.
- Well versed in Normalization / De-normalization techniques for optimum performance in relational and dimensional database environments.
- Good understanding of AWS, big data concepts and Hadoop ecosystem.
- Excellent at creating project artifacts, including specification documents, data mappings, and data analysis documents.
- An excellent team player and technically strong contributor, able to work with business users, project managers, team leads, architects, and peers, maintaining a healthy project environment.
- Experienced in various Teradata utilities such as FastLoad, MultiLoad, BTEQ, and Teradata SQL Assistant.
- Expert in writing SQL queries and optimizing the queries in Oracle, SQL Server 2008 and Teradata.
- Developed and managed SQL, Python, and R code bases for data cleansing and data analysis, using Git for version control.
- Excellent understanding of the Software Development Life Cycle (SDLC), with good working knowledge of testing methodologies, disciplines, tasks, resources, and scheduling.
- Excellent knowledge in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
- Extensive ETL testing experience using Informatica 8.6.1/8.1 (PowerCenter/PowerMart: Designer, Workflow Manager, Workflow Monitor, and Server Manager).
- Good exposure to the offshore/onsite model, with the ability to understand and create functional requirements working with clients; good experience in requirements analysis and generating test artifacts from requirements documents.
- Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Excellent experience with Teradata SQL queries, Teradata indexes, and utilities such as MultiLoad, TPump, FastLoad, and FastExport.
TECHNICAL SKILLS:
Analysis and Modeling Tools: Erwin 9.6/9.5, Sybase PowerDesigner, Oracle Designer, ER/Studio 9.7.
Database Tools: Microsoft SQL Server 2014/2012, Teradata 15/14, Oracle 12c/11g, MS Access, PostgreSQL, Netezza.
OLAP Tools: Tableau, SAP BO, SSAS, Business Objects, and Crystal Reports 9.
ETL Tools: SSIS, Pentaho, Informatica PowerCenter 9.6, SAP Business Objects XIR3.1/XIR2, Web Intelligence.
Cloud: AWS, Azure
Operating System: Windows, DOS, UNIX.
Reporting Tools: Business Objects, Crystal Reports.
Tools & Software: TOAD, MS Office, BTEQ, Teradata SQL Assistant.
Big Data Technologies: Hadoop, HDFS 2, Hive, Pig, HBase, Sqoop, Flume.
AWS: EC2, S3, SQS.
Other tools: TOAD, SQL*Plus, SQL*Loader, MS Project, MS Visio, MS Office; have also worked with C++, UNIX, and PL/SQL.
PROFESSIONAL EXPERIENCE:
Confidential, NYC, NY
Sr. Data Architect / Modeler
Responsibilities:
- Working on logical and physical modeling, and ETL design for manufacturing data warehouse applications.
- Involved in creating Hive tables and loading and analyzing data using Hive queries; developed Hive queries to process the data and generate data cubes for visualization.
- Implemented join optimizations in Pig using skewed and merge joins for large datasets.
- Defined and deployed monitoring, metrics, and logging systems on AWS.
- Selecting the appropriate AWS service based on data, compute, database, or security requirements.
- Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive and Informatica.
- Worked extensively on Information Designer, scheduled updates, and advanced concepts like database write-back from Spotfire reports.
- Deployed SSRS reports to Report Manager and created linked reports, snapshots, and subscriptions for the reports and worked on scheduling of the reports.
- Used SSRS to create reports, customized Reports, on-demand reports, ad-hoc reports and involved in analyzing multi-dimensional reports in SSRS.
- Generate comprehensive analytical reports by running SQL queries against current databases to conduct data analysis pertaining to various loan products.
- Created SQL code from data models and interacted with DBAs to create development, testing, and production databases.
- Created data sources using Spotfire Information Designer to connect various data marts to Oracle and Redshift.
- Used the Kibana plugin to visualize Elasticsearch data.
- Worked in a team using the ETL tool Informatica to populate the database, transforming data from the old database to the new database using Oracle and SQL Server.
- Focused on architecting NoSQL databases like MongoDB, Cassandra, and Cache.
- Worked in the Regulatory Compliance IT team in a Data Architect role covering data profiling, data modeling, ETL architecture, and Oracle DBA work.
- Responsible for Big data initiatives and engagement including analysis, brainstorming, POC, and architecture.
- Designed the logical data model using Erwin 9.64 with the entities and attributes for each subject area.
- Developed a long-term data warehouse roadmap and architecture, and designed and built the data warehouse framework per the roadmap.
- Data preparation and validation: drew flowcharts indicating the input data sets and the sorting and merging techniques needed to produce the required output, then wrote the code using SAS/Base, SAS/SQL, and SAS macros; data was validated before use in the final analysis.
- Working on AWS, architecting a solution to load data, create data models, and run BI on it.
- Developed Shell, Perl and Python scripts to automate and provide Control flow to Pig scripts.
- Involved in building database models, APIs, and views utilizing Python to deliver an interactive web-based solution.
- Developed and implemented different Pig UDFs to write ad-hoc and scheduled reports as required by the Business team.
- Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data (see the T-SQL sketch following this list).
- Created and maintained SQL Server scheduled jobs, executing stored procedures for the purpose of extracting data from DB2 into SQL Server.
- Expertise in data manipulation using SAS data steps, including SAS formats/informats, do-loops, macros, and merge procedures such as PROC APPEND, PROC DATASETS, PROC SORT, and PROC TRANSPOSE.
- Developed SQL Stored procedures to query dimension and fact tables in data warehouse.
- Created MongoDB data set backups using system-level file snapshot tools, such as LVM or native storage appliance tools.
- Performed point-in-time backup and recovery in MongoDB using MMS; modeled data moving from RDBMS to MongoDB for optimal reads and writes.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
- Proficiency in SQL across a number of dialects, commonly MySQL, PostgreSQL, Redshift, SQL Server, and Oracle.
- Routinely dealt with large internal and vendor data sets, performing performance tuning, query optimization, and production support for SAS and Oracle 12c.
- Wrote UNIX shell scripts for ETL and job automation.
- Performed Hive programming for applications that were migrated to big data using Hadoop.
- Created dimensional data models based on hierarchical source data and implemented them on Teradata, achieving high performance without special tuning.
- Used Elasticsearch for name pattern matching, customized to the requirement.
- Involved in designing logical and physical data models for different database applications using Erwin.
- Data modeling, design, implementation, and deployment of high-performance custom applications at scale on Hadoop/Spark.
- Applied data analysis, data mining and data engineering to present data clearly.
- Reverse engineered some of the databases using Erwin.
- Developed automated data pipelines in Python from various external data sources (web pages, APIs, etc.) to the internal data warehouse (SQL Server, AWS), then exported to reporting tools like Datorama.
- Involved in loading data from the Linux file system to HDFS, importing and exporting data between HDFS and Hive using Sqoop, and implementing partitioning, dynamic partitions, and buckets in Hive (see the HiveQL sketch following this list).
- Working on defining data architecture for data warehouses, Data marts and business applications.
- Developed SAS programs for converting clinical data into SAS datasets using the SQL pass-through facility and the LIBNAME facility.
- Specified the overall data architecture for all areas and domains of the enterprise, including data acquisition, ODS, MDM, data warehouse, data provisioning, ETL, and BI.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
- Involved in normalization/denormalization techniques for optimum performance in relational and dimensional database environments.
- Performance tuning and stress testing of NoSQL database environments to ensure acceptable database performance in production.
- Involved in writing Liquibase scripts and generating SQL.
- Implemented strong referential integrity and auditing through triggers and SQL scripts.
- Created and managed database objects (tables, views, indexes, etc.) per application specifications. Implemented database procedures, triggers and SQL scripts for development teams.
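To illustrate the Hive partitioning and bucketing work noted above, a minimal HiveQL sketch follows; the claim_stage and claim_facts tables and their columns are hypothetical names chosen for the example, not taken from the project.

```sql
-- Illustrative HiveQL only; table and column names are hypothetical.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Target table partitioned by load date and bucketed for more efficient joins and sampling.
CREATE TABLE IF NOT EXISTS claim_facts (
  claim_id     BIGINT,
  member_id    BIGINT,
  claim_amount DECIMAL(18,2)
)
PARTITIONED BY (load_dt STRING)
CLUSTERED BY (member_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic-partition load from a raw staging table (e.g. one populated via Sqoop).
INSERT OVERWRITE TABLE claim_facts PARTITION (load_dt)
SELECT claim_id, member_id, claim_amount, load_dt
FROM claim_stage;
```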
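Similarly, a hedged T-SQL sketch of the extract/aggregate/insert pattern behind the stored-procedure work above; the procedure and table names (usp_LoadDailyLoanSummary, StageLoanTransactions, FactLoanDailySummary) are assumptions for illustration only.

```sql
-- Illustrative T-SQL only; object names are hypothetical.
CREATE PROCEDURE dbo.usp_LoadDailyLoanSummary
    @LoadDate DATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Make the load re-runnable by clearing any prior rows for the same date.
    DELETE FROM dbo.FactLoanDailySummary
    WHERE LoadDate = @LoadDate;

    -- Aggregate staged transactions into the daily summary fact table.
    INSERT INTO dbo.FactLoanDailySummary (LoadDate, ProductKey, LoanCount, TotalAmount)
    SELECT @LoadDate, s.ProductKey, COUNT(*), SUM(s.LoanAmount)
    FROM dbo.StageLoanTransactions AS s
    WHERE s.TransactionDate = @LoadDate
    GROUP BY s.ProductKey;
END;
```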
Environment: DB2, CA ERwin 9.6/r9.64, Oracle 12c, MS Office, SQL Architect, TOAD Benchmark Factory, SQL*Loader, PL/SQL, SharePoint, Talend, Redshift, SQL Server 2008/2012, Hive, Pig, Hadoop, Spark, AWS.
Confidential, Ewing Township NJ
Sr. Data Analyst / Modeler
Responsibilities:
- Performed database health checks and tuned the databases using Teradata Manager.
- Experience using MapReduce and big data tooling on Hadoop and other NoSQL platforms.
- Developed, managed and validated existing data models including logical and physical models of the Data Warehouse and source systems utilizing a 3NF model.
- Used star schema and snowflake schema methodologies in building and designing the logical data model into dimensional models (see the star-schema DDL sketch following this list).
- Developed conceptual, logical, and physical data models for the ODS and the dimensional delivery layer in Azure SQL Data Warehouse.
- Implemented logical and physical relational databases and maintained database objects in the data model using ER/Studio.
- Performed data analysis, data modeling, data migration, and data profiling using complex SQL on various source systems, including Oracle and Teradata.
- Managed logical and physical data models in the ER/Studio repository based on the different subject area requests for the integrated model; developed data mapping, data governance, transformation, and cleansing rules involving OLTP and ODS.
- Enforced referential integrity in the OLTP data model for consistent relationship between tables and efficient database design.
- Developed different kinds of reports, such as drill-down, drill-through, sub-reports, charts, matrix reports, parameterized reports, and linked reports, using SSRS.
- Created dimensional model for reporting system by identifying required dimensions and facts using Erwin r8.0.
- Extensively used ERwin for developing data model using star schema methodologies
- Collaborated with other data modeling team members to ensure design consistency and integrity.
- Involved in planning, defining, and designing the database using Erwin based on business requirements, and provided documentation.
- Validated the data of reports by writing SQL queries in PL/SQL Developer against ODS.
- Involved in user training sessions and assisting in UAT (User Acceptance Testing).
- Strong ability to develop advanced ANSI SQL queries to extract, manipulate, and calculate information to fulfill data and reporting requirements, including identifying the tables and columns from which data is extracted.
- Defined business-level (logical) metadata terms through interactions with project teams, business subject matter experts, and data analysis.
- Full life cycle of data lake and data warehouse projects with big data technologies like Spark and Hadoop.
- Designed and deployed scalable, highly available, and fault tolerant systems on Azure.
- Established and developed new database maintenance scripts to automate Netezza database management and monitoring.
- Designed Star and Snowflake Data Models for Enterprise Data Warehouse using ER Studio.
- Utilized Informatica toolset (Informatica Data Explorer, and Informatica Data Quality) to analyze legacy data for data profiling.
- Used the Kibana plugin to visualize Elasticsearch data.
- Experience in deploying, managing, and developing MongoDB clusters in Linux and Windows environments.
- Provided Linux/UNIX systems administration, operational support, and problem resolution for server systems; created shared NFS files, mounted and unmounted the NFS server and NFS clients on remote machines, shared remote file folders, and started and stopped NFS services.
- Experience in using SAS and SQL to perform ETL from Oracle and Teradata databases and create SAS datasets, SAS macros, PROC/DATA steps, and SAS formats as required.
- Involved in debugging and tuning PL/SQL code, tuning queries, and optimization for Oracle and DB2 databases.
- Worked with data investigation, discovery and mapping tools to scan every single data record from many sources.
- Good understanding and hands on experience in setting up and maintaining NoSQL Databases like Cassandra and HBase.
- Performed POCs on NoSQL databases; implemented NoSQL databases like MongoDB and Cassandra in Dev/Test environments.
- Developed several behavioral reports and data points, creating complex SQL queries and stored procedures using SSRS and Excel.
- Created several UNIX shell scripts to implement new features in the modules and update existing ones.
- Used PROC SORT, SET, UPDATE and MERGE statements for creating, updating and merging various SAS datasets.
- Generated periodic reports based on the statistical analysis of the data using SQL Server Reporting Services (SSRS).
- Generated reports using Global Variables, Expressions and Functions using SSRS.
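A minimal star-schema DDL sketch (SQL Server flavored) of the dimensional modeling described above; DimCustomer, DimDate, and FactSales and their columns are hypothetical examples rather than project objects.

```sql
-- Illustrative star-schema DDL; table and column names are hypothetical.
CREATE TABLE DimCustomer (
    CustomerKey   INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
    CustomerId    VARCHAR(20) NOT NULL,           -- natural/business key
    CustomerName  VARCHAR(100),
    Region        VARCHAR(50)
);

CREATE TABLE DimDate (
    DateKey       INT PRIMARY KEY,                -- e.g. 20240131
    CalendarDate  DATE NOT NULL,
    CalendarMonth INT,
    CalendarYear  INT
);

-- Fact table keyed by the dimension surrogate keys.
CREATE TABLE FactSales (
    DateKey       INT NOT NULL REFERENCES DimDate (DateKey),
    CustomerKey   INT NOT NULL REFERENCES DimCustomer (CustomerKey),
    SalesAmount   DECIMAL(18,2),
    UnitsSold     INT
);
```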
Environment: PL/SQL, Business Objects XIR3, ER/Studio, NoSQL, ETL tools (Informatica 8.6), Oracle 12c/11g, Azure, Teradata V2R14, Teradata SQL Assistant 12.0.
Confidential, East Hanover, NJ
Sr. Data Analyst /Modeler
Responsibilities:
- Worked with RDS, implementing models and data on RDS.
- Developed mapping spreadsheets for the ETL team with source-to-target data mappings, physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.
- Experienced in using CA Erwin Data Modeler (Erwin) for data modeling (data requirements analysis, database design, etc.) of custom-developed information systems, including databases of transactional systems and data marts.
- Designed star schema and snowflake schema for dimension and fact tables.
- Expertise in Informatica, DB2, MicroStrategy, and UNIX shell scripting.
- Designed MOLAP/ROLAP cubes on the Teradata database using SSAS.
- Used SQL for querying the database in a UNIX environment.
- Created BTEQ, FastExport, MultiLoad, TPump, and FastLoad scripts for extracting data from various production systems.
- Involved in programming UNIX shell scripts (bash, sh, ksh, etc.).
- Strong experience in developing HTML, PDF, and RTF files using SAS/ODS.
- Wrote and ran SQL, BI, and other reports, analyzing data and creating metrics, dashboards, pivots, etc.
- Gathered and analyzed business data requirements and modeled these needs, working closely with the users of the information, application developers, and architects to ensure the information models can meet their needs.
- Worked along with the ETL team to document transformation rules for data migration from OLTP to the warehouse for reporting purposes.
- Created views and extracted data from Teradata base tables, and uploaded data from Teradata tables to the Oracle staging server using FastExport.
- Developed UNIX shell programs and scripts to maximize productivity and resolve issues.
- Programmed a Perl security-checking script and verified the security and reliability of the application.
- Designed new datasets with modified values in tables using SAS statements, procedures, and SAS functions.
- Worked with the Data Vault methodology; developed normalized logical and physical database models.
- Transformed the logical data model to the physical data model, ensuring primary key and foreign key relationships in the PDM, consistency of data attribute definitions, and primary index considerations (see the Teradata DDL sketch following this list).
- Designed and Developed logical & physical data models and Meta Data to support the requirements using Erwin
- Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys), and physical database (capacity planning, object creation, and aggregation strategies) for Oracle and Teradata per business requirements using Erwin.
- Designed a third normal form target data model and mapped it to the logical model.
- Involved in extensive data validation using ANSI SQL queries and back-end testing.
- Generated DDL statements for the creation of new ER/studio objects like table, views, indexes, packages and stored procedures.
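A hedged Teradata DDL sketch of the logical-to-physical transformation and primary index considerations mentioned above; the edw.customer_account table and its columns are illustrative assumptions, not actual project objects.

```sql
-- Illustrative Teradata DDL; database, table, and column names are hypothetical.
CREATE MULTISET TABLE edw.customer_account
(
    account_id      INTEGER NOT NULL,
    customer_id     INTEGER NOT NULL,
    account_type_cd CHAR(3),
    open_dt         DATE
)
UNIQUE PRIMARY INDEX (account_id);  -- primary index choice drives AMP data distribution and join access paths
```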
Environment: SQL Server, Erwin 9.1, Oracle, Informatica, RDS, Big Data, JDBC, NoSQL, Spark, Scala, Star Schema, Snowflake Schema, Python, MySQL, PostgreSQL.
Confidential, Houston, TX
Data Analyst / Modeler
Responsibilities:
- Developed UNIX shell scripts, Perl scripts, and SQL control files to load data through SQL*Loader and Oracle Data Pump.
- Involved in using the ETL tool Informatica to populate the database, transforming data from the old database to the new database using Oracle.
- Involved in the creation and maintenance of the Data Warehouse and repositories containing metadata.
- Wrote and executed unit, system, integration, and UAT scripts in Data Warehouse projects.
- Wrote and executed SQL queries to verify that data had been moved from the transactional system to the DSS, Data Warehouse, and data mart reporting systems in accordance with requirements (see the reconciliation SQL sketch following this list).
- Excellent experience and knowledge of Data Warehouse concepts and dimensional data modeling using the Ralph Kimball methodology.
- Developed separate test cases for the ETL process (inbound and outbound) and reporting.
- Attended and participated in information and requirements gathering sessions; translated business requirements into working logical and physical data models for the Data Warehouse, data marts, and OLAP applications.
- Developed SAS programs for converting clinical data into SAS datasets using the SQL pass-through facility and the LIBNAME facility.
- Performed extensive Data Analysis and Data Validation on Teradata.
- Designed Star and Snowflake Data Models for Enterprise Data Warehouse using ERWIN
- Created and maintained the Logical Data Model (LDM) for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
- Experienced in designing and developing logical and physical data models and metadata to support the requirements using Erwin.
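A small reconciliation-SQL sketch of the source-to-warehouse verification queries referenced above; the schemas, tables, and load date (oltp.loan_transactions, dw.fact_loan_transactions, 2024-01-31) are hypothetical placeholders.

```sql
-- Illustrative reconciliation queries; schemas, tables, and dates are hypothetical.
-- Compare row counts and totals between the transactional source and the warehouse target.
SELECT 'SOURCE' AS side, COUNT(*) AS row_cnt, SUM(loan_amount) AS total_amt
FROM oltp.loan_transactions
WHERE load_date = DATE '2024-01-31'
UNION ALL
SELECT 'TARGET', COUNT(*), SUM(loan_amount)
FROM dw.fact_loan_transactions
WHERE load_date = DATE '2024-01-31';

-- List source rows that never arrived in the warehouse.
SELECT s.loan_id
FROM oltp.loan_transactions AS s
LEFT JOIN dw.fact_loan_transactions AS t
       ON t.loan_id = s.loan_id
WHERE t.loan_id IS NULL;
```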
Environment: Oracle 9i/10g, MS Visio, PL/SQL, Microsoft SQL Server 2000, Rational Rose, Data Warehouse, OLTP, OLAP, Erwin, Informatica 9.x, Windows, SQL, SQL Server, Talend Data Quality, Talend Integration Suite 4.x, Flat Files, SVN.