Data Warehouse Resume Profile
Park, CA
SUMMARY:
Data Architect/Data Modeler and Enterprise Data Warehouse/Integration/ETL/BI/Business Data Analyst consultant with over 20 years of experience and an outstanding record of delivering complex technology and system integration projects on time and within budget. Hands-on experience in coding, analysis, data modeling, design, development, deployment, testing, enhancement, and maintenance. Projects include data integration, data migration, metadata management, MDM, data architecture, data quality, and data synchronization. Implemented large, complex enterprise data warehouses using star and snowflake schemas, creating data access layers, data translation layers, operational data stores, and data marts. Full system development life cycle experience includes project planning, organization, management, implementation, infrastructure architecture, and budgeting.
- Designed and reverse-engineered enterprise systems using Ralph Kimball's multi-dimensional and Bill Inmon's approaches, OLTP/OLAP systems, and metadata modeling.
- Architected a waterfall model for point-in-time data warehouse archiving/replication, supporting data retention and faster querying of historical data.
- Domain expertise with FCC regulations (Telecom), FDA regulations (Healthcare), and FAA regulations (NASA Ames), including SOX compliance.
- Handled a large-volume data warehouse of roughly 18 terabytes on NAS storage systems with a total volume of 40 terabytes, growing by about 1 terabyte per month.
- Experienced in database-level performance tuning: design changes, SQL hints, the PL/SQL Hierarchical Profiler, splitting source data into multiple threads, parallel target loading, bitmapped indexes, database parameter changes, and SQL rewrites.
- Demonstrated ability to support multiple concurrent projects or initiatives and to deal with conflicting priorities.
- Demonstrated team building, relationship building and communication skills. Mentoring experience in a lead capacity.
- Infrastructure architecture and software setup for different environments in data centers.
- Presented architecture to senior management, project and technical training to team members, and system overviews to business units.
- Conducted in-house and client training on software applications and the tools used.
- Designed/re-architected projects yielding company savings of roughly $1 million per year.
- Provided recommendations and best practices on the design and development of the MDM sub-system, including hub design, API services, and data synchronization with non-MDM data stores.
- Used the SHA-2 cryptographic hash function for security at the NASA Data Center.
- Communicate effectively with developers, product managers, senior management, and technical, business, and product teams inside and outside the organization.
- Designed and coded an ETL framework from scratch using bash with a MySQL database (an illustrative sketch follows this list).
- Designed and coded PL/SQL ETL, converting DataStage jobs.
- Worked on state and federal government projects, including a NASA project requiring security clearance.
- Worked under the Payment Card Industry Data Security Standard (PCI DSS) at PayPal.
- Evaluated and tested products for companies launching new integration products/tools.
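The bash/MySQL ETL framework mentioned above can be illustrated with a minimal sketch; the database name, the tables (src_orders, tgt_orders, etl_run_log), and the awk transform rule are hypothetical assumptions, not the original framework.

```bash
#!/usr/bin/env bash
# Hypothetical skeleton of a bash ETL framework backed by MySQL:
# extract -> transform -> load, with each step recorded in a control table.
set -euo pipefail

DB="etl_demo"                      # illustrative database
MYSQL="mysql --batch --raw $DB"    # assumes credentials in ~/.my.cnf
RUN_ID=$(date +%Y%m%d%H%M%S)
WORK="/tmp/etl_$RUN_ID"
mkdir -p "$WORK"

log_step() {  # record each step of the run in a (hypothetical) control table
  echo "INSERT INTO etl_run_log (run_id, step, logged_at)
        VALUES ('$RUN_ID', '$1', NOW());" | $MYSQL
}

# Extract: pull source rows to a flat file.
echo "SELECT id, amount, created_at FROM src_orders;" | $MYSQL > "$WORK/orders.tsv"
log_step "extract"

# Transform: skip the header row and keep only positive amounts (placeholder rule).
awk -F'\t' 'NR > 1 && $2 > 0 {print $1 "\t" $2 "\t" $3}' "$WORK/orders.tsv" > "$WORK/orders_clean.tsv"
log_step "transform"

# Load: bulk-load the cleaned file into the target table.
echo "LOAD DATA LOCAL INFILE '$WORK/orders_clean.tsv'
      INTO TABLE tgt_orders FIELDS TERMINATED BY '\t';" | mysql --local-infile=1 $DB
log_step "load"
```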
SKILLS PROFILE:
- Databases: Oracle 11g, Informix, Microsoft SQL Server 2008, MS Access, DB2, PostgreSQL, MySQL, Hadoop
- Languages: HQL, PL/SQL, Java, XML, XSL, Pro*C, C, ESQL/C, dblib, Visual Basic, SQL*Plus, Documentum Content Management 4i (Docbasic), Perl, bash, shell
- Data Modeling: CA Erwin Data Modeler, Power Designer, TOAD
- Operating Systems: HP-UX, Sun Solaris, UNIX, Linux, Windows, VAX/VMS, Mac OSX
- Tools: TOAD, SQL Developer, SQL Navigator, Test Director, Visio, PowerPoint, Oracle Enterprise Manager, Informix ACE Report Writer, SQL*Loader, Explain Plan, Import/Export, MS Project, MS Office, Crystal Report Writer, BRIO Portal, PVCS Dimensions, NIKU, PGP, SAS adapter for IBM DataStage, SharePoint, JIRA, PayPal Data Trek, Bodhi, Code Search, VMware
- Methodology: Velocity (Informatica 9.0), SDLC, Waterfall, Agile.
- ETL/MDM Tools: Informatica Power Center, Information Analyzer, Informatica Data Quality Profiler, IBM Data Stage, Meta Stage, Quality Stage and Data Profiler, bash, PL/SQL, IBM Information Server-MDM
- EAI Tool: SeeBeyond
- Scheduler: crontab, DataStage scheduler, Informatica scheduler, AutoSys
- Business Domains: CRM (Siebel; Vantive 6.0 Help Desk/Call Center; Aurum/BAAN Sales Force Automation), Retail (GAP; Williams-Sonoma RETEK Retail Merchandising System), Biotech/Healthcare/Pharmacy (Pfizer, Scios), Finance (Wells Fargo Wholesale and Retail Products), Telecom (BellSouth, Southwestern Bell/AT&T), Property & Casualty Insurance (Allianz), Aerospace (NASA Ames Research Center), Wage Compensation/Medical Billing (State of CA), Online Payment Systems (eBay/PayPal)
PROFESSIONAL EXPERIENCE:
Confidential
Data Warehouse
Worked on the Confidential team on ETL pipelines moving data from sources and staging into targets; created a wiki for all the BI pipelines. The Offers pipelines cover online and in-store ad campaigns allowing real-time bids. The SMB pipelines cover revenue for small and medium businesses. The Revenue Analytics pipeline for sales employees at Facebook includes bookings and revenue per region per salesperson, week over week, with quarter totals for the DSO (Direct Sales) and MMS (Mid-Market) dashboards, using Salesforce and GL as sources; the dashboards also include forecast goals and goals for the quarter. The Purchase and Spend pipelines for transactions include chargebacks from PayPal. Responsibilities included modifying existing fbetl Python scripts to improve performance and creating new mappings.
Environment: Oracle 11g, Hadoop, Hive Query Language (HQL), MySQL, fbetl (Facebook ETL Python-based framework), Chronos Scheduler.
Confidential
Data Architect
- Project lead for a team of 5 offshore and 4 onshore mappers performing the target-to-source mappings.
- Worked on the PayPal data mart design for the History Services Data Model (HSDS) and the architecture for financial payments. HSDS provides online payment data for customers and merchants, with a 30-minute delay from the source Payments 2.0 Virtual Objects System, in data mart fact tables. Legacy data is inserted from the Payments 1.0 OLTP system as an initial load into HSDS. Data is sharded into different regions and partitioned and sub-partitioned within each region, with dimensions in every region, supporting multiple currencies, accounts, and financial instruments (ACH, credit card, debit card, etc.) for activities, transactions, money flows, fees, payouts, refunds, and reversals (an illustrative partitioning sketch follows this entry). PayPal treasury transactions were handled differently from customer transactions. GoldenGate is used to capture and deliver updates to critical information as the changes occur.
- Worked on Merchant Reporting, Subscriptions, and US Monthly Transaction Statements for design and mapping efforts that will ultimately be reported from HSDS. Worked very closely with all SMEs (Subject Matter Experts) in the Payments 1.0 and 2.0, user, Bill Me Later (BML), and fraud groups on detailed requirements and understanding of business needs for BI reporting using services.
- Worked with the QA team to set up a test environment running payment use cases to check payment flows.
Environment: Oracle 11g, MS SQL Server 2008, PowerDesigner, SharePoint, PowerPoint, Visio, PayPal home-grown tools (Data Trek, Code Search, Bodhi), Informatica Power Center 9.0, GoldenGate
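The region sharding and range/hash partitioning described above can be illustrated with a hedged sketch; the table, columns, partition names, and connection string are hypothetical and do not reflect the actual HSDS model, assuming an Oracle 11g target reachable through sqlplus.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: create a composite-partitioned payment fact table of the
# kind used per region shard. All object names are illustrative only.
set -euo pipefail

ORA_CONN="hsds_owner/${ORA_PWD:?set ORA_PWD}@//dbhost:1521/HSDSDEV"

sqlplus -s "$ORA_CONN" <<'SQL'
-- Range partition by transaction month, hash sub-partition by account id,
-- mirroring "partitioned and sub-partitioned within regions".
CREATE TABLE fact_payment_txn (
  txn_id        NUMBER        NOT NULL,
  account_id    NUMBER        NOT NULL,
  txn_ts        TIMESTAMP     NOT NULL,
  currency_code VARCHAR2(3)   NOT NULL,
  instrument    VARCHAR2(20)  NOT NULL,  -- ACH / credit card / debit card ...
  amount        NUMBER(18,2)  NOT NULL
)
PARTITION BY RANGE (txn_ts)
SUBPARTITION BY HASH (account_id) SUBPARTITIONS 8
(
  PARTITION p2012m01 VALUES LESS THAN (TIMESTAMP '2012-02-01 00:00:00'),
  PARTITION p2012m02 VALUES LESS THAN (TIMESTAMP '2012-03-01 00:00:00'),
  PARTITION pmax     VALUES LESS THAN (MAXVALUE)
);
EXIT
SQL
```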
Confidential
Project Lead Engineer/Data Architect
- Project lead for a combined onshore/offshore team of 120 developers and QA for the data migration effort from Wachovia to Wells Fargo.
- Worked on the Wachovia-Wells Fargo merger integration project for Wholesale Products with two offshore/onshore vendors, Persistent and CSC.
- Responsible for the onshore migration team at Wells Fargo for go-forward product entitlements at the company, product, user, account, and service levels, migrating templates and payments from Microsoft SQL Server to Oracle using Informatica Power Center 8.0 with Oracle as the backend.
- Member of the product team validating BRDs and CRs and checking that mapping documents and Informatica mappings conformed to the business requirements provided by both vendors for approval.
- Designed and coded reporting tables for business users' entitlement reports for the product teams. Created audit log and audit error tables for auditing every run in each environment.
- Conducted detailed code walkthroughs of UNIX scripts, PL/SQL code, and Informatica mappings. Reviewed the entire game plan for the different runs. Worked on performance tuning of the code and suggested changes to re-architect mappings. Wholesale products worked on include ACH Payments, ACH DR, ACH INQ, TIR, WIRES, IPP, RPP, ARP, Lockbox, WFED, WLI, Basic Banking, Event Messaging, Desktop Deposit (DTD), RDC, Credit Management, BAI/SWIFT (FTS), SIS, IMAGE, SN, Gift Card, Payment Manager, DAS TO DES, FXOL, FUS, WIRE INQ, WDE, International ACH (XACH), SWEEPS, SELF ADMIN, and EBOX.
- Worked with the QA team to validate development code; the QA team's scripts were developed independently of the development team based on the BRDs.
Environment: Oracle 10g, Informatica Power Center, Erwin, SharePoint, PowerPoint, Visio, UNIX scripting, Microsoft SQL Server, Informatica 9.0
Confidential
Data Architect / Data Modeler/Project Lead
- Worked at NASA on the Air Traffic Management (ATM) systems data warehousing project using PostgreSQL, MySQL, Oracle 11g, Erwin dimensional modeling (including OLTP metadata modeling), Java 1.6 on Red Hat Linux 5.2, Mac OS X, and bash and Perl scripting.
- Responsibilities also included data center work on infrastructure setup of database servers and NAS storage servers for the NASA Moffett and DFW locations.
Project 1:
- ATM plays a critical role in the National Airspace System (NAS): flight-planning decisions have a direct impact on the efficiency and safety of the resulting traffic flows and on contingency plans for events that could arise en route. These decisions also have an important impact on an airline's operating costs.
- Designed the ATM dimensional model using Bill Inmon's approach and an OLTP metadata model, with partitioning by airport, by month, and by day (using sub-partitions), and by id (using hash). Created audit log and audit error tables for auditing data and metadata, including use of the SHA-2 cryptographic hash function for security (an illustrative sketch follows this project's bullets).
- Infrastructure architecture, including install setups supporting multiple airports (DFW, ATL, etc.) with NASA Ames (Moffett) as the main hub.
- Integrated the Surface Operations Data Analysis and Adaptation (SODAA) tool with Oracle 11g, which stores airport surface adaptations and terminal-area data for different airports. SODAA facilitates searching, visualizing, and analyzing airport surface and terminal-area data, with the goal of improving understanding of airport surface and terminal-area operations.
- Migrated existing PostgreSQL data to Oracle 11g and used MySQL to store metadata.
- Worked very closely with various NASA business groups for business requirements and data analysis for ATM data.
- Worked as a DBA, installing Oracle databases on different servers, creating schemas, and performing monitoring and tuning.
- Worked on a proof of concept (POC) with Oracle Data Guard.
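A hedged illustration of the SHA-2-based auditing noted above; the landing directory, the audit table, and its columns are hypothetical assumptions, using GNU sha256sum and sqlplus on the load host.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: record a SHA-256 checksum for each incoming data file
# in an audit table before loading, so loads can later be verified.
set -euo pipefail

LANDING_DIR="/data/atm/landing"            # illustrative path
ORA_CONN="atm_audit/${ORA_PWD:?set ORA_PWD}@//dbhost:1521/ATMDEV"

for f in "$LANDING_DIR"/*.dat; do
  [ -e "$f" ] || continue                  # nothing to audit
  sum=$(sha256sum "$f" | awk '{print $1}')
  sqlplus -s "$ORA_CONN" <<SQL
INSERT INTO audit_file_load (file_name, sha256_hash, load_ts)
VALUES ('$(basename "$f")', '$sum', SYSTIMESTAMP);
COMMIT;
SQL
done
```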
Project 2:
- The NASA NextGen Air Traffic Management System near-real-time data warehouse includes the following sources: LDM (Load Data Manager); RUC (Rapid Update Cycle); CIWS (Corridor Integrated Weather System); Center Radar Capture; TRACON Radar Capture; ASDI (Aircraft Situation Display to Industry); CTAS (Center TRACON Automation System), including CM SIM data; NFDC (National Flight Data Center); ACES (Adaptation Controlled Environment System); ITWS (Integrated Terminal Weather System); OTTR (Operational TMA/TBFM Repository), a report of national TMA metering usage and flow configuration information; the SMS CM file, an ASCII-based output file, one file per adaptation set per 6 hours; AODB (Airport Operational Database), one file per airport per system restart (generally restarted daily); METAR, once an hour between 50 minutes past the hour and the top of the next hour (all observations taken within this window are considered part of the same cycle); and ASDE-X binary output, one file per day. The binary weather data files are stored in BLOBs, partitioned by month, with accompanying metadata tables, and are accessed through web services as a front end for researchers and analysts. Large initial data volumes of roughly 18 terabytes are loaded into separate data marts, with monthly volumes of about 1 terabyte.
- Designed the NASA NextGen Air Traffic Management System warehouse based on Ralph Kimball's approach using Oracle 11g, with partitioning by month, sub-partitioning by id, and indexing by day.
- Architected initial and daily data loads using a bash-scripted data translation layer to move data from the file system to the landing zone and from there into the Oracle data warehouse (an illustrative sketch follows this project's Environment line).
- Created a data access layer for enterprise data, including metadata, and for SODAA.
- Linux source files from DFW are synchronized to Ames servers using rsync, and landing-zone files in the Ames data center are synchronized back to DFW servers using rsync.
- Created metadata files, loaded into the database, that include research metadata and technical load information.
- Worked very closely with various NASA Weather business groups for business requirements and data analysis for collecting all source weather data.
Environment: Oracle 11g, Bash, Perl, PostgreSQL, SharePoint, PowerPoint, Visio, MySQL
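A hedged sketch of the rsync synchronization and landing-zone-to-Oracle load flow described in Project 2; the host names, directories, and SQL*Loader control file are hypothetical, assuming rsync over SSH and sqlldr available on the warehouse host.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the translation layer: pull new source files from the
# remote site with rsync, then load them into staging with SQL*Loader.
set -euo pipefail

REMOTE="etl@dfw-host:/data/atm/outbound/"   # illustrative remote source
LANDING="/data/atm/landing"                 # illustrative landing zone
CTL="/etc/atm/weather_stage.ctl"            # illustrative SQL*Loader control file
ORA_CONN="atm_stage/${ORA_PWD:?set ORA_PWD}@ATMDEV"

# 1. Synchronize new files from DFW into the Ames landing zone.
rsync -az --partial "$REMOTE" "$LANDING/"

# 2. Load each new file into the staging schema, then archive it.
mkdir -p "$LANDING/loaded"
for f in "$LANDING"/*.dat; do
  [ -e "$f" ] || continue                   # no files to load this cycle
  sqlldr "$ORA_CONN" control="$CTL" data="$f" log="${f%.dat}.log"
  mv "$f" "$LANDING/loaded/"
done
```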
Confidential
Product Design Specialist/ETL Architect/Data Architect
- Worked as a design specialist on an enterprise metrics data warehousing project for property and casualty insurance (snowflake schema), as well as infrastructure architect for the ETL environments for development, test, certification, and production.
- Involved in managing the Data Tools group covering tools such as PGP, IBM MetaStage, Informatica, SAS, and DB2.
- Worked on requirements, data architecture, data integration tool selection, project planning, and budgeting for MDM (Master Data Management), including data governance, as a solution for consolidating customer and product data at the enterprise level.
- Created best-practice documents for requirements and mappings, and naming-convention standards for database objects and Informatica mappings/fields.
Environment: DB2, SAS, Data Stage 7.1.3, Quality Stage 5.0, Meta Stage 5.0, Data Profiler, All Fusion Erwin Data Modeling 7.2, Informatica 7.1, MS Project, Informatica Data Quality Profiler, IBM InfoSphere- Information Server-MDM
Confidential
Data Warehouse Architect/Informatica Architect
- Architected the Informatica tools (Power Center and Power Analyzer) on different servers, including user and security setup for the different environments. Involved in repository setup, promoting code across environments, and versioning of objects.
- Involved in business analysis and modeling of the HTS (high-throughput screening) mart for experiment data, mapping source and target matrices, and data cleansing and data profiling activities on source and target objects.
- Worked with the ETL team on loading data from various sources into staging, the data warehouse, and data marts using Informatica Power Center 7.1.x (Repository Manager, Designer, Server Manager, Workflow Manager, and Workflow Monitor).
- Extensively used Informatica debugger to validate mappings and to gain troubleshooting information about data and error conditions.
- Coded loads of large data volumes into fact tables, on the order of 1.2 billion rows, using source partitioning and parallel partitioning at the target. Used multiple sessions with the bulk-load option for performance.
- Power Center performance tuning included design changes, splitting mappings into multiple mappings, changing cache values, running multiple sessions, using parallel partitions at different stages of the mappings at the session level, and carrying only the fields needed at the target from the source.
Environment: Oracle 9i, Toad, HP UX, Sun Solaris, Windows 2000, Informatica Power Center Advanced Edition AE 7.1.x, Power Analyzer, Informatica Data Profiler, Erwin Data Modeling
Confidential
- Responsible for the ETL process, including data extraction, cleansing, aggregation, reorganization, transformation, calculation, and loading of assay data. Responsibilities included data modeling using TOAD to implement a star schema data warehouse and data marts for experimental data on Oracle. Created search capabilities to drill up from the data mart to the data warehouse. Created a dynamic pivot for the data mart that was efficient and scalable because the source data was split and processed in multiple parallel threads (an illustrative fan-out sketch follows).
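A hedged sketch of the multi-threaded source processing mentioned above; the chunk size, directories, and the load_chunk.sh wrapper are illustrative assumptions, using standard split and xargs to fan work out across parallel workers.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: split a large source extract into chunks and load the
# chunks in parallel, the same fan-out idea as "run in multiple threads".
set -euo pipefail

SRC_FILE="/data/assay/assay_extract.csv"   # illustrative source extract
WORK_DIR="/data/assay/chunks"
PARALLEL=4                                  # number of concurrent loaders

mkdir -p "$WORK_DIR"
# Split into ~500k-line chunks; each chunk becomes one unit of parallel work.
split -l 500000 "$SRC_FILE" "$WORK_DIR/chunk_"

# ./load_chunk.sh is an assumed wrapper that loads one chunk (e.g. via sqlldr).
ls "$WORK_DIR"/chunk_* | xargs -n 1 -P "$PARALLEL" ./load_chunk.sh
```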
Confidential
Project Engineer
- The RETEK Customer Order Management (RCOM) business-to-consumer solution provides a complete, integrated, comprehensive, Oracle-based enterprise solution for direct-to-consumer order management needs.
- RETEK software integrates order management functions (call handling, order entry, customer service, fulfillment, and accounting) through a seamless, unified, near-real-time process.
- RETEK Merchandising System (RMS): includes key functions such as inventory management, purchasing, pricing, promotions management, and replenishment.
- Responsibilities included integration of RMS and RCOM, migration of RCOM module data from AS/400 to Oracle 9i, and creation of reference and foundation base data for the HE (Hold Everything) concept, including business functionality testing of the module, batch processing, and publish/subscribe messages over the RIB, using Pro*C, SQL, and shell scripting. Also involved in production support for RCOM and RMS and monthly customer loads into RCOM. Worked with the offshore team on business requirements.
Environment: Oracle 9i, PL/SQL packages, UNIX scripting, Toad, Pro*C, XML, SeeBeyond, PVCS Dimensions, NIKU, Sun Solaris, Windows 2000
Confidential
- A speech-recognition-based solution to automate recruiting of hourly workers. Responsibilities included writing SQL scripts for PostgreSQL.
Confidential
ETL/Data Architect/Project Lead Engineer
Project lead for a team of 25, including developers, the QA team, the run team, business staff, and UNIX and DBA admins.
- PMAP-NG (Performance Measurement and Analysis Platform - Next Generation) is an enterprise-wide data warehouse project using Ralph Kimball's star schema approach, designed to support BellSouth's entry into the long-distance market. The PMAP-NG project captures various mechanized and non-mechanized BellSouth source system data feeds, stores the data uniformly and consistently, and then makes it available to satisfy the reporting and analytical requirements of the long-distance entry effort.
- Current efforts include reporting Service Quality Measurements (SQM), creating 271 filing packages, and calculating and initiating payment of Self-Effectuating Enforcement Mechanism (SEEM) remedies.
- The SQM reports focused on measurements of BellSouth service to CLECs as compared with BellSouth service to BellSouth retail customers.
- The raw data is used to recreate the Performance Measurement reports posted by BellSouth on the PMAP web site and to enable CLECs to create custom reports and disaggregated Performance Measurement reports.
- Raw data covers pre-ordering, ordering, provisioning, maintenance and repair, billing, and DB updates. Raw data files contain detailed CLEC information about specific local service requests (LSRs), service orders, trouble tickets, and other items reported on the BellSouth Performance Measurement and Analysis Platform (PMAP) web site. Redesigned the 271, SQM (Service Quality Measurement), and raw data marts into one consolidated mart using Oracle 9i.
Confidential
- Involved in the design, code development, testing, and coordination of the team, as well as support of the system.
- Coordinated with project managers, business analysts, the run team, quality assurance, SCM, auditors, and developers on final deliverables and iterative evolutions in the development, regression, and production environments.
- Dimensional modeling using Ralph Kimball's data warehouse approach to support various metrics for the 271/SQM and raw data monthly filings to the FCC/CLECs.
- Worked with auditors for functional analysis of business data and auditing.
- Reduced run times from 4 hours 15 minutes to 6 minutes per measure (previously 15 minutes for SQM, 30 minutes for raw data, 180 minutes for 271 across all states, and 30 minutes for the raw data validation scripts - RDVS) using optimization/tuning techniques.
- Reduced expenses by $1,000,000 annually for BellSouth by cutting data load times, resources, and database maintenance through re-architecting the design and consolidating the 271, SQM, and raw data marts.
- Reduced license charges by $1,000,000 annually for BellSouth by replacing DataStage jobs with Oracle packages, shell scripts, and cron entries to fire the daily jobs (an illustrative sketch follows this entry's Environment line).
- Responsible for PMAP-NG with a team of 25 developers, QA teams, run teams, and auditors, successfully deploying the project and thereby supporting BellSouth's entry into the long-distance market in 9 states.
- Provisioning and troubles (maintenance and repair): involved in enhancements and maintenance of the IBM Ascential DataStage jobs for the above, as well as redesign, development, testing, and support for the same using Oracle 9i and cron-fired shell scripts. Worked with the DBA on tuning Oracle database queries.
- This application included measurements/data from LENS and TAG for CLECs and RNS and ROS for BellSouth. It defines the average response time and response intervals (pre-ordering/ordering) as the average times and number of requests responded to within certain intervals for accessing legacy data associated with appointment scheduling, service and feature availability, address verification, telephone number requests, and customer service records (CSR), using Oracle 9i.
- Application tuning: PL/SQL performance tuning by adding hints and making query, design, and database changes, and overseeing DBA and build activities.
- Conformed to FCC regulations by writing RDVS (Raw Data Validation Scripts) test scripts to verify that raw data matches SQM (Service Quality Measurements) data and 271 data.
- Implemented automated workflows for batch processing of jobs, notifications, and scheduling. Created audit log and audit error tables for auditing.
- Worked with database administrators to create composite partitioning for large data volumes and bitmapped indexes for performance. Created read-only materialized views for aggregated data mart data and created pivot tables.
- Wrote QA validation scripts for data accuracy to conform to FCC and PSC regulations and CLECs' requirements.
Environment: Oracle 9i, Erwin Data Modeling, IBM DataStage, PL/SQL packages, UNIX scripting and cron
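A hedged sketch of the DataStage-to-Oracle-package replacement pattern described above; the package name, procedure, connection string, log directory, and schedule are hypothetical, assuming the daily job is a PL/SQL package procedure invoked from a cron-fired shell wrapper.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper fired by cron: runs a daily PL/SQL package procedure
# in place of a retired DataStage job and captures a log per run.
# Illustrative crontab entry (05:00 daily):
#   0 5 * * * /opt/pmap/bin/run_daily_load.sh >> /var/log/pmap/cron.log 2>&1
set -euo pipefail

ORA_CONN="pmap_etl/${ORA_PWD:?set ORA_PWD}@PMAPPRD"
LOG="/var/log/pmap/daily_load_$(date +%Y%m%d).log"

sqlplus -s "$ORA_CONN" <<'SQL' > "$LOG" 2>&1
WHENEVER SQLERROR EXIT FAILURE
-- pkg_daily_load.run_all is an assumed package procedure, not the real one.
EXEC pkg_daily_load.run_all;
EXIT
SQL
echo "Daily load completed $(date)" >> "$LOG"
```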