Data Architect Resume
Jersey City, NJ
SUMMARY:
- A performance-driven professional with over 16 years of experience in Data Warehousing Architecture, Data Model Design, ETL Flow Design, Server Architecture Design, Hardware Storage Platforms, Application Development (Waterfall/Scrum), Testing, Implementation & Support (L1-L3).
- Worked for 14 years in New York’s Financial District & Silicon Valley.
- Built delivery teams in offshore/onsite models; managed stakeholders and project execution across geographic locations; managed team scale-up/scale-down strategies, supplier/vendor management, project costing models (T&M, Fixed Price, Managed Services), and team appraisals.
- Big Data Solution Architecture - deep understanding of data volume & performance issues, loading issues, data access & distribution; cloud computing methodologies - Hadoop, MapReduce, Pig analytics, Hive, MDM; database administration - backup/recovery, data planning, performance strategies, hardware planning; virtualization - VMware, testing using Oracle VM, Oracle Express; ETL; NzAdmin 6.x; IBM Netezza 1000.
- Metadata Management - metadata-driven designs.
- MDM; Data Governance; Data Profiling; Data Stewardship; Support Models; SLAs; Risk Management & Monitoring planning for strategic Data Warehouse environments.
- ETL - Ab Initio, Informatica 8.6/9.1, OWB, Perl/Java-based tools; performance tuning, parallel loads.
- Data Warehouse Design - ERwin, Kimball dimensional model, Type 2/Type 1 model loads, column-based architecture, time-series models, in-memory databases - KDB+, TimesTen; stored procedures, MPP (Netezza).
- Schedulers - Autosys (JIL), Control-M, crontab, Oracle DBMS_JOB/DBMS_SCHEDULER.
- Unix/Linux - Red Hat Linux system builds, shell scripting (bash/ksh/csh/AWK/SED), C, Perl scripting, Java.
- Managed up to 85 people across geographic locations; planned work allocation, milestones, and delivery schedules; handled conflicts & communications, technical issues, and escalations; provided weekly updates to senior management.
- Project Management - JIRA, MS Project, Scrum/sprint cycles, Planview; work estimation - Critical Path Method, PERT; resource & demand forecasting - Delphi & time-series models (exponential smoothing); roadmaps.
PROFESSIONAL EXPERIENCE:
Confidential, Jersey City, NJ
Data Architect
Environment: Netezza Twin Fin, Oracle 11g, Informatica 8.6/9.x, Autosys, Linux, Java/.Net
Responsibilities:
- Studied the existing system documentation and interviewed the application support team and business users to understand the migration plan, gaps, and support needs of the existing application.
- Worked with the New York- and London-based Programme Director team to understand offshore expectations and the resource demand timeline. Created a roadmap for project deliveries and a skill matrix, arranged temporary fills through vendors (such as Accenture), estimated and sized the project for approval, and prepared the business case and proposal for presentation when required.
- Prepared an offshore/onsite handshake delivery plan and worked on securing stakeholder support.
- Created a three-tier Data Governance model: Architecture & Data Quality Oversight, Development, and RTB (Production Support). Data Stewards are assigned per subject area or business area and respond within 2 days to data issues (failed ETL loads or other DQ issues).
- Implemented a Kimball star-schema Type-2 data warehouse model with about 30 dimensions and 15 facts, and growing. Loading was implemented using complex Informatica ETL mappings and dynamic parameter files driven by metadata stored in Oracle database tables. Used dimensions, degenerate dimensions, conformed dimensions, SCD Type-2 dimensions, time dimensions, bridge tables, surrogate keys, bitmap indexes, etc.
- Most of our projects are executed in short agile cycles, with a few on the Waterfall model. We used Scrum with 2-4 week sprint cycles for our Java application team: a remote Product Owner (PO) maintains the Product Backlog (book-of-work) as users create JIRA items, and a local Scrum Master (SM) facilitates user-story selection for the development team. We held weekly Product Backlog grooming sessions to estimate, streamline, and plan the next two sprints, daily reviews for updates and challenges, and maintained burndown charts to track remaining work over time against the ideal baseline.
- Work with 6 regional teams globally - Singapore, London, NY, Hong Kong, Prague, and Tokyo - to plan a global rollout using a single code base for Oracle/Netezza and Informatica.
- Drive the Solution Forum for technical implementations, project issues, architectural changes, weekly dimensional model changes (e.g., adding new columns), and performance-tuning issues in SQL or Informatica - e.g., using a Sorter before an Aggregator, optimizing the Joiner cache, partitioning, pushdown optimization, etc.
- Ran a POC using Hadoop, MapReduce, and Pig to compute big-data analytics on large volumes of data stored in HDFS (Hadoop Distributed File System); the results were very encouraging. We demonstrated how the map phase emits key/value pairs that the framework sorts and shuffles before the reduce phase aggregates them at high speed (a streaming-style sketch appears after this list).
- Provide Level 2 & 3 production support for our global data warehouse. Developed a Java-based Batch Monitor that reads Autosys logs and database log tables in real time and provides a dashboard of all job statuses. Netezza development is done using nzsql and IBM Aginity Workbench for Netezza.
- Data resides on Netezza TwinFin blade servers and Oracle 11g servers. ETL loads are performed with Informatica across multiple nodes in 3 different regions; mappings are triggered from Unix bash shell scripts via Autosys, with parameters drawn from database metadata tables that are used to create the dynamic parameter file for Informatica (see the parameter-file sketch after this list).
- Netezza tuning is done using zone maps, CBTs (clustered base tables), GROOM, NzAdmin, and CTAS with distribution on the correct key (a CTAS sketch appears after this list).
- Staffed the delivery team: worked with vendors, reviewed CVs, set up interviews, screened candidates, made hiring decisions, rolled out offers, and onboarded staff, full-time or contract. For contractors, worked with the supplier management team on MSAs (master service agreements), site clearances, compliance requirements, etc.
- Prepared the project plan and kicked off the project, including setting up the content structure in Confluence (for the artifacts produced during the project) and the components, tasks, and fix versions in JIRA (for project execution and tracking). Worked with steering committee members to prioritize the book-of-work and assign resources.
- The global rollout of the strategic data warehouse is nearly complete - 2 regions done, 1 to go, all successful. The new data warehouse is already, and on completion will fully be, the golden source of all data for regulators, compliance, internal reporting, financial P&L reporting, regional project reporting, market data, and asset-class reporting.
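The MapReduce POC above can be illustrated with a Hadoop Streaming job: the mapper emits key/value pairs, the framework sorts them by key, and the reducer aggregates. This is a minimal sketch under assumed inputs - the HDFS paths, the streaming jar location, and the trades-per-symbol analytic are hypothetical placeholders, not the actual POC code.

```bash
#!/bin/bash
# Hypothetical Hadoop Streaming sketch: count trades per symbol.
# The mapper emits "symbol<TAB>1"; Hadoop sorts by key; the reducer sums.
# Input path, output path, and the CSV layout are placeholders.

cat > mapper.sh <<'EOF'
#!/bin/bash
# Emit key/value pairs: symbol (2nd CSV field) and a count of 1.
awk -F',' '{print $2 "\t" 1}'
EOF

cat > reducer.sh <<'EOF'
#!/bin/bash
# Input arrives sorted by key; sum the counts per symbol.
awk -F'\t' '{c[$1] += $2} END {for (k in c) print k "\t" c[k]}'
EOF
chmod +x mapper.sh reducer.sh

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
  -input   /data/trades/2013-06-01 \
  -output  /analytics/trade_counts \
  -mapper  mapper.sh \
  -reducer reducer.sh \
  -file    mapper.sh \
  -file    reducer.sh

# Inspect the aggregated results written to HDFS.
hadoop fs -cat /analytics/trade_counts/part-*
```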
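The metadata-driven trigger mechanism above can be sketched roughly as follows: a bash wrapper (invoked by Autosys) reads parameter name/value pairs from an Oracle metadata table, writes the dynamic parameter file, and starts the workflow with pmcmd. The table etl_param_meta, the workflow/folder names, and the connection details are hypothetical, not the production objects.

```bash
#!/bin/bash
# Hypothetical sketch: build a dynamic Informatica parameter file from Oracle
# metadata and start the workflow via pmcmd. All names below are placeholders.

RUN_DATE=$(date +%Y%m%d)
PARAM_FILE=/tmp/wf_load_dims_${RUN_DATE}.param

# Pull parameter name/value pairs for this workflow from a metadata table.
sqlplus -s etl_user/"${ETL_PWD}"@DWHPRD <<EOF > "${PARAM_FILE}.raw"
SET PAGESIZE 0 FEEDBACK OFF HEADING OFF
SELECT param_name || '=' || param_value
FROM   etl_param_meta
WHERE  workflow_name = 'wf_load_dims';
EXIT
EOF

# Write the parameter file in the section format Informatica expects.
{
  echo "[Global]"
  echo "[FOLDER_DW.WF:wf_load_dims]"
  cat "${PARAM_FILE}.raw"
} > "${PARAM_FILE}"

# Kick off the workflow; Autosys would invoke this wrapper script.
pmcmd startworkflow -sv INT_SVC -d DOMAIN_DW -u "${INFA_USER}" -p "${INFA_PWD}" \
      -f FOLDER_DW -paramfile "${PARAM_FILE}" -wait wf_load_dims
```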
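Below is a rough illustration of the CTAS/distribution-key style of Netezza tuning mentioned above, run through nzsql; the database, table, and column names (fact_trade, account_id) are hypothetical.

```bash
#!/bin/bash
# Hypothetical Netezza tuning sketch: redistribute a fact table on a better key
# using CTAS, then groom and refresh statistics. Names are placeholders.

nzsql -d DWHDB -u nzuser <<'EOF'
-- Rebuild the table distributed on the join/filter key instead of RANDOM.
CREATE TABLE fact_trade_new AS
SELECT * FROM fact_trade
DISTRIBUTE ON (account_id);

-- Swap the tables once the copy is verified.
ALTER TABLE fact_trade RENAME TO fact_trade_old;
ALTER TABLE fact_trade_new RENAME TO fact_trade;

-- Reclaim space from logically deleted rows and refresh planner statistics.
GROOM TABLE fact_trade RECORDS ALL;
GENERATE STATISTICS ON fact_trade;
EOF
```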
Confidential
Data Architect
Environment: Oracle 11g, Informatica 8.6, Linux, Autosys, Java, J2EE
Responsibilities:
- Met various stakeholders, project team members, Architectural teams, data source and reporting teams to understand existing issues and plans.
- My plan was threefold: people, projects, and delivery against an agreed SLA or timeline - hiring the right people and keeping them engaged in their career plans, right-sizing projects, designing solutions, and delivering to the agreed SLA.
- Encouraged all teams to provide estimates for all tasks and projects - initially ballpark, then fine-tuned with a WBS. We used various models (CCPM, PERT, Delphi/expert opinion, etc.), captured all tasks in a single book-of-work location, and then worked with stakeholders to prioritize tasks and assign resources accordingly.
- Travelled to multiple locations for project initiations and stakeholder updates.
- Provided weekly project status updates to stakeholders; discussed issues & concerns, prioritized urgent requests, and weighed minor projects against major project cycles.
- Put in place both proactive and reactive (SOX, regulatory) data governance. Recorded best practices and embedded them across teams and projects; maintained the details in the intranet wiki/Confluence.
- On the strategic front, Data Warehouse Governance & Control had three elements: executive sponsorship/oversight via fortnightly/monthly meetings; the three parties - data owners, data stewards, and data beneficiaries; and the process that brings them together. Our governance model helped broker dialogue among these constituencies so that data collection, management, and use could achieve optimal value.
- Worked with the architecture team to redesign the complete Group Risk Data Warehouse data model and ETL load models, which had been facing serious performance issues for a long time.
- Data profiling of column data types, sizes, nulls, ranges, etc. was driven by metadata tables as a pre-ETL process (see the profiling sketch after this list).
- Data lineage and logical mapping were retained by Informatica transformations in its repository.
- Worked on getting an impact analysis across applications and departments done prior to initiating the projects.
- It was an 18-month project execution with a large team of FTEs and contractors; it was executed well, with only a minor delay caused by data-center delays.
- Reviewed ETL and data warehouse architectures; planned data-model changes, data collection, job scheduling via cron & Autosys, Perl & shell scripting, and Java server applications.
- On the BaU front, we had continuous requirement changes (CDRs, CRs) that we estimated, agreed, completed, implemented, and supported post-implementation. Project learnings were incorporated. Planned phased releases of strategic platform changes, grouping CRs into major/minor releases and re-architectures.
- Hands-on Unix scripting, database programming in PL/SQL, Informatica mapping and session tuning, Autosys box-job changes, and releases across environments.
- Hired staff; worked with suppliers/vendors; inspected vendor locations as pre-compliance audits.
- The GRDW (Group Risk Data Warehouse) pulls data through a staging area using Informatica mappings, which are triggered via Autosys and cron jobs. The DWH has multiple schemas for maintaining history data, current data, and data-loading metadata.
- Liquidity data is fed via ETL into the Treasury Data Mart to compute market liquidity and funding liquidity as part of the Pillar 2 liquidity risk requirement.
- Post-implementation of the data warehouse, most of the work was support-related BaU, which we rolled out to third-party vendor locations at a much lower cost to the bank.
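The metadata-driven pre-ETL profiling mentioned above might look roughly like the sketch below: a PL/SQL loop over a metadata table that reports row counts, null counts, and min/max per registered column. The table dq_profile_meta, the staging schema, and the connection details are hypothetical.

```bash
#!/bin/bash
# Hypothetical pre-ETL profiling sketch: for each column registered in a
# metadata table, report row count, null count, and min/max before loading.

sqlplus -s dq_user/"${DQ_PWD}"@GRDWPRD <<'EOF'
SET SERVEROUTPUT ON SIZE 100000
DECLARE
  v_sql   VARCHAR2(4000);
  v_rows  NUMBER;
  v_nulls NUMBER;
  v_min   VARCHAR2(100);
  v_max   VARCHAR2(100);
BEGIN
  FOR m IN (SELECT table_name, column_name
            FROM   dq_profile_meta
            WHERE  active_flag = 'Y')
  LOOP
    v_sql := 'SELECT COUNT(*), SUM(CASE WHEN ' || m.column_name || ' IS NULL THEN 1 ELSE 0 END),'
          || ' MIN(' || m.column_name || '), MAX(' || m.column_name || ')'
          || ' FROM stg.' || m.table_name;
    EXECUTE IMMEDIATE v_sql INTO v_rows, v_nulls, v_min, v_max;
    DBMS_OUTPUT.PUT_LINE(m.table_name || '.' || m.column_name
          || ': rows=' || v_rows || ' nulls=' || v_nulls
          || ' min=' || v_min || ' max=' || v_max);
  END LOOP;
END;
/
EXIT
EOF
```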
Confidential, New York
Consultant - Technical Architect
Environment: Oracle 11g, Informatica, Linux, Autosys, Java
Responsibilities:
- Studied the existing enterprise data warehouse system design and worked closely with developers, guiding them and doing hands-on PL/SQL, ETL, and Unix scripting.
- Reviewed, revised, and documented data warehouse and ETL architecture changes.
- Suggested changes to stored procedures and SQL queries and tuned the load process.
- Created a Data Governance model and walked the teams through a structured approach based on regulatory & compliance expectations.
- Worked with my manager on project and milestone planning, resource requirement planning, etc.
- Worked on Oracle performance issues and suggested revised SGA/PGA/large-pool memory sizing for the OLTP and data warehouse instances (see the memory-sizing sketch after this list).
- Worked with the DBA team to create 11g upgrade strategies that helped application teams with phased migration planning, testing, validation, and implementation; defined security and audit strategies for SOX and corporate policies, documenting each step.
- Finally we had one ODS, one monthly online archive DB, and one 12-month archive DB, along with standby & failovers. Used 3NF/2NF design methodologies for the ODS design in ERwin.
- Time-series database application: a Java-based application used by cash traders on a near real-time basis to fetch market data from executions was very slow, taking 15-18 minutes on average to fetch 1 million records from the KDB+ database. I tuned the Java application - essentially rewriting it with heavy multi-threading and adjusting a few JVM performance parameters - and the fetch of 1 million records came down to about 40 seconds, a major performance improvement.
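The kind of SGA/PGA/large-pool resizing suggested for the warehouse instance is illustrated below. The values are hypothetical examples for an spfile-managed instance, not the actual figures recommended at the client.

```bash
#!/bin/bash
# Hypothetical sketch of Oracle memory resizing for a data warehouse instance.
# Values are illustrative only; real sizing depended on workload and host RAM.

sqlplus -s / as sysdba <<'EOF'
-- Check current memory settings and the PGA advisory before changing anything.
SHOW PARAMETER sga_target
SHOW PARAMETER pga_aggregate_target
SELECT * FROM v$pga_target_advice;

-- Warehouse-style sizing: larger PGA for hash joins/sorts, a modest large pool
-- for parallel-execution message buffers and RMAN.
ALTER SYSTEM SET sga_target = 8G SCOPE = SPFILE;
ALTER SYSTEM SET pga_aggregate_target = 6G SCOPE = SPFILE;
ALTER SYSTEM SET large_pool_size = 512M SCOPE = SPFILE;
-- Bounce the instance during a maintenance window for the SPFILE changes.
EXIT
EOF
```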
Confidential
Data Architect & Manager, Data Warehouse Team
Environment: Oracle 10g, Informatica, Linux, Autosys, Perl, Java, Business Object XI
Responsibilities:
- Studied the existing Enterprise Data warehouse System design and then planned changes that were implemented.
- Created an enterprise data warehouse architecture encompassing hardware, software, switches, and locations: Dell PowerEdge 2950 quad-core servers for a 2-node RAC plus one standby in a remote location; SAN RAID 10 storage; failover standby implemented with Oracle Data Guard 10g; ETL using Perl/shell scripting, etc.
- Once the strategic architecture was approved by the board and the hardware acquired, worked with the networking and hardware teams to install each of the systems and then configure Oracle RAC, Data Guard, the data model, schema design, security, ETL load scripts, and cron-based scheduling.
- Data sources were Bloomberg & Reuters market data, 2 years' worth of tick data (a few terabytes), and an internal PostgreSQL trading database going back 3 years, which were recovered and loaded into the new data warehouse.
- Finally we had a PostgreSQL OLTP system and an Oracle data warehouse that had failover and remote standby DR, well tested for Production.
- Put Oracle backups and Enterprise Manager monitoring alerts and tuning in place. Oracle backup using RMAN was tested and implemented, and we also tested block-level recoveries (see the RMAN sketch after this list).
- Later implemented the BI architecture using BusinessObjects XI on a Windows 2003 domain controller with Windows AD authentication for SSO (single sign-on), in a 2-node setup with multiple folders to isolate data visibility and reports for the different departments - Sales, Marketing, Finance, Research, Compliance, etc.
- Hiring: in the initial stage, requested head count and hired a DBA, 5 developers, and a BO XI admin.
- Post-implementation, we handed support responsibilities to the networking and support team, after which the team worked in BaU mode on minor change requests and Level 2-3 support.
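A minimal sketch of the RMAN backup and block-level recovery testing described above; the datafile and block numbers are hypothetical, and the BLOCKRECOVER syntax assumes Oracle 10g.

```bash
#!/bin/bash
# Hypothetical RMAN sketch (Oracle 10g): full backup with archived logs and
# controlfile autobackup, plus a block-level recovery drill. The datafile and
# block numbers are placeholders taken from v$database_block_corruption.

rman target / <<'EOF'
CONFIGURE CONTROLFILE AUTOBACKUP ON;
BACKUP DATABASE PLUS ARCHIVELOG;
BLOCKRECOVER DATAFILE 7 BLOCK 1324;
EXIT;
EOF
```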
Confidential
Data Architect & Manager
Environment: Oracle 10g, Linux, Autosys, Perl, Java, Actimize
Responsibilities:
- Initial days were spent in getting to know the stakeholders - Desk Heads, Compliance Supervisors, RCG-Regulatory Control Group, Data Source Teams, vendors’ resource managers.
- Studied the existing systems through multiple rounds of walk-through sessions; met with team members and created a matrix of skills, backgrounds, interests, and working relationships.
- Working with management, I presented my resource requirements for the transition to the new team - NY FTEs, Kean contractors, TCS contractors in India, Morgan Stanley employees in Mumbai, etc.
- For BaU development and support, created a weekly on-call rota of primary/secondary production support assignments for our developers, with myself as the escalation point.
- Continued development of existing regulatory reports - Rule 606, OATS, Rule 92, Employee Trading, ATS-R, Reg NMS, 1% Market Volume, Short-Sell Locate, pink sheets, 5% market making, and other NYSE, NASD, AMEX, and TSX reporting.
- Held regular meetings with project stakeholders - compliance and RCG teams, along with onsite and offshore engineers - to walk through development progress, testing strategies, test-case results, milestones achieved, issues & concerns, new change requests from users, estimations & project impacts, risks & mitigations, regulatory audits, and new projects.
- Hired staff in vendor locations, working with vendors on contracts & SLA.
- Within about 10-11 months we successfully separated our systems by ring-fencing the surveillance systems with separate login IDs, servers, test & QA environments, repositories, database schemas, Unix environments, and schedulers.
- Later created a production support model to outsource Level 1 & 2 support to Kean in Canada as part of a low-cost negotiation and savings for the bank. As part of this process, I created the run book, heat map, and instructions for the support staff, and remained on standby for 6-8 weeks as they came up to speed.
- Once the separation and support structure were in place, our team was moved across business units to the Global Legal IT and Compliance team under a Managing Director.
- In our new home we found a much smaller Equities Trade Surveillance team under a VP that delivered using the Actimize tool, and a decision was made to migrate all our reports to Actimize. With that decision and cost-conscious management, I decided to find greener pastures elsewhere.
Confidential
Team Lead & Data Warehouse Architect
Environment: Oracle 9i, Ab Initio, Linux, Autosys, Perl, Java
Responsibilities:
- Initial days were spent reading design documents and getting to know the OLTP databases, data, tools, user requirements, issues & concerns, wish lists, etc.
- Proposed a POC (proof of concept) for both Sybase IQ and Oracle and secured a Sun Fire 25K machine with 500GB of SAN space, where I created two partitions for Oracle 9i and Sybase IQ. Installed the server software and drivers; created the test plan and test scripts/SQL; loaded data from the production environment, multiplying the data many times to create the test volume. Ran the tests, captured the performance metrics, and presented them to management; we decided on Oracle 9i as our strategic database.
- Data warehouse design: proposed the Kimball dimensional schema for the market data DB, which was ultimately implemented. We used Type-2 change data capture for historical data retention, daily partitioning, local/global indexes, and bitmap indexes to handle a 20-30+ GB daily feed (see the DDL sketch after this list).
- Historical data were stored in separate read-only tablespaces.
- Data sources were Bloomberg & Reuters market data for exchanges all over the world - North America, Asia 1 and Asia 2, Latin America, Europe - captured throughout the day as each time zone closes.
- The ETL load process for the market data feed and other internal feeds was built using Ab Initio GDE graphs.
- Metadata was maintained in the Ab Initio Enterprise Meta>Environment (EME), which can store both business and technical metadata. EME metadata can be accessed from the Ab Initio GDE, a web browser, or the Ab Initio Co>Operating System command line (air commands).
- Co>Operating System is an Ab Initio program that runs on top of the operating system and is the base for all Ab Initio processes.
- For change requests, we check out from the EME datastore into individual sandboxes and lock a graph before making any change. Used various types of parallelism - pipeline, data, and component - along with Aggregation and Rollup, Reformat, SCP/FTP, Updater, Lookup File, Intermediate File, and Run SQL components, running in phases. It's one of the best ETL tools and among the simplest to develop on.
- ETL load: data were loaded on an end-of-day and intra-day basis, leaving very little window for maintenance.
- We classified each piece of project work as minor or major, with implementation cycles ranging from 30 days to 3-6 months. Everything started with a BRD (business requirements document), a kickoff, and agreed estimates, followed by functional and technical specs, build, unit testing, UAT, sign-off, and implementation.
- As part of implementation planning, we held many rounds of meetings with the production DBA and Unix support teams to secure their support and time, and built the run book for Level 1.
- Post-implementation, we were responsible for Level 2 & 3 support for any production failures or data issues.
- We delivered the Oracle market data warehouse and the ETL processes to load data intra-day and at EOD, then built ETL processes to deliver data to user groups in various formats - csv, txt, xml, dat.
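The Type-2 history retention, daily partitioning, and bitmap indexing described above are sketched below in Oracle 9i-style DDL. The security dimension, price table, columns, and partition dates are hypothetical placeholders rather than the actual schema.

```bash
#!/bin/bash
# Hypothetical Oracle 9i DDL sketch: a Type-2 dimension with effective-date
# history and a daily range-partitioned price table with a local bitmap index.

sqlplus -s dw_owner/"${DW_PWD}"@MKTDW <<'EOF'
CREATE TABLE dim_security (
  security_key    NUMBER        NOT NULL,   -- surrogate key
  security_id     VARCHAR2(20)  NOT NULL,   -- natural/business key
  exchange_code   VARCHAR2(10),
  effective_date  DATE          NOT NULL,   -- Type-2 validity window start
  expiry_date     DATE,                     -- null while the row is current
  current_flag    CHAR(1)       DEFAULT 'Y',
  CONSTRAINT pk_dim_security PRIMARY KEY (security_key)
);

CREATE TABLE fact_price (
  security_key  NUMBER NOT NULL,
  trade_date    DATE   NOT NULL,
  close_price   NUMBER(18,6)
)
PARTITION BY RANGE (trade_date) (
  PARTITION p20040102 VALUES LESS THAN (TO_DATE('2004-01-03','YYYY-MM-DD')),
  PARTITION p20040103 VALUES LESS THAN (TO_DATE('2004-01-04','YYYY-MM-DD'))
);

-- Local bitmap index on the fact foreign key for star-query performance.
CREATE BITMAP INDEX bx_fact_price_seckey ON fact_price (security_key) LOCAL;
EXIT
EOF
```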
Confidential
Data-warehouse Developer
Responsibilities:
- Supported various applications in the Portfolio Management Division; provided Level 2 & 3 production support for database systems & batch jobs, including 3 CMC support for the FRITS application at Chase Manhattan Bank.
- Worked with an early version of the COGNOS BI tool in the Private Banking Division; executed projects per SDLC processes; involved in documenting, coding, and performance tuning of procedures/packages.