- Seeking an opportunity to apply considerable hands-on technical skills in the full cycle of enterprise data warehouse development (architecture, design, acquisition, normalization, enhancement, refreshment and deployment), coupled with a polished management style, extensive domain business experience and out-of-the-box thinking, to help deliver a world-class data platform.
- Defines the vision, strategy, and roadmap for a data management platform that meets the customer and market dynamics of a robust enterprise, including such architectural elements as Big Data provisioning, Data Governance, Master Data Management (MDM), self-service BI, advanced analytics, and customized visualizations.
- Implements data warehouse and data visualization (BI and Analytics) solutions for heterogeneous data sources that emphasize the exploitation of Big Data (250+ TB) using various database technologies and tools, leading many of these efforts as the Senior Architect.
- Works, through discrete technical projects, with organizations that require a seasoned professional with broad domain experience in fields such as Banking, Energy, Financial Services, Healthcare, Insurance, Manufacturing, Pharmaceuticals, Retail and Telecom, along with the ability to articulate the technical vision needed to meet those market opportunities.
- Shifts the focus and reusability of data from sets of stand-alone operational silos to common universal interface mechanisms (APIs) that promote concurrent access by advanced specialized technologies presenting Data Visualization, Geospatial and Text Mining (NLP) paradigms, as well as by analytical modeling products such as SAS, R and MATLAB.
- Develops hands-on POC solutions, using data supplied by partners and customers, that demonstrate how the desired functionality can be configured and how it would perform if deployed in a production environment. Typical examples, for Confidential, include P&C claims fraud prediction, Health Care provider scoring and Mortgage fraud analysis.
- Participates in efforts to define future state architectures and roadmaps including architecture standards, design patterns, guidelines, and industry best practices around cloud technologies such as those offered through IBM and AWS, focused on their adoption to support evolving business initiatives. Emphasis placed on collaboration with infrastructure and application groups to assess business readiness for cloud technology adoption, manage stakeholder’s expectations, and promote organizational change and integration into a Service Management framework.
- Identifies opportunities to expand beyond a single relational Data Warehouse structure into a Big Data hub with the integration of Hadoop (HDFS) and Data Virtualization. Specifically considered how both HDFS and Denodo published views can be an efficient complementary staging and ETL source for the existing data access.
- Possesses a proven track record of building 21st-century shared services technology platforms that encapsulate business intelligence and data analytics functions, sometimes from scratch.
- Years of working with teams of developers and business partners in fast-paced enterprise environments. Persistent and industrious, with a demonstrated ability to produce high-quality products. Possesses excellent problem-solving, communication and documentation skills.
- Data Architecture
- Business Intelligence - Analytics
- Data Mining
- Multi-dimensional Analytics - OLAP
- Data Modeling
- Entity Resolution and Network Analytics - MDM
- Data Visualization
- Text Mining
- Data Governance
- N-tiered Cloud Internet infrastructure
Databases: MS SQL Server, Oracle, DB2, EMC (Greenplum), MySQL, MS Access, Redshift, Snowflake and Sybase
Multi-Dimensional: MS Analysis Server (SSAS), Hyperion Essbase, OBIEE, and SAP BW
Business Intelligence: BI reporting (ODS, EDW, DM, OLAP, ROLAP, MOLAP), Qlik, Tableau
Data Movement: Data Provisioning and Transformation (ETL/ELT), MS Integration Server (SSIS), Import/Export, Talend, Informatica, SAS, Alteryx
Other: ABACUS, Meta-Data Repository (MDM), SQL Tuning and Optimization, TCP/IP, OOD
ERP: Oracle Financials, SAP R3, and Indus Passport
Operating Systems: Microsoft® Windows® for x64, Red Hat Enterprise Linux for x64, Middle Tier, JRE, Amazon Web Services (AWS), and Mainframe OS platforms
Big Data Architecture:
Enterprise Data Warehouse: Hive (SQL), ORC (columnar), Cassandra (NoSQL)
Data Access: Pig (script), MapReduce (batch)
Data Analysis: Hue, Spark
Data Movement: Kafka, Avro, Sqoop
HDP Core: HDFS, YARN
HDP Search: Solr
Operational Data Store: HBase
Security: Ranger, Knox, Kerberos
Operations: Oozie (scheduling), ZooKeeper
Sr. Data Architect, Vice President IT
- As Lead Architect, contributed to the delivery of a modern-day data strategy for Confidential (BOW) and its parent Confidential Group that consolidates “Data Islands” into a coherent enterprise data architecture through the development of a 100+ Confidential Data Warehouse (EDW) and its transition from an Oracle-centric platform to a Big Data hub, along with the necessary support services.
- This single repository of data is designed, at minimum, to meet the regulatory compliance requirements of both the Confidential Act ( Confidential ) and the Basel II accord. Specifically, it supports regulatory insight into capital adequacy, stress testing, liquidity, and counterparty risk, as well as the business reporting requirements of the SEC and the FRB associated with securitization and the pledging of asset (loan) related collateral, respectively.
- The EDW is the common location for all key information assets where conformance and data quality standards are imposed. It combines data from more than 20 different major source systems from across the lines of business and the architecture rationalizes the source system differences into a consistent model.
- The core of the data hub is the data architecture that the tools implement, rather than HDFS itself. The goal is a data architecture that is neither limited by nor dependent on any particular set of services, and vice versa.
- The EDW and Hadoop co-exist and complement each other. The hub provides for read-write use at each level of the data hierarchy (raw, conformed, and access).
- The platform addresses the demands of cloud integration with hybrid environments, drawing on an identified variety of multi-structured data sources organized through a searchable data catalog. It incorporates a virtualization abstraction between access and distributed storage that enables selective integration between physical and virtual data.
- It protects PII and sensitive business information by deploying a metadata-only catalog over appropriately isolated, redacted and tokenized physical data.
- It supports the range of data latencies, in and out, from streaming to bulk.
- The platform architecture is schema-on-read; it does not enforce requirements and models up front, and it can address massively scalable volume with considerable variety.
- It offers a broad analytics capability that uses the right data (save raw data now, analyze it later).
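The schema-on-read principle described in these bullets can be sketched in a few lines; this is a minimal illustration with hypothetical field names and records, assuming newline-delimited JSON in the raw zone, not the production implementation.

```python
import json

# Raw zone: records land as-is; no schema is enforced at write time.
raw_records = [
    '{"acct": "A-100", "bal": "2500.75", "ts": "2019-03-01"}',
    '{"acct": "A-101", "bal": "invalid", "extra_field": true}',
]

def read_with_schema(lines):
    """Apply the schema only at read time: unknown fields are tolerated,
    and records with bad values are quarantined rather than rejected
    at ingest, so the raw data is never lost."""
    good, quarantined = [], []
    for line in lines:
        rec = json.loads(line)
        try:
            good.append({"acct": str(rec["acct"]),
                         "bal": float(rec["bal"])})
        except (KeyError, ValueError):
            quarantined.append(rec)
    return good, quarantined

good, bad = read_with_schema(raw_records)
```

The key design point is that the same raw store can serve several read-time schemas later, which is what makes "save raw data now, analyze later" possible.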
- ITG is a professional services practice delivering practical twenty-first-century business solutions that infuse innovation and strategic differentiation into new and existing product sets by incorporating advanced architectures such as predictive analytics over very large datasets.
- Researches how advanced data technologies, such as column-oriented MPP and in-memory NoSQL databases, complement and integrate effectively with an existing enterprise data warehouse.
- Typical deliverables include technology roadmaps that define a path of development, incorporating technology platform architecture and application feature functionality, for new and/or enhanced products, services, and/or optimized business processes to exploit big data; as well as procedures and guidelines focused on security and privacy in the data warehouse, intended to adhere to Data Governance standards around PII and HIPAA data, including data use process mapping, data classification, data stewardship and data retention.
- Most recent projects focused on delivering the Architectural elements needed to support the Final Rule required by section 165 of the Confidential Wall Street Reform and Consumer Protection Act.
- These standards include liquidity, risk management, and capital adequacy.
Director Data Warehouse
- Using extensive, hands-on experience in the full cycle of modeling data preparation, achieved the goal by extending the data management platform and its warehouse to incorporate over 300 heterogeneous sources, fed through a Data Lake, and to serve as the central repository for data used in sophisticated analytics by model developers.
- Provided BI and Analytics operating metrics to senior management and others through concise, accurate and attractive reports, graphs and dashboards, using enterprise business intelligence/portal technologies including Qlik, Tableau and Confidential. Metrics included Competitive Analysis, Market Share, Market Penetration, as well as Revenue and Burn.
- Continually researched new data sources and established relationships with third-party data vendors to develop partnerships for data integration into Verisk products.
- Created templates that apply industry standards and best practices across over 300 ETL packages to facilitate the asynchronous loading and incremental refreshment of data from diverse sources such as Operations, Customers, Partners, Vendors and Public.
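An incremental-refresh template of this kind can be sketched as a watermark-driven extract; the source name, row shape, and feed function below are hypothetical placeholders for illustration, not the actual package definitions.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SourceConfig:
    name: str
    watermark: datetime  # high-water mark of the last successful load

def extract_incremental(cfg, fetch_rows):
    """Pull only rows changed since the last load, then advance the
    watermark so each source can be refreshed asynchronously."""
    rows = [r for r in fetch_rows(cfg.name)
            if r["updated_at"] > cfg.watermark]
    if rows:
        cfg.watermark = max(r["updated_at"] for r in rows)
    return rows

# Hypothetical source feed standing in for a real connector.
def fetch_rows(source_name):
    return [
        {"id": 1, "updated_at": datetime(2019, 1, 1)},
        {"id": 2, "updated_at": datetime(2019, 2, 1)},
    ]

cfg = SourceConfig("operations_feed", datetime(2019, 1, 15))
delta = extract_incremental(cfg, fetch_rows)
```

Because each source carries its own watermark, hundreds of packages built from one template can load independently without coordinating a full refresh.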
- Selectively used the data sources to provide model-ready analytic data sets that comprise large, complex structures (e.g. 5+ data sources, 10M+ rows, 100+ columns, multiple appended derived values).
- Hands-on developer/contributor of practical reusable artifacts such as MDM Entity Resolution (Who Is Who) and Network Analytics (Who Knows Who) based on relational set theory and using complex stored procedures.
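The set-based “Who Is Who” matching idea can be illustrated conceptually; the original artifacts were complex stored procedures, so this Python sketch with made-up records and identifier rules is only an analogue of the technique, not the delivered code.

```python
# Entity resolution sketch: records belong to one entity when they
# share a strong identifier (a tax ID, or a normalized name+DOB key).
def entity_keys(rec):
    keys = set()
    if rec.get("tax_id"):
        keys.add(("tax", rec["tax_id"]))
    if rec.get("name") and rec.get("dob"):
        keys.add(("nd", rec["name"].strip().lower(), rec["dob"]))
    return keys

def resolve(records):
    """Group records into clusters; a record that bridges two
    clusters causes them to merge (union-find style)."""
    clusters = []        # list of lists of records
    key_to_cluster = {}  # identifier key -> cluster index
    for rec in records:
        keys = entity_keys(rec)
        hits = sorted({key_to_cluster[k] for k in keys
                       if k in key_to_cluster})
        if hits:
            target = hits[0]
            for other in hits[1:]:  # merge bridged clusters
                for moved in clusters[other]:
                    for k in entity_keys(moved):
                        key_to_cluster[k] = target
                clusters[target].extend(clusters[other])
                clusters[other] = []
        else:
            target = len(clusters)
            clusters.append([])
        clusters[target].append(rec)
        for k in keys:
            key_to_cluster[k] = target
    return [c for c in clusters if c]

recs = [
    {"name": "Jane Doe", "dob": "1980-01-01", "tax_id": "111"},
    {"name": "JANE DOE ", "dob": "1980-01-01"},   # matches on name+dob
    {"name": "John Roe", "dob": "1975-05-05", "tax_id": "222"},
]
groups = resolve(recs)
```

The same clustering, with record pairs replaced by edges, underpins the “Who Knows Who” network analytics mentioned above.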
- Achieved these results while maintaining an unshakeable commitment not to violate internal or external customers' implicit or explicit assumptions about the privacy of their data. Evangelized protected-data and privacy procedures in compliance with corporate privacy and security policies. Authored the “Verisk Analytical Framework Data Privacy and Security”, articulating the procedures VRSK follows when handling PII and HIPAA data.
- Established current and long-range objectives, plans, and policies, including technologies used, phases and schedules for implementation, and new product lines. Created technical budgets and project plans, allocated resources, and determined the schedule of product releases and project deadlines.
- Implemented the initial production release within a period of six months employing Agile methods. Led and managed a team of 50+ highly-skilled software engineers, database programmers, systems administrators and web developers consisting of Confidential employees augmented with selectively outsourced onshore-offshore services, under unified direction and control.
- As the Chief Architect, led the development of web-service-based, e-commerce branded application solutions using such technologies as Java/J2EE, XML, HTTP and RDBMS. Continues to identify architectural direction for expanding software capabilities, focusing on scalability, cross-platform support and business intelligence reporting.
- Unified relevant BI technologies (e.g. data warehousing, OLAP data marts and cubes, easy-to-use dashboard and reporting tools) through a proprietary Java-based Business Intelligence component designed to work with multi-dimensional data structures generated from software provided by Hyperion, Microsoft, SAP and Oracle, in addition to the other leading relational database vendors (Oracle, SQL Server, DB2, and Sybase).
- A sample solution: Implemented a Supply Chain Management solution (UnipollPlus) for managing liquid inventories in the Oil, Gas and Chemical industries. It is a good example of a Confidential deployment of contemporary e-commerce systems technologies, including web site and database search, web servers, application servers, catalog and order management technology, application integration middleware, and ERP interfacing. It is an end-to-end solution allowing customers to graphically analyze bulk fluid inventories and tank telemetry data, access and manage workflow routes used to generate wireless alerts and reports, and translate supply chain events into seamless business transactions (e.g. threshold-based reorders), all via a standard web browser. By harnessing the power of the data collection processes, supply chain information such as bulk inventory levels, usage rates, and, where implemented, custody transfers is readily available. Beyond a single enterprise view, suppliers and distributors can deliver timely information to their customers as a competitive advantage; in today's market it is critical for suppliers to closely manage inventories and effectively communicate status throughout the supply chain.
Director, Professional Services
- Reporting to the CEO, created a professional services practice devoted to supplementing the licensed software offering and generating revenue ($5M annually, on top of license fees) commensurate with an early-stage company. Responsibilities included growing the professional services department from 4 to 25 employees over a two-year period with a $1.9M budget.
- Exploited the thin-client certification for SAP Business Warehouse to create local language oriented “universes”. Customers can bring forward historical information from SAP BW and other existing systems into a service-oriented environment. These reusable services, which leverage the Enterprise Business Intelligence technology, simplify access to complex packaged applications, and accelerate dashboard, single-view, ad-hoc reporting, and service-oriented architecture (SOA) projects. They are also a powerful alternative to the EDI and Data transformation (ETL) services available in the marketplace.
- A sample engagement: Chief Architect and Lead Data Engineer for a 21st-century version of the PG&E Energy Management System (EMS). The computers monitoring the electricity supply chain gather real-time telemetry information and feed it (using bespoke ETL) into an Oracle data warehouse. The multi-terabyte warehouse rapidly delivers gigabytes of historical and near real-time data, spanning 365-730 days, in the form of energy-demand reports and electrical diagrams, to 20,000 PG&E employees.