
Investment Data Marts BI Consultant Resume

SUMMARY:

  • A Big Data Analytics developer with security clearance and over 20 years of experience delivering feasible cross-platform Big Data Analytics, Semantic Layer, MDM, ETL and Metadata Management solutions, and steering and managing Big Data Analytics, EDW and BI projects over extremely large and complex data vaults, primarily for major financial, telecommunications, public-sector and energy corporations. Experienced in Hadoop, HDFS, Spark Streaming, Spark MLlib Machine Learning Library, Hive, HiveQL, HBase, Impala, Cassandra, NoSQL, Scala, Python, R, Java, Sqoop, Oozie, ERwin, PowerDesigner, ETL, InfoSphere DataStage, QualityStage, MDM, DB2 z/OS, DB2 LUW, DB2 IDAA, Oracle, Teradata and Cognos
  • As an HBase Developer, led a Big Data Analytics team and delivered key analytics solutions for the CRA (Canada Revenue Agency) in a Hadoop, HDFS, Spark Streaming, MapReduce, Spark MLlib Machine Learning Library, Hive, HiveQL, HBase, Impala, Cassandra, NoSQL, Kafka, Scala, Python, R, Java, Sqoop, Oozie, DB2 z/OS, DB2 LUW and DB2 IDAA environment

PROFESSIONAL EXPERIENCE:

Confidential

Investment Data Marts BI Consultant

Responsibilities:

  • Led and managed the CRA (Canada Revenue Agency) HBase Big Data Analytics development team
  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an HBase Big Data Analytics architecture based on the Zachman Framework
  • Participated in the Information Management team's planning process to keep projects aligned with the enterprise Big Data Analytics systems roadmap
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams to provide advisory services on Big Data Analytics management tools and procedures
  • Developed and managed the HBase data structure hierarchy of table regions, column families, rows, column qualifiers and cells as a conceptual and physical key-value multidimensional map model, improving partitioning performance with a bucketing concept
  • Designed, identified, built and documented Apache Impala Metastore service in Apache Derby
  • Prepared, installed and configured distributed HBase with RegionServers, ZooKeeper QuorumPeers and backup HMaster servers on multiple servers in the cluster
  • Designed and developed collection of relevant social media data from JSON objects into HBase data structures via REST API web services
  • Set up Impala as a bridge over the underlying HBase data structures for data exchange and integration with DB2
  • Used Apache Phoenix as an SQL-to-HBase-scans bridge that produces regular JDBC result sets for data exchange and integration with legacy BI tools
  • Designed, implemented, optimized and documented NoSQL HBase Big Data ingestion with MapReduce, Spark Streaming, Scala, Python, PySpark, Java, Sqoop, Oozie, NiFi, Phoenix and HiveQL
  • Developed efficient, optimized Scala and Python code that tames Spark Streaming complexity within the broader Hadoop and HBase data management platform to connect, ingest, govern, secure and manage data for Spark Machine Learning models and near real-time analytics
  • Developed Scala and Python coding best practices for HBase Big Data ingestion with Spark Streaming as reusable code standards for coaching less experienced developers
  • Designed, implemented, optimized and documented an asynchronously triggered Big Data Analytics data pipeline between OMEGAMON Performance Expert (OMPE), DB2 z/OS, Hadoop and HBase using the Apache Kafka publish/consume streaming platform, RESTful API web services, Scala and Python (see the ingestion sketch after this list)
  • Designed, implemented and documented Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms for CRA tax fraud and anomaly detection with Scala, Python and R
  • Designed and developed near real-time updates of the Spark Machine Learning Classification, Clustering and Collaborative Filtering tax fraud and anomaly detection models with relevant social media data from the HBase datastore, using Spark Streaming with Scala, Python and R (see the clustering sketch after this list)
  • Developed coding best practices for Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms as reusable Scala and Python code standards for coaching less experienced developers
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including the Git version control system, application patching, utilization monitoring and upgrade recommendations
  • Managed and coached data architects, data modelers, developers, analysts, managers and executives
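A minimal Scala sketch of the Kafka-to-HBase ingestion path referenced above (the ingestion sketch). The broker address, topic, group id, table name and column family are illustrative assumptions, not the CRA configuration; the calls are the standard spark-streaming-kafka-0-10 and HBase client APIs.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToHBaseIngest {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("KafkaToHBaseIngest"), Seconds(10))

    // Kafka consumer settings; broker, group id and topic are placeholders
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "ingest-demo",
      "auto.offset.reset"  -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Write each partition's records to HBase (table "events", column family "d" assumed)
    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("events"))
        records.foreach { rec =>
          val put = new Put(Bytes.toBytes(rec.key))   // Kafka message key (assumed non-null) as row key
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(rec.value))
          table.put(put)
        }
        table.close(); conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Writing inside foreachPartition keeps one HBase connection per task rather than one per record, which is the usual way to keep micro-batch ingestion cheap.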
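And a hedged sketch of clustering-based anomaly scoring of the kind referenced above (the clustering sketch), using the DataFrame-based Spark MLlib API; the input path, feature columns, cluster count and distance-based scoring rule are illustrative assumptions rather than the production models.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object TaxAnomalySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("TaxAnomalySketch").getOrCreate()
    import spark.implicits._

    // Hypothetical curated feature table; the real features would come from the HBase/Hive layer
    val returns = spark.read.parquet("/data/returns_features")

    val assembler = new VectorAssembler()
      .setInputCols(Array("reported_income", "deductions", "refund_amount"))
      .setOutputCol("features")
    val features = assembler.transform(returns)

    // Cluster the population, then flag records far from their cluster centre as anomalies
    val model   = new KMeans().setK(8).setSeed(42L).setFeaturesCol("features").fit(features)
    val centers = model.clusterCenters
    val distToCenter = udf { (v: Vector, cluster: Int) => Vectors.sqdist(v, centers(cluster)) }

    val scored = model.transform(features)
      .withColumn("anomaly_score", distToCenter($"features", $"prediction"))

    scored.orderBy($"anomaly_score".desc).show(20)   // review the most unusual returns first
    spark.stop()
  }
}
```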

Confidential

HBase Developer

Responsibilities:

  • Led and managed the Confidential HBase Big Data Analytics development team
  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an HBase Big Data Analytics architecture based on the Zachman Framework
  • Participated in the Information Management team's planning process to keep projects aligned with the enterprise Big Data Analytics systems roadmap
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams to provide advisory services on Big Data Analytics management tools and procedures
  • Developed and managed the HBase data structure hierarchy of table regions, column families, rows, column qualifiers and cells as a conceptual and physical key-value multidimensional map model, improving partitioning performance with a bucketing concept (see the table-design sketch after this list)
  • Designed, identified, built and documented Apache Impala Metastore service in Apache Derby
  • Prepared, installed and configured distributed HBase with RegionServers, ZooKeeper QuorumPeers and backup HMaster servers on multiple servers in the cluster
  • Designed and developed collection of relevant social media data from JSON objects into HBase data structures via REST API web services
  • Designed, implemented, optimized and documented NoSQL HBase Big Data ingestion with MapReduce, Spark Streaming, Scala, Python, PySpark, Java, Sqoop, Oozie, NiFi, Phoenix and HiveQL
  • Developed efficient, optimized Scala and Python code that tames Spark Streaming complexity within the broader Hadoop and HBase data management platform to connect, ingest, govern, secure and manage data for Spark Machine Learning models and near real-time analytics
  • Developed Scala and Python coding best practices for HBase Big Data ingestion with Spark Streaming as reusable code standards for coaching less experienced developers
  • Designed, implemented and documented Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms for the Predict Store Sales models with Scala, Python and R
  • Designed and developed near real-time updates of the Spark Machine Learning Classification, Clustering and Collaborative Filtering Predict Store Sales models with relevant social media data from the HBase datastore, using Spark Streaming with Scala, Python and R
  • Developed coding best practices for Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms as reusable Scala and Python code standards for coaching less experienced developers
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including the Git version control system, application patching, utilization monitoring and upgrade recommendations
  • Managed and coached data architects, data modelers, developers, analysts, managers and executives
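A hedged sketch of the pre-split, salted table layout behind the bucketing concept referenced above (the table-design sketch), using the HBase 2.x admin API; the table name, column families and bucket count are illustrative assumptions.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ColumnFamilyDescriptorBuilder, ConnectionFactory, TableDescriptorBuilder}
import org.apache.hadoop.hbase.util.Bytes

object CreateSalesTable {
  val Buckets = 16

  // Prefix the natural key with a two-digit bucket id so writes spread evenly across regions
  def saltedRowKey(naturalKey: String): String =
    f"${math.abs(naturalKey.hashCode) % Buckets}%02d-$naturalKey"

  def main(args: Array[String]): Unit = {
    val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val admin = conn.getAdmin

    // One region per bucket: split points "01-", "02-", ..., "15-"
    val splits: Array[Array[Byte]] = (1 until Buckets).map(b => Bytes.toBytes(f"$b%02d-")).toArray

    val table = TableDescriptorBuilder.newBuilder(TableName.valueOf("store_sales"))
      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("txn"))   // transaction detail cells
      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("agg"))   // pre-computed aggregate cells
      .build()

    admin.createTable(table, splits)
    admin.close(); conn.close()
  }
}
```

Salting the row key this way trades ordered scans for even write distribution across RegionServers, which is what the bucketing improvement above aims at.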

Confidential

Master Data Management Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an enterprise MDM architecture based on the Zachman Framework
  • Defined and documented, in Microsoft Project and Visio, the Waterfall process milestones (Define, Measure, Analyze, Improve, Control) and the Waterfall process core tasks (Project Definition, Requirements Definition, Design, Build, Test, Implement, Transition to Operations)
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata with IBM InfoSphere Metadata Manager & Business Glossary
  • Installed, Initially Populated and Administered IBM InfoSphere MDM Server, managed Data Associations, applied Rules of Visibility, configured External Validation, customized Code Tables and Error Messages for Party, Account and Product domains
  • Designed relational, denormalized and multidimensional (consolidated data sources, analytical data, referential data, regulatory compliance data) MDM DB2 target conceptual, logical and physical data models and data structures, and re-engineered data sources with ERwin
  • Designed, implemented and documented Data Profiling with IBM InfoSphere Information Analyzer
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM InfoSphere QualityStage
  • Designed and documented best practices and a Change Data Capture (CDC) prototype for the MDM initial (bulk) and daily (delta) loads
  • Designed, implemented, optimized and documented massive and complex ETL data processing for the OCIF, AML and other transactional data sources' initial loads and daily feeds for the Party, Account and Product domains with IBM CDC for z/OS, IBM InfoSphere DataStage and the CDC Stage
  • Designed, implemented, optimized and documented MDM transactional services (web services) with WSDL, XML, IBM WebSphere, IBM Rational RSA and RAD, Java EE (J2EE), EJB, JPA (Java Persistence API), JPQL and JDBC (see the JDBC sketch after this list)
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
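A minimal Scala sketch of the JDBC read path referenced above, against DB2; the JDBC URL, credentials, table and column names are placeholders and not the IBM InfoSphere MDM Server schema, and the IBM Data Server JDBC driver is assumed to be on the classpath.

```scala
import java.sql.DriverManager

object PartyLookupSketch {
  def main(args: Array[String]): Unit = {
    // DB2 JDBC URL format: jdbc:db2://<host>:<port>/<database>; host, port and credentials are placeholders
    val conn = DriverManager.getConnection("jdbc:db2://db2host:50000/MDMDB", "mdmuser", "secret")
    try {
      // Hypothetical party table and columns; the real MDM Server schema differs
      val stmt = conn.prepareStatement(
        "SELECT PARTY_ID, LAST_NAME, PROV_STATE FROM PARTY WHERE LAST_NAME = ? ORDER BY PARTY_ID")
      stmt.setString(1, "Smith")
      val rs = stmt.executeQuery()
      while (rs.next())
        println(s"${rs.getLong("PARTY_ID")}  ${rs.getString("LAST_NAME")}  ${rs.getString("PROV_STATE")}")
      rs.close(); stmt.close()
    } finally conn.close()
  }
}
```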

Confidential

Data Mining Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered a data mining architecture based on the Zachman Framework
  • Defined and documented, in Microsoft Project and Visio, the Waterfall process milestones (Define, Measure, Analyze, Improve, Control) and the Waterfall process core tasks (Project Definition, Requirements Definition, Design, Build, Test, Implement, Transition to Operations)
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed relational, denormalized and multidimensional target (consolidated data sources, analytical data, referential data, regulatory compliance data) conceptual, logical and physical data models and data structures, and re-engineered data sources, mainly with PowerDesigner and partially with ERwin (for legacy models compatibility)
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, data warehouse, ETL, data lineage and data integrity
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata, Repository Information Models, metadata collection, Business Rules collection and metadata cleansing for an Enterprise Metadata Repository with IBM InfoSphere Metadata Manager & Business Glossary
  • Designed, implemented and documented Data Profiling with IBM InfoSphere Information Analyzer
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM InfoSphere QualityStage
  • Designed, implemented, optimized and documented massive and complex ETL data processing for a time series data mining Data Warehouse and Semantic Layer with IBM InfoSphere DataStage
  • Ported from legacy C++ code and fully redeveloped in DataStage, using loop and stage variables in the DataStage Transformer Stage, Pearson's chi-square goodness-of-fit test statistic, including numerical calculation of the logarithm of the gamma function (see the sketch after this list)
  • Improved performance of the legacy interactive GUI and web service statistical application, based on the Java Persistence API (JPA) and Java Persistence Query Language (JPQL), by a factor of 17
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
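The production implementation lived in DataStage Transformer loop and stage variables; the sketch below is only an illustrative Scala restatement of the underlying math: Pearson's goodness-of-fit statistic plus a Lanczos approximation of the log-gamma function, the numerical piece needed to turn the statistic into a p-value via the regularized incomplete gamma function. The sample counts are invented.

```scala
object ChiSquareSketch {
  // Lanczos approximation (g = 7, 9 coefficients) of ln Gamma(x) for x > 0
  private val coeffs = Array(
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7)

  def logGamma(x: Double): Double =
    if (x < 0.5)
      math.log(math.Pi / math.sin(math.Pi * x)) - logGamma(1.0 - x)   // reflection formula
    else {
      val z = x - 1.0
      val t = z + 7.5                                                 // z + g + 0.5
      val s = (1 until coeffs.length).foldLeft(coeffs(0))((acc, i) => acc + coeffs(i) / (z + i))
      0.5 * math.log(2.0 * math.Pi) + (z + 0.5) * math.log(t) - t + math.log(s)
    }

  // Pearson's goodness-of-fit statistic: sum over categories of (observed - expected)^2 / expected
  def chiSquareStatistic(observed: Seq[Double], expected: Seq[Double]): Double =
    observed.zip(expected).map { case (o, e) => (o - e) * (o - e) / e }.sum

  def main(args: Array[String]): Unit = {
    val observed = Seq(48.0, 35.0, 15.0, 2.0)    // illustrative counts
    val expected = Seq(50.0, 30.0, 15.0, 5.0)    // counts expected under the null hypothesis
    val x2 = chiSquareStatistic(observed, expected)
    val df = observed.size - 1
    // The p-value is the regularized upper incomplete gamma Q(df/2, x2/2); its series and
    // continued-fraction evaluations both rely on logGamma, shown above
    println(f"chi-square = $x2%.4f, df = $df, logGamma(df/2) = ${logGamma(df / 2.0)}%.4f")
  }
}
```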

Confidential, Dallas, TX

Master Data Management Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an enterprise architecture based on the Zachman Framework and IBM MDM for a Mortgage and Investment Enterprise Data Warehouse and Semantic Layer
  • Defined and documented, in Microsoft Project and Visio, the Waterfall process milestones (Define, Measure, Analyze, Improve, Control) and the Waterfall process core tasks (Project Definition, Requirements Definition, Design, Build, Test, Implement, Transition to Operations)
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed relational, denormalized and multidimensional target (consolidated data sources, analytical data, referential data, regulatory compliance data) conceptual, logical and physical data models and data structures on Oracle Exadata, and re-engineered data sources, mainly with PowerDesigner and partially with ERwin (for legacy models compatibility)
  • Constructed a Pugh Matrix and directed the ETL, metadata, data profiling and data quality tool selection (see the weighted-scoring sketch after this list)
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, data warehouse, ETL, data lineage and data integrity with IBM InfoSphere Metadata Manager & Business Glossary
  • Installed, Initially Populated and Administered IBM InfoSphere MDM Server, managed Data Associations, applied Rules of Visibility, configured External Validation, customized Code Tables and Error Messages for Party, Account and Product domains
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata
  • Designed, implemented and documented Data Profiling with IBM InfoSphere Information Analyzer
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM InfoSphere QualityStage
  • Designed, implemented, optimized and documented massive and complex ETL data processing for the Party, Account and Product domains with IBM InfoSphere DataStage v8.1, Cognos and Oracle Exadata
  • Designed, implemented, optimized and documented MDM transactional services (web services) with WSDL, XML, IBM WebSphere, IBM Rational RSA and RAD, Java EE (J2EE), EJB, JPA (Java Persistence API), JPQL and JDBC
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
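A small Scala sketch of the weighted Pugh-matrix arithmetic behind the tool selection referenced above (the weighted-scoring sketch); the criteria, weights, candidates and scores are invented for illustration, not the actual evaluation.

```scala
object PughMatrixSketch {
  // Each candidate is scored -1 / 0 / +1 per criterion relative to the baseline (datum) tool
  case class Candidate(name: String, scores: Map[String, Int])

  val weights = Map(                      // criterion -> importance weight (illustrative)
    "metadata lineage" -> 5, "data profiling" -> 4,
    "data quality"     -> 4, "licensing cost" -> 2)

  def weightedTotal(c: Candidate): Int =
    weights.map { case (criterion, w) => w * c.scores.getOrElse(criterion, 0) }.sum

  def main(args: Array[String]): Unit = {
    val candidates = Seq(
      Candidate("Tool A", Map("metadata lineage" -> 1,  "data profiling" -> 0,
                              "data quality"     -> 1,  "licensing cost" -> -1)),
      Candidate("Tool B", Map("metadata lineage" -> 0,  "data profiling" -> 1,
                              "data quality"     -> -1, "licensing cost" -> 1)))

    // Rank candidates by weighted total; a positive total means better than the baseline overall
    candidates.sortBy(c => -weightedTotal(c))
      .foreach(c => println(f"${c.name}%-8s weighted total vs. baseline: ${weightedTotal(c)}%+d"))
  }
}
```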

Confidential

Reporting Data Warehouse Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, directed the Semantic Layer and Reporting ETL strategy, ETL architecture, ETL tool selection, ETL best-practice reviews and the implementation of multiple large ETL projects
  • Defined and documented, in Primavera and Visio, the Waterfall-process Legacy-to-SAP Interfaces program and projects for integration of several major legacy applications (Ventyx PassPort, Primavera and Tempus) with SAP FI/CO
  • Defined and documented, in Primavera and Visio, the Waterfall-process Enterprise Reporting Data Warehouse and Semantic Layer for regulatory compliance, operational and analytical reporting from SAP FI/CO
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed logical and physical models for Kimball's bottom-up and Inmon's top-down approaches to relational, denormalized and multidimensional target (consolidated data sources, analytical data, referential data, regulatory compliance data) conceptual, logical and physical data models and structures, and re-engineered data sources with ERwin
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, interfaces, data warehouse, ETL, data lineage and data integrity
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata, Repository Information Models, metadata collection, Business Rules collection and metadata cleansing for an Enterprise Metadata Repository with Informatica Metadata Manager & Business Glossary
  • Designed, implemented and documented Data Profiling with Informatica Data Explorer
  • Designed, implemented and documented Data Quality, Cleansing and Enriching with IDQ - Informatica Data Quality
  • Designed, implemented, optimized and documented massive and complex ETL data processing for Interfaces, Data Warehouse and Semantic Layer with Cognos, Informatica PowerCenter, and Informatica for SAP FI/CO (ALE, IDoc, BAPI, RFC)
  • Achieved daily SAP FI/CO SCORES reporting, an improvement from more than two months of manual reconciliation
  • Designed and built comparative data structures for real time simulation and support for corporate GAAP to IFRS transition in the SAP FI/CO, Ventyx PassPort, Primavera and Tempus ERP environment
  • Developed and documented test scripts for planning and documenting Unit, Implementation, System Regression and Acceptance testing scenarios with Unix shell scripts
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives

Confidential

Investment Data Marts BI Consultant

Responsibilities:

  • Directed corporate ETL strategy and methodology, ETL architecture, ETL tool selection, ETL best-practice reviews and the implementation of multiple large ETL projects
  • Defined and documented, in Microsoft Project and Visio, the RBC Investments Data Mart Program milestones for the integration of five major RBC Investments data marts: Clients, Accounts, Portfolio Mix, Trade Channels and Transactions
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed conceptual, logical and physical data models for the RBC Investments Data Marts data structures with ERwin
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, data warehouse, ETL, data lineage and data integrity
  • Designed, identified, built and documented metadata collection, Business Rules collection and metadata cleansing with IBM Ascential MetaStage
  • Designed, implemented and documented Data Profiling with IBM Ascential ProfileStage
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM Ascential QualityStage
  • Designed, implemented, optimized and documented massive and complex ETL data processing for the RBC Investments Data Marts and Semantic Layer with IBM Ascential DataStage Parallel and Server jobs and JCL
  • Developed and implemented ETL data transformations and custom ETL components for NCR Teradata with the DataStage Teradata stage
  • Developed and documented test scripts for planning and documenting Unit, Implementation, System Regression and Acceptance testing scenarios with over 50,000 lines of JCL
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
