
Investment Data Marts BI Consultant Resume

SUMMARY:

  • A Big Data Analytics developer with security clearance and over 20 years of experience delivering feasible cross-platform Big Data Analytics, Semantic Layer, MDM, ETL and Metadata Management solutions, and steering and managing Big Data Analytics, EDW and BI projects over extremely large and complex data vaults, primarily for major financial, telecommunications, public-sector and energy corporations. Experienced in Hadoop, HDFS, Spark Streaming, Spark MLlib Machine Learning Library, Hive, HiveQL, HBase, Impala, Cassandra, NoSQL, Scala, Python, R, Java, Sqoop, Oozie, ERwin, PowerDesigner, ETL, InfoSphere DataStage, QualityStage, MDM, DB2 z/OS, DB2 LUW, DB2 IDAA, Oracle, Teradata and Cognos
  • As an HBase Developer, led a Big Data Analytics team and delivered key analytics solutions for the CRA (Canada Revenue Agency) in a Hadoop, HDFS, Spark Streaming, MapReduce, Spark MLlib Machine Learning Library, Hive, HiveQL, HBase, Impala, Cassandra, NoSQL, Kafka, Scala, Python, R, Java, Sqoop, Oozie, DB2 z/OS, DB2 LUW and DB2 IDAA environment

PROFESSIONAL EXPERIENCE:

Confidential

Investment Data Marts BI Consultant

Responsibilities:

  • Led and managed the CRA (Canada Revenue Agency) HBase Big Data Analytics development team
  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an HBase Big Data Analytics architecture based on the Zachman Framework
  • Participated in the Information Management team's planning process to keep projects aligned with the enterprise Big Data Analytics systems roadmap
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams to provide advisory services on Big Data Analytics management tools and procedures
  • Developed and managed the HBase data structure hierarchy of table regions, column families, rows, column qualifiers and cells as a conceptual and physical key-value multidimensional map model, improving partitioning performance with a bucketing concept
  • Designed, identified, built and documented Apache Impala Metastore service in Apache Derby
  • Prepared, installed and configured distributed HBase with RegionServers, ZooKeeper QuorumPeers and backup HMaster servers on multiple servers in the cluster
  • Designed and developed collection of relevant social media data from JSON objects into HBase data structures via REST API web services
  • Set up Impala as a bridge over the underlying HBase data structures for data exchange and integration with DB2
  • Used Apache Phoenix as an SQL-to-HBase-scans bridge that produces regular JDBC result sets for data exchange and integration with legacy BI tools
  • Designed, implemented, optimized and documented NoSQL HBase Big Data ingestion with MapReduce, Spark Streaming, Scala, Python, PySpark, Java, Sqoop, Oozie, NiFi, Phoenix and HiveQL
  • Developed efficient, optimized Scala and Python code that tames Spark Streaming complexity within the broader Hadoop and HBase data management platform to connect, ingest, govern, secure and manage data for Spark Machine Learning models and near real-time analytics
  • Developed Scala and Python coding best practices for HBase Big Data ingestion with Spark Streaming as reusable code standards for coaching less experienced developers
  • Designed, implemented, optimized and documented an asynchronously triggered Big Data Analytics data pipeline between OMEGAMON Performance Expert (OMPE), DB2 z/OS, Hadoop and HBase using the Apache Kafka publish/consume streaming platform, RESTful API web services, Scala and Python (see the ingestion sketch after this list)
  • Designed, implemented and documented Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms for CRA tax fraud and anomaly detection with Scala, Python and R
  • Designed and developed near real-time updates of the Spark Machine Learning Classification, Clustering and Collaborative Filtering tax fraud and anomaly detection models with relevant social media data from the HBase datastore, using Spark Streaming with Scala, Python and R (see the clustering sketch after this list)
  • Developed coding best practices for Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms as reusable Scala and Python code standards for coaching less experienced developers
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including the Git version control system, application patching, utilization monitoring and upgrade recommendations
  • Managed and coached data architects, data modelers, developers, analysts, managers and executives
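A minimal Scala sketch of the Kafka-to-HBase ingestion path referenced above (the ingestion sketch). The broker address, topic, group id, table name and column family are illustrative assumptions, not the CRA configuration; the calls are the standard spark-streaming-kafka-0-10 and HBase client APIs.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToHBaseIngest {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("KafkaToHBaseIngest"), Seconds(10))

    // Kafka consumer settings; broker, group id and topic are placeholders
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "ingest-demo",
      "auto.offset.reset"  -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Write each partition's records to HBase (table "events", column family "d" assumed)
    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("events"))
        records.foreach { rec =>
          val put = new Put(Bytes.toBytes(rec.key))   // Kafka message key (assumed non-null) as row key
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(rec.value))
          table.put(put)
        }
        table.close(); conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Writing inside foreachPartition keeps one HBase connection per task rather than one per record, which is the usual way to keep micro-batch ingestion cheap.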
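And a hedged sketch of clustering-based anomaly scoring of the kind referenced above (the clustering sketch), using the DataFrame-based Spark MLlib API; the input path, feature columns, cluster count and distance-based scoring rule are illustrative assumptions rather than the production models.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object TaxAnomalySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("TaxAnomalySketch").getOrCreate()
    import spark.implicits._

    // Hypothetical curated feature table; the real features would come from the HBase/Hive layer
    val returns = spark.read.parquet("/data/returns_features")

    val assembler = new VectorAssembler()
      .setInputCols(Array("reported_income", "deductions", "refund_amount"))
      .setOutputCol("features")
    val features = assembler.transform(returns)

    // Cluster the population, then flag records far from their cluster centre as anomalies
    val model   = new KMeans().setK(8).setSeed(42L).setFeaturesCol("features").fit(features)
    val centers = model.clusterCenters
    val distToCenter = udf { (v: Vector, cluster: Int) => Vectors.sqdist(v, centers(cluster)) }

    val scored = model.transform(features)
      .withColumn("anomaly_score", distToCenter($"features", $"prediction"))

    scored.orderBy($"anomaly_score".desc).show(20)   // review the most unusual returns first
    spark.stop()
  }
}
```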

Confidential

HBase Developer

Responsibilities:

  • Led and managed the Confidential HBase Big Data Analytics development team
  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an HBase Big Data Analytics architecture based on the Zachman Framework
  • Participated in the Information Management team's planning process to keep projects aligned with the enterprise Big Data Analytics systems roadmap
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams to provide advisory services on Big Data Analytics management tools and procedures
  • Developed and managed the HBase data structure hierarchy of table regions, column families, rows, column qualifiers and cells as a conceptual and physical key-value multidimensional map model, improving partitioning performance with a bucketing concept (see the table-design sketch after this list)
  • Designed, identified, built and documented Apache Impala Metastore service in Apache Derby
  • Prepared, installed and configured distributed HBase with RegionServers, ZooKeeper QuorumPeers and backup HMaster servers on multiple servers in the cluster
  • Designed and developed collection of relevant social media data from JSON objects into HBase data structures via REST API web services
  • Designed, implemented, optimized and documented NoSQL HBase Big Data ingestion with MapReduce, Spark Streaming, Scala, Python, PySpark, Java, Sqoop, Oozie, NiFi, Phoenix and HiveQL
  • Developed efficient, optimized Scala and Python code that tames Spark Streaming complexity within the broader Hadoop and HBase data management platform to connect, ingest, govern, secure and manage data for Spark Machine Learning models and near real-time analytics
  • Developed Scala and Python coding best practices for HBase Big Data ingestion with Spark Streaming as reusable code standards for coaching less experienced developers
  • Designed, implemented and documented Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms for the Predict Store Sales models with Scala, Python and R
  • Designed and developed near real-time updates of the Spark Machine Learning Classification, Clustering and Collaborative Filtering Predict Store Sales models with relevant social media data from the HBase datastore, using Spark Streaming with Scala, Python and R
  • Developed coding best practices for Apache Spark MLlib Machine Learning Library scenarios for Classification, Clustering and Collaborative Filtering algorithms as reusable Scala and Python code standards for coaching less experienced developers
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including the Git version control system, application patching, utilization monitoring and upgrade recommendations
  • Managed and coached data architects, data modelers, developers, analysts, managers and executives
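A hedged sketch of the pre-split, salted table layout behind the bucketing concept referenced above (the table-design sketch), using the HBase 2.x admin API; the table name, column families and bucket count are illustrative assumptions.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ColumnFamilyDescriptorBuilder, ConnectionFactory, TableDescriptorBuilder}
import org.apache.hadoop.hbase.util.Bytes

object CreateSalesTable {
  val Buckets = 16

  // Prefix the natural key with a two-digit bucket id so writes spread evenly across regions
  def saltedRowKey(naturalKey: String): String =
    f"${math.abs(naturalKey.hashCode) % Buckets}%02d-$naturalKey"

  def main(args: Array[String]): Unit = {
    val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val admin = conn.getAdmin

    // One region per bucket: split points "01-", "02-", ..., "15-"
    val splits: Array[Array[Byte]] = (1 until Buckets).map(b => Bytes.toBytes(f"$b%02d-")).toArray

    val table = TableDescriptorBuilder.newBuilder(TableName.valueOf("store_sales"))
      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("txn"))   // transaction detail cells
      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("agg"))   // pre-computed aggregate cells
      .build()

    admin.createTable(table, splits)
    admin.close(); conn.close()
  }
}
```

Salting the row key this way trades ordered scans for even write distribution across RegionServers, which is what the bucketing improvement above aims at.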

Confidential

Master Data Management Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an enterprise MDM architecture based on the Zachman Framework
  • Defined and documented, in Microsoft Project and Visio, the Waterfall process milestones (Define, Measure, Analyze, Improve, Control) and the Waterfall process core tasks (Project Definition, Requirements Definition, Design, Build, Test, Implement, Transition to Operations)
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata with IBM InfoSphere Metadata Manager & Business Glossary
  • Installed, Initially Populated and Administered IBM InfoSphere MDM Server, managed Data Associations, applied Rules of Visibility, configured External Validation, customized Code Tables and Error Messages for Party, Account and Product domains
  • Designed relational, denormalized and multidimensional (consolidated data sources, analytical data, referential data, regulatory compliance data) MDM DB2 target conceptual, logical and physical data models and data structures, and re-engineered data sources with ERwin
  • Designed, implemented and documented Data Profiling with IBM InfoSphere Information Analyzer
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM InfoSphere QualityStage
  • Designed and documented best practices and a Change Data Capture (CDC) prototype for the MDM initial (bulk) and daily (delta) loads
  • Designed, implemented, optimized and documented massive and complex ETL data processing for the OCIF, AML and other transactional data sources' initial loads and daily feeds for the Party, Account and Product domains with IBM CDC for z/OS, IBM InfoSphere DataStage and the CDC Stage
  • Designed, implemented, optimized and documented MDM transactional services (web services) with WSDL, XML, IBM WebSphere, IBM Rational RSA and RAD, Java EE (J2EE), EJB, JPA (Java Persistence API), JPQL and JDBC (see the JDBC sketch after this list)
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
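A minimal Scala sketch of the JDBC read path referenced above, against DB2; the JDBC URL, credentials, table and column names are placeholders and not the IBM InfoSphere MDM Server schema, and the IBM Data Server JDBC driver is assumed to be on the classpath.

```scala
import java.sql.DriverManager

object PartyLookupSketch {
  def main(args: Array[String]): Unit = {
    // DB2 JDBC URL format: jdbc:db2://<host>:<port>/<database>; host, port and credentials are placeholders
    val conn = DriverManager.getConnection("jdbc:db2://db2host:50000/MDMDB", "mdmuser", "secret")
    try {
      // Hypothetical party table and columns; the real MDM Server schema differs
      val stmt = conn.prepareStatement(
        "SELECT PARTY_ID, LAST_NAME, PROV_STATE FROM PARTY WHERE LAST_NAME = ? ORDER BY PARTY_ID")
      stmt.setString(1, "Smith")
      val rs = stmt.executeQuery()
      while (rs.next())
        println(s"${rs.getLong("PARTY_ID")}  ${rs.getString("LAST_NAME")}  ${rs.getString("PROV_STATE")}")
      rs.close(); stmt.close()
    } finally conn.close()
  }
}
```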

Confidential

Data Mining Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered a data mining architecture based on the Zachman Framework
  • Defined and documented, in Microsoft Project and Visio, the Waterfall process milestones (Define, Measure, Analyze, Improve, Control) and the Waterfall process core tasks (Project Definition, Requirements Definition, Design, Build, Test, Implement, Transition to Operations)
  • Worked closely with the Data Governance team to enforce data standards and reviewed the architectural artifacts for projects
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed relational, denormalized and multidimensional target (consolidated data sources, analytical data, referential data, regulatory compliance data) conceptual, logical and physical data models and data structures, and re-engineered data sources, mainly with PowerDesigner and partially with ERwin (for legacy models compatibility)
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, data warehouse, ETL, data lineage and data integrity
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata, Repository Information Models, metadata collection, Business Rules collection and metadata cleansing for an Enterprise Metadata Repository with IBM InfoSphere Metadata Manager & Business Glossary
  • Designed, implemented and documented Data Profiling with IBM InfoSphere Information Analyzer
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM InfoSphere QualityStage
  • Designed, implemented, optimized and documented massive and complex ETL data processing for a time series data mining Data Warehouse and Semantic Layer with IBM InfoSphere DataStage
  • Ported from legacy C++ code and fully redeveloped in DataStage, using loop and stage variables in the DataStage Transformer Stage, Pearson's chi-square goodness-of-fit test statistic, including numerical calculation of the logarithm of the gamma function (see the sketch after this list)
  • Improved performance of the legacy interactive GUI and web service statistical application, based on the Java Persistence API (JPA) and Java Persistence Query Language (JPQL), by a factor of 17
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
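The production implementation lived in DataStage Transformer loop and stage variables; the sketch below is only an illustrative Scala restatement of the underlying math: Pearson's goodness-of-fit statistic plus a Lanczos approximation of the log-gamma function, the numerical piece needed to turn the statistic into a p-value via the regularized incomplete gamma function. The sample counts are invented.

```scala
object ChiSquareSketch {
  // Lanczos approximation (g = 7, 9 coefficients) of ln Gamma(x) for x > 0
  private val coeffs = Array(
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7)

  def logGamma(x: Double): Double =
    if (x < 0.5)
      math.log(math.Pi / math.sin(math.Pi * x)) - logGamma(1.0 - x)   // reflection formula
    else {
      val z = x - 1.0
      val t = z + 7.5                                                 // z + g + 0.5
      val s = (1 until coeffs.length).foldLeft(coeffs(0))((acc, i) => acc + coeffs(i) / (z + i))
      0.5 * math.log(2.0 * math.Pi) + (z + 0.5) * math.log(t) - t + math.log(s)
    }

  // Pearson's goodness-of-fit statistic: sum over categories of (observed - expected)^2 / expected
  def chiSquareStatistic(observed: Seq[Double], expected: Seq[Double]): Double =
    observed.zip(expected).map { case (o, e) => (o - e) * (o - e) / e }.sum

  def main(args: Array[String]): Unit = {
    val observed = Seq(48.0, 35.0, 15.0, 2.0)    // illustrative counts
    val expected = Seq(50.0, 30.0, 15.0, 5.0)    // counts expected under the null hypothesis
    val x2 = chiSquareStatistic(observed, expected)
    val df = observed.size - 1
    // The p-value is the regularized upper incomplete gamma Q(df/2, x2/2); its series and
    // continued-fraction evaluations both rely on logGamma, shown above
    println(f"chi-square = $x2%.4f, df = $df, logGamma(df/2) = ${logGamma(df / 2.0)}%.4f")
  }
}
```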

Confidential, Dallas, TX

Master Data Management Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, delivered an enterprise architecture based on the Zachman Framework and IBM MDM for a Mortgage and Investment Enterprise Data Warehouse and Semantic Layer
  • Defined and documented, in Microsoft Project and Visio, the Waterfall process milestones (Define, Measure, Analyze, Improve, Control) and the Waterfall process core tasks (Project Definition, Requirements Definition, Design, Build, Test, Implement, Transition to Operations)
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed relational, denormalized and multidimensional target (consolidated data sources, analytical data, referential data, regulatory compliance data) conceptual, logical and physical data models and data structures on Oracle Exadata, and re-engineered data sources, mainly with PowerDesigner and partially with ERwin (for legacy models compatibility)
  • Constructed a Pugh Matrix and directed the ETL, metadata, data profiling and data quality tool selection (see the weighted-scoring sketch after this list)
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, data warehouse, ETL, data lineage and data integrity with IBM InfoSphere Metadata Manager & Business Glossary
  • Installed, Initially Populated and Administered IBM InfoSphere MDM Server, managed Data Associations, applied Rules of Visibility, configured External Validation, customized Code Tables and Error Messages for Party, Account and Product domains
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata
  • Designed, implemented and documented Data Profiling with IBM InfoSphere Information Analyzer
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM InfoSphere QualityStage
  • Designed, implemented, optimized and documented massive and complex ETL data processing for the Party, Account and Product domains with IBM InfoSphere DataStage v8.1, Cognos and Oracle Exadata
  • Designed, implemented, optimized and documented MDM transactional services (web services) with WSDL, XML, IBM WebSphere, IBM Rational RSA and RAD, Java EE (J2EE), EJB, JPA (Java Persistence API), JPQL and JDBC
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
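A small Scala sketch of the weighted Pugh-matrix arithmetic behind the tool selection referenced above (the weighted-scoring sketch); the criteria, weights, candidates and scores are invented for illustration, not the actual evaluation.

```scala
object PughMatrixSketch {
  // Each candidate is scored -1 / 0 / +1 per criterion relative to the baseline (datum) tool
  case class Candidate(name: String, scores: Map[String, Int])

  val weights = Map(                      // criterion -> importance weight (illustrative)
    "metadata lineage" -> 5, "data profiling" -> 4,
    "data quality"     -> 4, "licensing cost" -> 2)

  def weightedTotal(c: Candidate): Int =
    weights.map { case (criterion, w) => w * c.scores.getOrElse(criterion, 0) }.sum

  def main(args: Array[String]): Unit = {
    val candidates = Seq(
      Candidate("Tool A", Map("metadata lineage" -> 1,  "data profiling" -> 0,
                              "data quality"     -> 1,  "licensing cost" -> -1)),
      Candidate("Tool B", Map("metadata lineage" -> 0,  "data profiling" -> 1,
                              "data quality"     -> -1, "licensing cost" -> 1)))

    // Rank candidates by weighted total; a positive total means better than the baseline overall
    candidates.sortBy(c => -weightedTotal(c))
      .foreach(c => println(f"${c.name}%-8s weighted total vs. baseline: ${weightedTotal(c)}%+d"))
  }
}
```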

Confidential

Reporting Data Warehouse Consultant

Responsibilities:

  • Using the ITIL and Waterfall enterprise software development lifecycle (SDLC) methodology, directed the Semantic Layer and Reporting ETL strategy, ETL architecture, ETL tool selection, ETL best-practice reviews and the implementation of multiple large ETL projects
  • Defined and documented, in Primavera and Visio, the Waterfall-process Legacy-to-SAP Interfaces program and projects for integration of several major legacy applications (Ventyx PassPort, Primavera and Tempus) with SAP FI/CO
  • Defined and documented, in Primavera and Visio, the Waterfall-process Enterprise Reporting Data Warehouse and Semantic Layer for regulatory compliance, operational and analytical reporting from SAP FI/CO
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed logical and physical models for Kimball's bottom-up and Inmon's top-down approaches to relational, denormalized and multidimensional target (consolidated data sources, analytical data, referential data, regulatory compliance data) conceptual, logical and physical data models and structures, and re-engineered data sources with ERwin
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, interfaces, data warehouse, ETL, data lineage and data integrity
  • Designed, identified, built and documented Structural (Technical), Business and Operational Metadata, Repository Information Models, metadata collection, Business Rules collection and metadata cleansing for an Enterprise Metadata Repository with Informatica Metadata Manager & Business Glossary
  • Designed, implemented and documented Data Profiling with Informatica Data Explorer
  • Designed, implemented and documented Data Quality, Cleansing and Enriching with IDQ - Informatica Data Quality
  • Designed, implemented, optimized and documented massive and complex ETL data processing for Interfaces, Data Warehouse and Semantic Layer with Cognos, Informatica PowerCenter, and Informatica for SAP FI/CO (ALE, IDoc, BAPI, RFC)
  • Achieved daily SAP FI/CO SCORES reporting, an improvement from more than two months of manual reconciliation
  • Designed and built comparative data structures for real time simulation and support for corporate GAAP to IFRS transition in the SAP FI/CO, Ventyx PassPort, Primavera and Tempus ERP environment
  • Developed and documented test scripts for planning and documenting Unit, Implementation, System Regression and Acceptance testing scenarios with Unix shell scripts
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives

Confidential

Investment Data Marts BI Consultant

Responsibilities:

  • Directed corporate ETL strategy and methodology, ETL architecture, ETL tool selection, ETL best-practice reviews and the implementation of multiple large ETL projects
  • Defined and documented, in Microsoft Project and Visio, the RBC Investments Data Mart Program milestones for the integration of five major RBC Investments data marts: Clients, Accounts, Portfolio Mix, Trade Channels and Transactions
  • Worked closely with technical and business teams and provided advisory services on data management tools and procedures, supporting mission-critical applications in a highly integrated environment
  • Designed conceptual, logical and physical data models for the RBC Investments Data Marts data structures with ERwin
  • Designed and documented data and metadata management optimization and best practices for data profiling, data quality, data warehouse, ETL, data lineage and data integrity
  • Designed, identified, built and documented metadata collection, Business Rules collection and metadata cleansing with IBM Ascential MetaStage
  • Designed, implemented and documented Data Profiling with IBM Ascential ProfileStage
  • Designed, implemented and documented Data Quality, Standardization and Matching with IBM Ascential QualityStage
  • Designed, implemented, optimized and documented massive and complex ETL data processing for the RBC Investments Data Marts and Semantic Layer with IBM Ascential DataStage Parallel and Server jobs and JCL
  • Developed and implemented ETL data transformations and custom ETL components for NCR Teradata with the DataStage Teradata stage
  • Developed and documented test scripts for planning and documenting Unit, Implementation, System Regression and Acceptance testing scenarios with over 50,000 lines of JCL
  • Performed build activities to prepare packages for release to upstream environments for all components of the solution; established, supported and maintained application environments, including application patching, utilization monitoring and upgrade recommendations to meet changing reporting demands
  • Managed and Coached developers, analysts, managers, and executives
