Metadata Management And Data Quality Consultant Resume

SUMMARY:

  • Accomplished data architect, data management engineer and NLP data scientist who helps companies increase efficiency and reduce risk
  • Well-versed in all major aspects of data governance (master data management, data quality, metadata management, data modeling); a “universal soldier” when it comes to data management and governance needs
  • Experienced in all phases of the SDLC and in agile development methodologies

PROFESSIONAL EXPERIENCE:

Confidential, New York, NY

Metadata Management and Data Quality Consultant

  • Led the CCAR CFO attestation process automation by:
      • On-boarding necessary technical and business metadata onto the proprietary metadata platform
      • Implementing DQ rules on the same platform and working with the DQ reporting team to provide valuable insights on data quality to customers
      • Implementing the feed from Collibra Data Governance Center (in Python via REST API) to acquire business metadata to be mapped to technical assets and to be used in DQ rules
  • Assisted product management and multiple internal teams with improving the metadata and DQ platform

Confidential, New York, NY

Data Quality and Governance Consultant

  • Designed and implemented a practical data quality assurance framework using Informatica Data Quality (IDQ), SQL and other tools; outlined the direction for further development
  • Applied the new DQ framework to the ongoing MDM initiative and tested the process integration using several initial releases of the new MDM solution managed by Confidential
  • Defined the long-term metadata management requirements for the new DQ framework and started to collaborate with the Enterprise Data Governance group to onboard the necessary metadata onto Collibra Data Governance Center and onto Informatica Metadata Manager
  • Worked with the enterprise IT department, the division IT department and various business groups on adopting the new DQ framework and developing other data governance standards

Confidential, Jersey City, NJ

Lead Developer and NLP Data Scientist

  • Created the system development and operations frameworks for a private investment fund using Java, Python, PostgreSQL, Neo4j and other technologies:
      • Java was used in Eclipse IDE with GitHub and Maven, utilizing core libraries like JDBC, JAXB, JavaFX, Apache HTTP Client and FTP Commons, and Jersey Client
      • Python was used in Spyder IDE and in Jupyter Notebook (Anaconda distribution)
  • Built the business data foundation by designing and implementing:
      • Data acquisition from REST services, FTP sites and other sources using Java
      • Merging the data from multiple sources and data quality control using SQL and PL/pgSQL
  • Enriched the business data foundation by designing and implementing natural language processing (NLP) capabilities:
      • Information extraction from text using Stanford Core NLP and Neo4j via Java APIs
      • Text sentiment analysis using Apache OpenNLP Categorizer with existing training data sets
      • More specialized text classification using Apache OpenNLP Categorizer and Stanford Classifier, applying supervised machine learning techniques to create maximum entropy classification models
  • Increased the competitive differentiation of the firm by automating analytics using Python (pandas, Matplotlib, SciPy), SQL and PL/pgSQL
  • Automated operational asset management decision support using SQL, PL/pgSQL, Windows batch and PowerShell scripting

Confidential, New York, NY

Lead Information Architect and NLP System Architect

  • Increased business efficiency and reduced risk by architecting, developing and maintaining the Data Governance Center platform, which provided the full catalog and the meaning of enterprise data assets and pointed to the locations of these assets. Specific responsibilities included:
      • Participating in product visioning, marketing and planning
      • Business process/application integration and gaining tactical sponsorship
      • Content modeling to support the needs of data governance and to align it with enterprise knowledge management
      • Developing the content management platform using Collibra
      • Developing the reporting platform using Neo4j graph database
      • User training, assisting knowledge engineers with content entry and quality control
  • Reduced ongoing data management costs by serving as a system architect on a natural language processing (NLP) project that automated financial data collection from PDF documents using Java, Rosoka, Apache PDFBox and Tesseract OCR
  • Discovered opportunities for breakthrough achievements for the business by doing research in the areas of:
      • Physical metadata management and reporting
      • Building semantically advanced applications using Semantic Web technologies
      • Automated data quality monitoring and management
  • Reduced information-related operational costs and risks of the enterprise by:
      • Participating in defining the enterprise master data management (MDM) strategy
      • Contributing to the development of the MDM solutions as an information architect and as a solutions architect
      • Working on application development projects as a “steward” of the MDM solutions, ensuring their long-term viability
  • Strengthened enterprise data architecture and increased business relevance of solutions by serving as an enterprise data modeler on multiple projects, designing information schemas of various kinds, including relational and XML schemas

Confidential, Parsippany, NJ

DBA / ETL Developer

  • Supported several major business areas including Contract Administration, Supply Chain and Sales Operations by integrating systems using Informatica PowerCenter and other tools
  • Contributed to the development of the business by serving as a data modeler and as a data analyst on critical business initiatives, e.g. Sales Reporting
  • Provided a solid foundation for business operations by administering the company's ETL platform and its database platforms, based on MS SQL Server and Oracle, in Solaris, Linux and Windows environments
