
Data Scientist Resume



Primary Languages: Python, SQL

Secondary Languages: R, Java, Tableau

Data Engineering: HDFS, Spark, Neo4j, MongoDB, JSON, APIs, Web Scraping

Python Packages: Scikit-learn, NumPy, Pandas, TensorFlow, Matplotlib, Plotly, Seaborn

Machine Learning: K-Nearest Neighbors, Support Vector Machines, Principal Component Analysis, DBSCAN (density-based clustering), Latent Dirichlet Allocation


Confidential, AI

Data Scientist


  • Developed architecture for dynamic anomaly detection and clustering in large-scale, streaming graphs
  • Developed machine-learning conversational guide based on utility of conversation
  • Incorporated features and models drawn from domain knowledge in psychotherapy, sociology, and neuropsychology
  • Processed lyrics and generated rhyming text seeds with Neo4j and Python
  • Developed Triphone2Vec rhyme and accent detection algorithm


Data Science Intern


  • Identified new market coverage possibilities for venture lending and methods for machine learning to recommend likely deals of interest to SVB staff
  • Performed graph-based analysis of venture capital ‘social networks’
  • Developed recommender system capable of predicting early stage deals of interest to critical late stage investors
  • Performed random walk clustering of investor networks and validated the results via known networks
  • Trained an ERGM (exponential random graph model) as a means of investigating investor closeness
  • Used investor participation data and a neural network to predict the likelihood of negative venture debt performance, framed as a classification problem
  • Utilized matrix decomposition for broad, sparse dataset
  • Training accuracies averaged 80%
  • Optimized sample portfolio performance via elimination of high-risk deals based on likelihood estimate and business decision-making utility
  • Test portfolio performance improved by over 10%
  • Developed query-ready graph prototype, including integrating internal and external data sources
  • Utilized DBSCAN (density-based clustering) to identify networks of high-value investors


Senior Financial Data Analyst/Data Operations Specialist


  • Sole analyst and data ops specialist for Confidential mortgage-backed securities and applications. Responsibilities included:
  • Validating new cashflow models against origination documents. Importing new model and data into Structured Finance Workstation deal library and SQL database. Maintaining relational database.
  • Providing quality assurance for monthly releases of data from the point of trustee release of data through SQL transformation to deal library and final published approval for client use
  • Client-facing communication, including providing domain expertise and customer service support
  • Extensive data munging in SQL and Excel
  • Trained and audited new hires and outsourcing team
  • Developing QA procedures and documentation for monthly data
  • Developed, wrote, and implemented QA logic, successfully automating 70% of approvals from a previously manual process
  • Oversaw 100% growth in the size of the Confidential library of deals
  • Oversaw universal data backfill of Confidential historical data
  • Designed and implemented projects to supplement and improve upon trustee property-level data, including extension of property default logic and development of key fields such as net operating income and loan-to-value ratios


Research Assistant


  • Researched, compiled, formatted, and summarized data sets based on potential for use in economics textbook, selecting for a combination of student interest and relevance to econometric study


Assurance and Advisory Intern


  • Audited Charles Schwab website transition, trading process, and authorization, including trading floor process walkthrough, for Financial Services Risk Management and Internal Audit team
