
Data Scientist Resume

Washington, DC


  • Over 5 of 11 years of professional experience as a Quantitative Analyst/Data Scientist across the Finance, Telecom, and Retail industries, with a Master of Science degree specializing in Mathematics and Quantitative Finance.
  • Proven ability to translate high-level objectives into practical analysis and deliver actionable recommendations. Record of managing complex projects and creating solutions that work. Self-directed innovator searching for challenges.
  • Proficient in Machine Learning Techniques, R, Quantlib, Python, SAS, Tableau, SQL & Advanced Excel.
  • Developed predictive models to identify the most significant behavioral patterns leading to member conversion, which increased ROI by $0.96 million per quarter.
  • Re-built the existing model and increased its accuracy from 68% to 89% using advanced statistical algorithms.
  • Proficient in managing all phases of the CRISP-DM project life cycle, including data acquisition, data cleaning, data engineering, feature scaling, feature engineering, statistical modeling (decision trees, regression models, neural networks, SVM, K-Means clustering), dimensionality reduction using Principal Component Analysis and Factor Analysis, testing and validation using ROC plots and K-fold cross-validation, and data visualization.
  • Good practical knowledge of the data analysis process in Python: importing datasets, data wrangling, exploratory data analysis, model development, and model evaluation.
  • Hands-on expertise in classification, regression, time-series data, churn prediction, home-price valuation, and exit strategies using various packages in R and Python.
  • Adept at statistical modeling, multivariate analysis, model testing, problem analysis, model comparison, and validation.
  • Skilled in data parsing, data manipulation, and data preparation, with methods including describing data contents, computing descriptive statistics, regex, split and combine, remap, merge, subset, reindex, melt, and reshape.
  • Experience in using various packages in R and libraries in Python.
  • Ability to handle multiple tasks simultaneously.
  • Proven leader with outstanding relationship building skills and strong communication abilities
  • Highly motivated team player with the ability to work cross-organizationally and meet strict deadlines.
  • Extensive experience developing mappings in Informatica, tuning them for optimal performance, and migrating objects across all environments, including DEV, QA, and PROD.
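Several bullets above reference K-fold cross-validation as a validation technique. As an illustrative sketch only (not code from any of the projects listed here), the fold partitioning it relies on can be written in plain Python:

```python
def k_fold_splits(n_samples, k):
    """Partition indices 0..n_samples-1 into k (train, test) splits.

    Each sample appears in exactly one test fold; the remaining
    indices form that fold's training set.
    """
    indices = list(range(n_samples))
    # Distribute any remainder across the first n_samples % k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    splits, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, test))
        start += size
    return splits
```

In practice a library routine (e.g. scikit-learn's `KFold`) would be used; this sketch only shows the disjoint-test-fold property that makes the validation estimate unbiased.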


Expertise: Caret, Tidyverse, MASS, ggplot2, Scikit-Learn, NumPy, SciPy, Deep Learning (RNN, CNN), TensorFlow, Keras, Matplotlib, Microsoft Visual Studio, Microsoft Office

Machine Learning Algorithms: Multinomial Regression, Logistic Regression, Decision Trees, Random Forest, K Means Clustering, Support Vector Machines, Gradient Boost Machines & XGBoost

RDBMS: SQL Server 2005/2008/2012, MySQL, Teradata

NoSQL DB: Cassandra

Frameworks: Hadoop Ecosystem, Apache Spark

Programming Languages: R, Python, Matlab, SAS, Quantlib

Platforms: RStudio, Tableau, Informatica, MicroStrategy, Toad, SAS, Eclipse, Windows, SQL Developer, Toad for Oracle, Microsoft SQL, Teradata, Hadoop


Confidential, Washington, DC

Data Scientist


  • Develop analytics libraries used for pricing and risk management.
  • Create, implement, and support quantitative models for the trading business, leveraging a wide variety of mathematical methods and tools including advanced calculus, Python, Structured Query Language (SQL), mathematical finance/programming, and statistics and probability.
  • Develop pricing models using numerical techniques for valuing interest rate swaps and other derivatives, using QuantLib Python and SAS.
  • Work in close partnership with control functions such as Legal, Compliance, Market and Credit Risk, Audit, and Finance to ensure appropriate governance and control infrastructure.
  • Perform data exploration to make the data usable for building data insights and initial hypotheses.
  • Responsible for providing reports, analysis and insightful recommendations to business leaders on key performance metrics pertaining to leg performance.
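The swap-valuation work above builds on standard discounting. As a textbook sketch (not the proprietary models referenced in this role, and using a simplifying single-curve assumption), the fixed leg of a vanilla interest rate swap and its par rate can be computed as:

```python
import math

def fixed_leg_pv(notional, fixed_rate, year_fractions, discount_factors):
    """Present value of a swap's fixed leg: sum of discounted fixed coupons."""
    return sum(notional * fixed_rate * tau * df
               for tau, df in zip(year_fractions, discount_factors))

def par_swap_rate(year_fractions, discount_factors):
    """Fixed rate equating the fixed leg's PV with the floating leg's PV,
    where the floating leg is worth notional * (1 - final discount factor)
    under single-curve assumptions."""
    annuity = sum(tau * df for tau, df in zip(year_fractions, discount_factors))
    return (1.0 - discount_factors[-1]) / annuity

# Illustrative flat 5% continuously compounded curve, annual payments, 3 years.
taus = [1.0, 1.0, 1.0]
dfs = [math.exp(-0.05 * t) for t in (1, 2, 3)]
rate = par_swap_rate(taus, dfs)
```

In production this is what a library such as QuantLib's swap engines handles, along with day-count conventions, schedules, and multi-curve discounting that this sketch omits.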

Confidential, CT

Data Scientist


  • Created a dataset from data in 7-8 different repositories using complex Teradata SQL queries.
  • Analyzed a dataset of about 1.34M records and identified trends and effective factors for data modeling.
  • Preprocessed the data and developed various visualizations using packages ggplot/choroplethr/caret in R.
  • Rebuilt the propensity model, increasing its accuracy from 68% to 89% using advanced ML algorithms.
  • The model is estimated to increase ROI, with a least estimated standard error of 2.05%.
  • Developed dashboards using Tableau and automated the process using SSIS for daily update
  • Documented the Phase 2 analysis for the project, designed the roadmap, and conducted KT sessions.
  • Built multiple models using various machine learning algorithms such as multinomial regression, decision trees, SVM, random forests, and neural networks, and selected the best-performing model using the ROC curve.
  • Used various Parameter Tuning Techniques to get better results from the model.
  • Used methods such as the univariate approach, boxplots, and Cook’s distance to find outliers.
  • Developed time series forecasting model (ARIMA) & provided strategic analysis for the business demands and supply, in which the model predicted actual demand with 92% accuracy.
  • Performed exploratory data analysis to identify trends, seasonality, outliers, etc.
  • Managed team processes and deliverables for ramp-up and ramp-down demand forecasts.
  • Responsible for providing reports, analysis and insightful recommendations to business leaders on key performance metrics pertaining to employee performance
  • Built predictive models to identify the most significant behavioral patterns that lead to employee churn
  • Created Propensity model to identify the most influential attributes contributing to Indent/Demand cancellation
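One of the bullets above mentions boxplots for outlier detection. As a minimal illustrative sketch (plain Python, not the project code), the Tukey boxplot rule flags points outside the interquartile-range fences:

```python
def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey boxplot rule)."""
    xs = sorted(values)

    def quantile(q):
        # Linear interpolation between order statistics.
        pos = q * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]
```

The companion techniques named in the bullet (a univariate screen, Cook's distance for regression influence) target different notions of "outlier"; this sketch covers only the boxplot rule.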


Data Scientist


  • Understanding business context and strategic plans and develop a data-driven business plan to support the attainment of business goals.
  • Created a dataset using complex Teradata SQL queries and identified factors driving customer calls to customer care from the customer-care text repository.
  • Data manipulation/treatment based on nature of data (for example missing value imputation, Information Value (IV), Weight of Evidence (WOE), Data profiling, correlation matrix, relative importance between predictors, variable clustering, univariate and bivariate plots, etc.)
  • Built predictive models from start to finish following the CRISP-DM life cycle (i.e., extract data, manipulate data, data profiling, build and validate the model), then deployed the model using Flask.
  • Prepared final project presentation documents summarizing the overall significance of the project in a well-defined manner.
  • Identified the most significant behavioral patterns that lead to customer churn and built an attrition model using Random Forest and SVM, with the Precision-Recall curve as the evaluation metric.
  • Data exploration to make the data useable for building data insights and initial hypothesis.
  • Used the analysis of the behavior and characteristics of terminated employees to draw out the important factors for attrition.
  • Built a model to examine the factors affecting employee termination and the reasons for termination, using various dimensionality-reduction techniques.
  • Predicted which employees are at high risk of leaving in order to maintain the talent pool and effectively retain talent at every level (High Potential and High Performer).
  • Responsible for working with stakeholders to troubleshoot issues, communicate to team members, leadership and stakeholders on findings to ensure models are well understood and optimized.
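The attrition model above is evaluated with precision and recall, which suit imbalanced churn data better than accuracy. A minimal sketch of the two metrics (illustrative only, with 1 denoting the churn class):

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = churn/attrition)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of flagged, how many churned
    recall = tp / (tp + fn) if tp + fn else 0.0     # of churners, how many caught
    return precision, recall
```

The Precision-Recall curve referenced in the bullet traces these two quantities as the model's decision threshold is swept; a library routine such as scikit-learn's `precision_recall_curve` would normally do this.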


RM-Data Support


  • Creating output to explain data analysis, data visualization, and statistical modeling results to managers.
  • Modeling survey data responses with ordinal logistic regression.
  • Experience working on clickstream activity, customer-journey analysis, fraud detection, sales, and store-item management.
  • Analyzing and visualizing user behavior migration.
  • Created mappings to load data from source and target to staging, staging to reporting tables by applying business requirements using Informatica Power Center.
  • Applying machine learning concepts to capture insights.
  • Handled importing data from various data sources, performed transformations using Hive, MapReduce, and loaded data into HDFS.
  • Providing timely, relevant, accurate reports and analysis of the organization’s performance to facilitate decision-making toward achievement of the budget and strategic plan.
  • Documented all phases of project implementation for future reference and conducted KT sessions.
