
Data Scientist Resume

Bellevue, WA


  • Overall 10+ years of IT experience in Business Analytics, Business Intelligence (BI) systems, Data Mining, Reporting, Marketing Analytics, Data Advisory & Decision Support Systems.
  • 5+ years of hands-on data science/machine learning experience across a variety of business contexts and data sources. Proficient in quickly extracting hidden insights from data and building useful models by leveraging a repertoire of diverse and deep technical skills.
  • Expert-level experience with R, Python, scikit-learn, and SQL.
  • Strong data visualization skills in communicating statistical findings to business users and rolling insights into day-to-day operations using Tableau, Lumira, and Power BI.
  • Good working knowledge of the Hadoop ecosystem.
  • Good functional knowledge of the Procurement, ITES, Manufacturing & Leasing domains.


  • Programming Languages: R, Python, scikit-learn, SQL, Base SAS, VBA
  • Tools/Machine Learning Platforms: Confidential Azure ML Studio, SAP HANA PAL
  • Machine Learning Techniques: Linear regression (LASSO, Ridge, Elastic Net), classification (logistic regression, LDA, Naïve Bayes, KNN, SVM), decision trees (XGBoost & Random Forest), PCA, association rules & recommendation engines, survival analysis. Skilled in conducting exploratory data analysis (EDA).
  • Big Data Ecosystem: Hadoop 3.0, RHadoop, H2O
  • Relational Databases: Oracle 10g, MySQL
  • Data Visualization: R ggplot2, Python Matplotlib, Tableau Desktop 8.2, SAP Lumira, Pentaho
  • Web Analytics: Google Analytics


Data Scientist

Confidential, Bellevue, WA


  • Discount Prediction: Analyzed historical data on discounts offered to cloud and non-cloud accounts. Identified key insights into how discounts vary by region, product segment, vertical, contract type, etc. Developed a predictive model in Azure ML Studio using boosted decision trees to estimate the discount that can be offered for a successful renewal, based on the key KPIs associated with each account.
  • Deal Velocity: Developed a regression model to predict the number of days for a deal to close for various classes of products, based on historical CRM data such as negotiation days, solution days, leakage, etc.
  • Transit Cancellations: Analyzed four years of data and built a model to predict cancellations and no-shows of transit bookings by employees, using multiclass random forest in Azure ML. Also derived a schedule optimization based on utilization levels of different shuttle IDs across the day to reduce wastage.
  • Time Series Forecasting: Built various univariate and multivariate time series models (ETS, STL, Naïve, Holt, Holt-Winters, ARIMA, ARIMAX, VAR) to forecast sales for various products, as well as COGS & OPEX for different organizations and teams. Automated the forecasting process in Azure ML. Also developed vendor spend, hiring, and attrition models.
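As a minimal illustration of one of the univariate techniques listed above (Holt's linear-trend exponential smoothing), here is a sketch in Python; the sales figures and smoothing parameters are invented for the example, and the actual project models were built in R and Azure ML:

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=3):
    """Forecast `horizon` steps ahead with Holt's linear-trend method."""
    # Initialize level and trend from the first two observations.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Smooth the level toward the new observation...
        level = alpha * y + (1 - alpha) * (level + trend)
        # ...and the trend toward the latest level change.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

# Illustrative (invented) monthly sales figures
sales = [112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 151.0]
forecast = holt_forecast(sales, horizon=3)
```

On perfectly linear data this method recovers the trend exactly, which is a quick sanity check before moving to ARIMA-class models for richer dynamics.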

Technologies used: R, Azure ML, SQL, Power BI

Data Scientist

Confidential, Palo Alto, CA


  • Confidential is the world’s largest business commerce network, connecting buyers and suppliers around the world over a cloud platform with products supporting the entire procurement life cycle. 20K buyers and 1.7 million suppliers transact over the network, amounting to $700 billion of commerce (around 1% of the world’s GDP).

Projects Handled & Responsibilities:

  • Renewal Subscription Predictions:
  • Statistical modelling to predict the renewal probability for various cloud products & solutions using R & Lumira. Used logistic regression, LDA, and random forest decision trees.
  • Unsupervised learning using K-means clustering to understand the behavioral differences between renewed and non-renewed accounts, and explained the insights to Account Managers.
  • Rolled out the statistical insights to field teams by incorporating the predicted scores & risk scores into sales/CRM systems.
  • Supplier Risk Identification: Developed a regression model to predict the transaction fee paid by each supplier. Based on the model predictions, at-risk suppliers were identified by the extent of the decrease in their transactions with buyers. Used PCA, LASSO & decision trees in R.
  • Extracted spend data (accounts payable) for all the Ariba buyers from different sources using VB scripts and built a Spend Insights DB in SAP HANA. Compared this with Ariba Network data and identified gaps in spend & transaction volume to be further pushed through Ariba systems.
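The K-means clustering step can be sketched with a minimal pure-Python implementation; the account features (logins per month, a spend index) and all values below are hypothetical, since the original analysis used R and Lumira on real account data:

```python
def kmeans(points, k, iters=20):
    """Plain K-means with squared Euclidean distance.

    Deterministic for illustration: the first k points seed the centers.
    """
    centers = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to its cluster's mean.
        centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical (logins_per_month, spend_index) pairs: renewed accounts are
# active and high-spend, churned accounts are not.
accounts = [(20, 9), (2, 1), (22, 8), (3, 2), (19, 10), (1, 1)]
centers, clusters = kmeans(accounts, k=2)
```

With k=2 the two resulting centers summarize the "engaged" versus "disengaged" behavioral profiles, which is the kind of contrast that was explained to Account Managers.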

Technologies used: R, Python, Lumira, H2O, SAP Business Objects, Tableau Desktop.

Business Analytics Consultant

Confidential, St. Louis, MO


  • Provided customer analytics throughout the customer lifecycle phases of Acquire, Develop & Retain for prospects and customers.
  • Developed customer segmentation analyses and created campaign planning tools for use by country marketing managers.
  • Helped drive farming/hunting strategies for the inside sales teams through customer segmentation and customer repurchase probability models.
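As a sketch of what a repurchase probability model can look like, here is a tiny hand-rolled logistic regression in Python; the single feature (months since last purchase), the data, and the function names are all hypothetical, and the original models were built in R:

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit a one-feature logistic regression by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
            grad_w += (p - y) * x                     # cross-entropy gradient
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def repurchase_probability(w, b, x):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Invented training data: months since last purchase vs. whether the
# customer repurchased within the next quarter.
months = [1.0, 2.0, 2.0, 3.0, 8.0, 9.0, 10.0, 12.0]
repurchased = [1, 1, 1, 1, 0, 0, 0, 0]
w, b = fit_logistic(months, repurchased)
```

The fitted weight comes out negative, capturing the intuition that longer gaps since the last purchase lower the repurchase odds; the resulting scores can then feed segmentation for farming/hunting prioritization.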

Technologies used: R, Oracle 10g, Confidential Dynamics CRM

Data Analyst


  • Worked on the Weekly Spend Tracker project. Responsibilities included cleaning, aggregating, analyzing, and interpreting data; carrying out quality analysis; and preparing week-over-week growth analysis reports for retail product segments using advanced statistical techniques (cluster, factor & tree analysis) with Base SAS, SQL, and SAS/STAT.


SAS Consultant, Chennai, India & Frankfurt, Germany


  • Worked for the German leasing giant Deutsche Leasing. The leasing application handles financial leasing, service contract management, asset management, product management, re-financing, and collections with a unified customer master.
  • The analytics team supports decision making throughout the lifecycle of a contract or asset using the SAS products Base SAS, Macro, STAT & Connect.
  • Analyzed requirements, extracted data from various sources, and developed SAS code using macros. Produced quality customized reports, descriptive statistics, and e-mail reports through SAS programs.
  • Exposure to handling large quantities of data (over 0.5 million observations with >100 variables).

Technologies used: Base SAS 9.1, SAS Enterprise Guide, Oracle 10g
