Data Scientist Resume

Bellevue, WA


  • 5+ years of IT experience in databases, data science, machine learning algorithms, and visualization.
  • Extensive experience working in various domains like Telecom, Banking, and Automobiles.
  • Experience in exploratory data analysis (EDA) using R.
  • Experience in writing code in R and Python to manipulate data for data loads, extracts, statistical analysis, and modeling.
  • Experience in SAS.
  • Hands-on experience working with Amazon Redshift.
  • Identified areas for improvement in existing business by analyzing vast amounts of data with machine learning techniques to unearth insights.
  • Experience in using Decision Trees, k-Nearest Neighbors, Clustering (hierarchical and k-means), Genetic algorithm, and Dijkstra algorithm.
  • Designed and implemented statistical / predictive models utilizing diverse sources of data to predict demand, risk, and price elasticity.
  • Conducted in-depth analysis and predictive modeling to uncover hidden opportunities; communicated insights to the product, sales, and marketing teams.
  • Experience in Hadoop, MapReduce, MongoDB, and HDFS.
  • Experience in creating different types of data visualization using R, Power BI, and Tableau.
  • Experience in installing, upgrading, and configuring Microsoft SQL Server.
  • Participated in all stages of Agile Scrum project management.
  • Skilled at assessing client needs, working in groups, suggesting ideas that enhance efficiency and performance, implementing technology solutions, and training end users.


Languages: R (4yrs), Python (4yrs), SAS (4yrs), T-SQL (6yrs), SQL (6yrs), HTML (2yrs), C (6yrs), C++ (6yrs)

Tools: R Studio, Microsoft Azure, Enterprise Manager, MS SQL, SSRS, SSIS, SSAS, Business Intelligence Development Studio (BI), Visual Studio 2013, Oracle, MongoDB, Amazon Redshift

Big Data Ecosystems: Hadoop (4yrs), HDFS, MapReduce

Reporting Tools: MS Office 2003/2007, SQL Server Reporting Services, Crystal Reports, Power BI, Tableau.

Statistical Techniques: Machine learning (3yrs), Decision Trees, k-Nearest Neighbors, Clustering (hierarchical and k-means), Genetic algorithm, Dijkstra algorithm.


Confidential, Bellevue, WA

Data Scientist


  • Focused on customer segmentation through machine learning and statistical modeling, including building predictive models and generating data products to support customer segmentation.
  • Developed a pricing model for various bundled product and service offerings to optimize and predict gross margin.
  • Built a price elasticity model for various bundled product and service offerings.
  • Developed a predictive causal model using annual failure rate and standard cost basis for the new bundled service offering.
  • Designed and developed analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping to production deployment, including product recommendation and allocation planning.
  • Partnered with the sales and marketing teams and collaborated with a cross-functional team to frame and answer important data questions; prototyped and experimented with ML/DL algorithms and integrated them into the production system for different business needs.
  • Worked on multiple datasets containing 2 billion values of structured and unstructured data about web application usage and online customer surveys.
  • Hands-on experience with the Amazon Redshift platform.
  • Designed, built, and deployed a set of Python modeling APIs for customer analytics, which integrate multiple machine learning techniques for various user behavior predictions and support multiple marketing segmentation programs.
  • Segmented the customers based on demographics using K-means Clustering.
  • Explored different regression and ensemble models in machine learning to perform forecasting.
  • Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user referring.
  • Designed and implemented end-to-end systems for Data Analytics and Automation, integrating custom visualization tools using R, Tableau, and Power BI.

Environment: MS SQL Server, R/R Studio, SAS, Python, Redshift, MS Excel, Power BI, Tableau, T-SQL, ETL, MS Access, XML, MS Office 2007, Outlook.

Confidential, Dallas, TX

Data Scientist


  • Developed an algorithm to analyze sentiment data and detect trends in customer usage and other services.
  • Worked on multiple datasets containing 1 billion values of structured and unstructured data.
  • Used predictive analysis to create models of customer behavior that correlate positively with historical data, and used these models to forecast future results.
  • Explored the user APIs, documented the relevant APIs for the project, and created a workflow plan.
  • Evaluated feature importance related to customer segmentation, preference, and prediction via Random Forest and Logistic Regression.
  • Analyzed historical data, documentation, supporting documentation, screen prints, and email conversations.
  • Implemented K-means and other clustering algorithms to group users according to their usage and service requirements by area.
  • Created dynamic visualizations of the clustering of users' coverage and network usage by area using R Studio.
  • Predicted user preference based on segmentation using Generalized Additive Models, combined with feature clustering, to understand non-linear patterns between user segmentation and related monthly platform usage features (time series data).
  • Conducted linear regression to predict the transaction volume, and distinguished frequent claimers based on MapReduce using Hadoop and R Studio.

Environment: R/R Studio, Python, Tableau, Hadoop, MS SQL Server 2005/2008, MS Access, MS Excel, Outlook, Power BI.

Confidential, Oklahoma

Data Analyst


  • Supported the Debt Collection Application, specifically application-related payment imports and skip tracing of consumer address information.
  • Designed and implemented end-to-end systems for Data Analytics and Automation, integrating custom visualization tools using R, Hadoop, MongoDB, Tableau, and Power BI.
  • Responsible for creating Summary reports, Sub reports, Drill Down reports, Matrix reports.
  • Developed Stored Procedures for parameterized, drill-down, and drill-through reports in SSRS.
  • Formatted reports using Global Variables, Expressions, and Functions.
  • Created different graphical reports using DAX queries for better visualization of data in Power BI.
  • Created the DTS package through the ETL process for vendors, in which records were extracted from flat file and Excel sources and loaded daily to the server.
  • Assessed detailed specifications against design requirements.
  • Served as a production support developer for applications on an as-needed basis.
  • Knowledge of financial instruments, such as posting payments to accounts after splitting them based on commission rates and bucket levels.

Environment: SQL Server 2000/2005 Enterprise Edition, SQL Enterprise Manager, R/R Studio, MS PowerPoint, MS Access 2000 & Windows 2003/2000 platform, DTS, SSIS, SSRS, Power BI.


Data Analyst


  • Analyzed and prepared data; identified patterns in the dataset by applying historical models.
  • Collaborated with Senior Data Scientists to understand the data.
  • Performed data manipulation, data preparation, normalization, and predictive modeling.
  • Improved efficiency and accuracy by evaluating the model in R.
  • Presented the existing model to stakeholders and gave insights into the model using different visualization methods in Power BI.
  • Used R and Python programming to improve the model.
  • Upgraded all models to improve the product.
  • Performed data cleaning, applying backward and forward filling methods to the dataset to handle missing values.
  • Under the supervision of a Sr. Data Scientist, performed data transformation for rescaling and normalizing variables.
  • Developed a predictive model and validated a Neural Network classification model to predict the feature label.
  • Applied boosting to the predictive model to improve its efficiency.
  • Presented dashboards to higher management for more insights using Power BI.

Environment: R/R Studio, Python, SQL Enterprise Manager, GitHub, Microsoft Power BI, Outlook.
