Data Scientist Resume

Sacramento, CA


  • 7+ years of experience in IT and 4+ years in Data Extraction, Data Modeling, Data Wrangling, Statistical Modeling, Data Mining, Machine Learning and Data Visualization.
  • Hands-on experience with Machine Learning, Regression Analysis, Clustering, Boosting, Classification, Principal Component Analysis and Data Visualization Tools.
  • Hands-on experience implementing LDA and Naive Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks and Principal Component Analysis.
  • Hands-on experience with NLP and mining of structured, semi-structured, and unstructured data.
  • Strong programming skills with in-depth knowledge of a variety of languages such as Python, R, SAS and SQL for data cleaning, data visualization, risk analysis and predictive analytics.
  • Worked with different data formats such as JSON, XML, CSV, TXT and XLS, and implemented machine learning algorithms in Python using libraries such as Pandas, NumPy, Seaborn, SciPy, Matplotlib and Scikit-learn.
  • Extensive experience in Data Visualization, including producing tables, graphs and storytelling listings using tools such as Tableau, MS Excel and Google Analytics.
  • Expertise in Excel Macros, PivotTables and other advanced functions.
  • Applies advanced statistical and predictive modeling techniques to build, maintain, and improve multiple real-time decision systems. Works closely with product managers, service development managers, and the product development team to productize the algorithms developed.
  • 4 years of experience in Automation Testing using HP QTP (Quick Test Professional) 11.0/9.5 and UFT in development, maintenance and execution of automation scripts, creation of automation framework and standards.
  • Extensive experience with VBScript, User Defined Functions, Data Driven approach, Recovery Scenarios, Descriptive Programming, File-System-Object, Excel Object and SQL queries (for verifying the database) in scripts.


Programming Languages: Python, VB scripting, C, Java

Work Experience: Innovation, Data Mining, Feature Engineering, Inferential Analysis, Exploratory Data Analysis, Predictive Analysis.

IDE and OS: Jupyter, JupyterLab, Anaconda, PyCharm, Windows, Linux

Visualization tools: Tableau, Google Data Studio

Machine Learning: Regression models, Decision Trees, Clustering, Bayesian Statistics, Neural Nets, Model Evaluation (k-fold Cross-Validation), NLP, Computer Vision, Deployment, Risk Modelling.

Python libraries: Scikit-learn, NumPy, Pandas, SciPy, Matplotlib, Seaborn, Web Scraping.

Methodologies: SDLC, Agile/Scrum Methodology, Sprint.

Testing Tools: HP QuickTest Professional (QTP), Unified Functional Testing (UFT), Quality Center.

Reporting Tools: MS Office (Word, Excel, PowerPoint, Outlook), Google Analytics.

Big Data Technologies: Hadoop, MapReduce


Confidential, Sacramento, CA

Data Scientist


  • Developed deep understanding of the guest website experience, purchase behavior, platform and functionality usage of the customers.
  • Developed MapReduce/Spark Python modules for predictive analytics & machine learning in Hadoop.
  • Designed and developed large-scale models using Logistic Regression, Random Forest, Time-series models and NLP models.
  • Implemented Natural Language Processing algorithms (LDA, LSA) for text analytics and search relevance.
  • Performed Data Modelling and Semantic Analysis of keywords using Python.
  • Performed data extraction from Oracle database using PL/SQL and PySpark.
  • Utilized the Spark SQL API in PySpark to extract and load data and perform SQL queries.
  • Performed scripting in Python and SQL for Statistical Data Analysis.
  • Worked with various data formats including JSON, XML, CSV from different data sources including Oracle, Teradata, and DB2 databases.
  • Performed social media analytics by extracting data from Twitter using Python and the Twitter API. Parsed JSON-formatted Twitter data, uploaded it to the database, and conducted sentiment analysis to understand customer behavior.
  • Utilized Google analytics to understand the user traffic on the Target Website and prepared reports.
  • Utilized recommender systems and collaborative filtering techniques to drive Target business priorities.
  • Monitored and analyzed session data to understand customer behavior and identified site issues that were adversely impacting conversion on the digital platform.
  • Performed ad-hoc analysis to gain insight into differences between guest segments.
  • Created, maintained and customized events, hit attributes, dimensions, reports and dashboards in Tableau.

Confidential, Princeton, NJ

Data Scientist


  • Gathered, analyzed, documented and translated application requirements into data models and supported standardization of documentation.
  • Delivered and communicated research results, recommendations, opportunities to the managerial and executive teams, and implemented the techniques for priority projects.
  • Designed Regression models to determine and forecast the Air Quality Index based on historical data.
  • Developed different prediction models such as Linear Regression, Decision Tree and Support Vector, choosing the best model based on the trade-off between accuracy and interpretability.
  • Validated data models using measures such as RMSE, R-squared and adjusted R-squared.
  • Performed Data Analysis, Data Cleaning, feature scaling and feature engineering using pandas and numpy packages in Python and other languages like XML, SQL, JavaScript.
  • Collaborated with Data Warehouse team on development and maintenance using Oracle SQL, SQL*Loader, PL/SQL.
  • Designed, developed and maintained daily and monthly summary, trending, benchmark reports, user stories and dashboards in Tableau Desktop.
  • Published workbooks and extract data sources to Tableau Server, implemented row-level security and scheduled automatic extract refresh.

Confidential, Moline, IL

Data Scientist


  • Analyzed data from a Hadoop big data system, ingested from machine vision in the field; used the analysis to create custom algorithms to improve crop turnover.
  • Constructed machine learning models using NumPy, SciPy, NLTK, Scikit-learn, MLPy and OpenCV.
  • Prototyped Convolutional and Recurrent Neural Networks to do health claims analysis.
  • Set up the development environment to use Docker, and used Docker to handle deployment on heterogeneous platforms such as Google and AWS.
  • Migrated a large database from SQL Server to MySQL.
  • Use of machine vision in the development of both algorithms and software toolkits in image, signal and video processing.
  • Machine-learning based object detection and pattern recognition in 3D imaging applications using OpenCV, SciKit-Image and SciKit-Learn.
  • Machine learning algorithms for segmentation of images using Deep Learning under OpenCV.
  • Classification of images and reporting using machine learning algorithms.
  • Statistical Techniques: t-tests, Regression (Multiple, Stepwise, Logistic, Cox), Time Series, Principal Component Analysis.
  • Data Mining/Machine Learning Techniques: Decision Trees (C&RT, CHAID, C5.0), Cluster Analysis, Artificial Neural Networks, Association Rule Mining, PMML deployments of DM models.


Automation Team Lead


  • Assigning Automation of test cases using QC/QTP.
  • Identification of test cases to be automated for regression testing in BPT framework.
  • Creating components based on new features and then creating test-plans in QC.
  • Creating effective Test reports by evaluating Test results, PMR, MOM, etc.
  • Involved in completing Test planning, creating and executing automation test cases for Applications like CRDB and OTP.
  • Review of automated test cases, Execution results and defect logs.
  • Reporting any issues in requirements and Design documents of new features and regression scenarios.
  • Performed walkthroughs of high-level automation test scenarios with onsite coordinator and QA team.


Software Engineer


  • Involved in Test planning, Designing, Test bed and Test Data preparation.
  • Actively involved in reviews of automated test scripts, maintenance of test scripts as per changes and updates in the application.
  • Identification of test cases to be automated.
  • Performed Functional and GUI Testing.
  • Handled Automated Regression Test Execution that involved creation of Regression test suite, Regression Test Execution, creation of regression execution test results.
  • Client interaction by means of Status Calls and WebEx sessions.
