Senior Data Scientist Resume
Cupertino, CA
SUMMARY:
- Data Scientist with 8+ years of hands-on experience and comprehensive industry knowledge of Statistical Modeling, Data Modeling, Data Mining, Machine Learning, Natural Language Processing (NLP), Business Intelligence, and Data Visualization to solve real-world problems empirically.
- Technical expertise and business acumen necessary to translate business requirements and objectives into scalable, highly resilient, and successful system solutions. Possess transferable problem-solving skills and knowledge acquired from previous experience.
- Proficient in managing the Data Science Life Cycle and building end-to-end data pipelines for Machine Learning models, including Data Acquisition, Data Preparation, Data Manipulation, Feature Engineering, Statistical Modeling, Testing and Validation, Visualization, and Reporting of insights.
- Led independent research and experimentation with new methodologies to discover insights and improvements for business problems. Delivered findings and actionable results to the management team through data visualizations, presentations, and working sessions.
- Proficient in Machine Learning, Statistical, and Predictive Analytics using R, Python, and SAS. Proficient in leveraging data from multiple sources to create reports and dashboards as part of Data Storytelling, with hands-on experience communicating business insights through dashboards built in Tableau, PowerBI, R, and Python.
- Strong knowledge of Statistical methodologies such as Hypothesis Testing, T-tests, ANOVA, Monte Carlo Sampling, and Time Series Analysis. Developed Machine Learning/Statistical models in R and Python using various Supervised and Unsupervised Machine Learning algorithms: Regression, Classification, Clustering, Dimension Reduction, Association Rule Learning, and Natural Language Processing.
- Expertise in writing DDL/DML statements, complex queries using Subqueries and Joins, and Stored Procedures in Microsoft SQL Server. Experienced in Data Modeling using RDBMS concepts, Logical and Physical Data Modeling to 3rd Normal Form, and developing database schemas such as Star schema and Snowflake schema (using Fact and Dimension tables).
- Expertise in Data Integration, Data Cleaning, Data Analysis and Profiling, and Data Import and Export using SQL Server ETL tools such as SSIS and SSAS. Expertise in Microsoft Office, including building Pivot Tables and visualizations in Microsoft Excel.
- Strong knowledge of all phases of the SDLC (Software Development Life Cycle): Analysis, Design, Development, Testing, Implementation, and Maintenance. Extensive experience interacting with end users to gather and document requirements, plan projects, and schedule work. Actively participated in the creation and implementation of test plans. Strong business sense and ability to communicate data insights to both technical and nontechnical clients. Worked in both Agile and Waterfall implementations.
- Demonstrated multi-tasking skills and the ability to learn quickly under minimal supervision. Solid track record of timely issue resolution and proven flexibility in taking on responsibilities.
SKILLS:
Exploratory Data Analysis: Exploratory and Confirmatory Data Analysis using R and Python
Supervised learning: Linear/Logistic Regression, Lasso, Ridge, Elastic Nets, Decision Trees, Random Forest, K-Nearest Neighbors, Naïve Bayes, Support Vector Machines, Gradient Boosting, Bayesian models, Ensemble Methods
Unsupervised learning: Principal Component Analysis, Association Rules, Factor Analysis, Cluster Analysis (K-means / Hierarchical), Market Basket Analysis
Statistical tests: T-tests, Chi-square analysis, Correlation tests, A/B testing, Normality tests, Residual diagnostics, ANOVA
Time Series: ARIMA, Holt-Winters, Exponential Smoothing, Bayesian Structural Time Series
Feature Engineering: Recursive Feature Elimination, Filter Methods, Relative Importance, Wrapper and Embedded methods
Textual Analytics: Sentiment Analysis, Natural Language Processing
Sampling Methods: Stratified sampling, Bootstrap sampling methods
Model Tuning/Selection: K-fold cross validation, AUC, Accuracy, Precision/Recall, Grid Search, Regularization
R: caret, glmnet, xgboost, dplyr, survival, rpart, ggplot2, arules, e1071
Python: numpy, pandas, scikit-learn, scipy, matplotlib, seaborn
SAS: SAS procedures
SQL Server, Postgres, MySQL, SSIS, SSAS: DDL/DML statements, Subqueries, Joins, Normalization, Entity Relationship Diagrams, Star schema / Snowflake schema (Fact and Dimension tables)
Data Visualization: Tableau, PowerBI, R (ggplot2), Python (Matplotlib, Seaborn)
Programming: R, Python, SAS, ASP.NET, C#
Others: Git, Software Development Life Cycle, Agile, IT Project Management, Prototyping, Functional and Non-Functional Requirements gathering and analysis, SWOT Analysis, Root Cause Analysis, Dataflow Diagrams, Use Case Diagrams, Technical Documentation, SAP-ERP
PROFESSIONAL EXPERIENCE:
Confidential, CUPERTINO, CA
SENIOR DATA SCIENTIST
Responsibilities:
- Developed machine learning models that predicted users' click propensity based on attributes such as user demographics, historic click behavior, and other related features. Predicting click propensity helped show and place relevant features on the website.
- Participated in all phases of data mining, data cleaning, data collection, developing models, validation, and visualization
- Clustered the supply chain of stores based on volume, volatility in demand and proximity to warehouses using Hierarchical clustering and identified strategies for each of the clusters to better optimize the service level to stores
- Predicted the likelihood of customer attrition by developing classification models based on customer attributes such as user demographics, historic clicks, and user acquisition channels. The models, deployed in a production environment, helped detect churn in advance and aided sales/marketing teams in planning retention strategies such as tailored promotions and custom offers.
- Experimented with predictive models including Logistic Regression, Support Vector Machine, Random Forest, and XGBoost, and identified the best models based on accuracy and explainability.
- Forecasted sales and improved accuracy by 10-20% by implementing advanced forecasting algorithms that were effective in detecting seasonality and trends in the patterns in addition to incorporating exogenous covariates. Increased accuracy helped business plan better with respect to budgeting and sales and operations planning.
- Tuned model parameters (p, d, q for ARIMA) using walk-forward validation techniques for optimal model performance
- Developed and refined complex marketing mix statistical models in a team environment and worked with diverse functional groups with over $100MM in annual marketing spend
- Responsible for all stages in the modeling process, from collecting, verifying, & cleaning data to visualizing model results, presenting results, and making client recommendations
- Developed 8 customer segments using unsupervised learning techniques such as KMeans and Gaussian Mixture Models. The clusters helped the business simplify complex patterns into a manageable set of segments that informed strategic and tactical objectives for customer retention, acquisition, spend, and loyalty
- Developed a machine learning system that predicted purchase probability of a particular offer based on customer’s real time location data and past purchase behavior; these predictions were used for mobile coupon pushes
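The walk-forward validation mentioned above for ARIMA tuning can be sketched as an expanding-window splitter. This is a minimal illustrative sketch, not the actual production code; the function name and signature are hypothetical:

```python
def walk_forward_splits(n_obs, initial_train, horizon=1):
    """Yield (train_indices, test_indices) for expanding-window
    walk-forward validation: each step trains on all observations
    up to a cutoff and tests on the next `horizon` points."""
    for cutoff in range(initial_train, n_obs - horizon + 1):
        train = list(range(cutoff))
        test = list(range(cutoff, cutoff + horizon))
        yield train, test
```

A grid search over candidate (p, d, q) orders would fit one model per split and keep the order with the lowest average out-of-sample error.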
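Model selection by accuracy, as in the classifier comparison above, relies on standard confusion-matrix metrics. A minimal sketch of those metrics for binary labels (the helper name is hypothetical, for illustration only):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```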
Confidential, CHICAGO, IL
DATA SCIENTIST
Responsibilities:
- Used a combination of data visualization tools like Tableau and technical tools like R and SQL to query and model data to clearly and effectively explain the problem, root causes and suggest recommendations
- Built Tableau dashboards that tracked changes in customer behavior before and after campaign launches; the ROI measurements helped the retailer strategically extend the campaigns to other potential markets
- Conducted Sentiment Analysis on customer feedback before and after campaign launches, which helped in analyzing and extending campaigns
- Implemented market basket algorithms on transactional data to identify ads frequently clicked together. Discovering frequent ad sets helped unearth cross-sell and upsell opportunities and led to better pricing, bundling, and promotion strategies for the sales and marketing team
- Projected customer lifetime values from historic customer usage and churn rates using survival models. Understanding customer lifetime value helped the business establish strategies to selectively attract customers who tend to be more profitable and to set marketing strategies appropriate to customer value
- Measured the price elasticity for products that experienced price cuts and promotions using regression methods; based on the elasticity, Confidential made selective and cautious price cuts for certain licensing categories
- Developed algorithms for optimal set of Stock keeping units to be put in stores that maximized store sales, subject to business constraints; advised retailer to gauge demand transfer due to SKU deletion/addition to its assortment
- Developed a personalized coupon recommender system using recommender algorithms (collaborative filtering, low-rank matrix factorization) that recommended the best products to a user based on similar user profiles. The recommendations enabled users to engage better and helped improve the overall user retention rates at Confidential
- Designed and deployed real time Tableau dashboards that identified items which are most/least liked by the customers using key performance metrics that aided retailer towards better customer centric assortments. It also aided retailer towards strategies pertaining to better ad placement, bundling and assortments
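The market basket analysis described above boils down to computing support, confidence, and lift for co-occurring items. A minimal pairwise sketch (the function name and the min-support default are illustrative assumptions, not the production implementation, which would typically use Apriori or FP-Growth):

```python
from collections import Counter
from itertools import combinations

def pair_association_rules(transactions, min_support=0.3):
    """Compute support, confidence, and lift for item pairs (A -> B).
    `transactions` is a list of item collections, e.g. ads clicked
    together in one session."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(pair for t in transactions
                          for pair in combinations(sorted(set(t)), 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        confidence = count / item_counts[a]       # P(B | A)
        lift = confidence / (item_counts[b] / n)  # P(B | A) / P(B)
        rules[(a, b)] = (support, confidence, lift)
    return rules
```

Lift above 1 indicates the pair co-occurs more often than chance, which is the signal behind bundling and cross-sell recommendations.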
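The lifetime-value projection above can be illustrated with a simplified discounted-cash-flow model under a constant retention rate. This is a toy sketch with hypothetical names, not the survival-model implementation used in the actual work:

```python
def customer_lifetime_value(margin, retention, discount, horizon=100):
    """Discounted customer lifetime value assuming a constant per-period
    retention (survival) rate; the margin is earned at the end of each
    period the customer is still active."""
    return sum(margin * retention ** t / (1 + discount) ** t
               for t in range(1, horizon + 1))
```

For a long horizon this converges to the closed form margin * retention / (1 + discount - retention).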
Confidential, Richmond, Virginia
DATA SCIENTIST
Responsibilities:
- Built classification models using features related to customer demographics, macroeconomic dynamics, historic payment behavior, type and size of insurance policy, credit scores, and loan-to-value ratios; the models predicted the likelihood of default under various stressed conditions with 95% accuracy.
- Designed and deployed real time Tableau dashboards that identified policies which are most/least liked by the customers using key performance metrics that aided Confidential for better rationalization of their product offerings
- Clustered the customers of Confidential based on demographics, health attributes, policy inclinations using hierarchical clustering models and identified strategies for each of the clusters to better optimize retention, marketing and product offering strategies
- Built executive dashboards in Tableau that measured changes in customer behavior post campaign launch; the ROI measurements helped Confidential to strategically select the effective campaigns
- Built credit risk scorecards and marketing response models using SQL and SAS. Translated complex technical analysis into easily digestible reports for top executives in the company.
- Developed several interactive dashboards in Tableau to visualize nearly 2 Terabytes of credit data by designing a scalable data cube structure.
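Credit risk scorecards like the ones described above conventionally map a model's default probability onto a points scale anchored at a base score and base odds, with a fixed "points to double the odds" (PDO). A minimal sketch under those standard conventions (the specific defaults of 600 / 50:1 / 20 are illustrative assumptions, not the actual scorecard calibration):

```python
import math

def scorecard_points(p_default, base_score=600, base_odds=50, pdo=20):
    """Map a predicted default probability to a credit score:
    `base_score` at `base_odds` good:bad odds, with `pdo` points
    added every time the odds double."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1 - p_default) / p_default
    return offset + factor * math.log(odds_good)
```

With these defaults, a customer at 50:1 good:bad odds scores exactly 600, and doubling the odds to 100:1 adds 20 points.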
Confidential, Austin, TX
DATA MODELER/ DATA ANALYST
Responsibilities:
- Achieved a broad spectrum of end results by finding and interpreting rich data sources, merging data sources, ensuring consistency of datasets, creating visualizations to aid understanding, building mathematical models from the data, and presenting and communicating the data insights/findings to specialists and scientists on the team
- Implemented the full Data Modeler/Data Analyst lifecycle for Data Warehouses and Data Marts with Star and Snowflake schemas. Performed data mining using complex SQL queries to discover patterns, and used extensive SQL for data profiling/analysis to guide building of the data model
- Analyzed large datasets to provide strategic direction to the company. Performed quantitative analysis of ad sales trends to recommend pricing decisions.
- Conducted cost and benefit analysis on new ideas. Scrutinized and tracked customer behavior to identify trends and unmet needs.
- Developed statistical models to forecast inventory and procurement cycles. Assisted in developing internal tools for data analysis.
- Designed scalable processes to collect, manipulate, present, and analyze large datasets in production ready environment, using Akamai's big data platform
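The inventory forecasting mentioned above can be illustrated with simple exponential smoothing, one of the baseline techniques for such models. A minimal sketch with a hypothetical function name (the actual models and smoothing parameter are not specified in the source):

```python
def simple_exp_smoothing(series, alpha=0.3):
    """One-step-ahead forecasts via simple exponential smoothing:
    forecasts[t] is the prediction for series[t + 1]."""
    level = series[0]            # initialize level at the first observation
    forecasts = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level  # blend new obs into level
        forecasts.append(level)
    return forecasts
```

Larger alpha weights recent demand more heavily; in practice alpha is chosen by minimizing out-of-sample forecast error.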