Senior Data Scientist Resume
Sunnyvale, CA
SUMMARY:
- Over 8 years of experience as a Data Scientist applying statistical techniques and machine learning, including Bayesian methods, time-series modeling, classification, regression, mixture models, clustering, dimensionality reduction, model selection, feature extraction, experimental design, and choice modeling
- Well versed in writing production-quality code in SQL, R, Python, Spark, and Scala
- Hands-on experience building regression, classification, and recommender systems with large datasets in distributed systems and constrained environments
- Domain expertise in architecting and building comprehensive analytical solutions in Marketing, Sales and Operations functions across Technology, Retail and Banking industries
- Hands-on experience communicating business insights through Tableau dashboards. Developed automated Tableau dashboards that helped evaluate and evolve existing user data strategies, including user metrics, measurement frameworks, and measurement methods
- Strong track record of contributing to successful end-to-end analytic solutions (clarifying business objectives and hypotheses, communicating project deliverables and timelines, and informing action based on findings)
- Experience working with large data and metadata sources; interpreted and communicated insights and findings from analyses and experiments to both technical and non-technical audiences in ad, service, and business functions
- Proactive participation in ad roadmap discussions, data science initiatives, and decisions on the optimal approach to applying the underlying algorithms
- Expert knowledge across a breadth of machine learning algorithms, with a focus on finding the best approach to a specific problem. Implemented several supervised and unsupervised learning algorithms such as Ensemble Methods (Random Forests), Logistic Regression, Regularized Linear Regression, SVMs, Deep Neural Networks, Extreme Gradient Boosting, Decision Trees, KMeans, Gaussian Mixture Models, Hierarchical models, and time series models (ARIMA, GARCH, VARCH, etc.)
- Developed and deployed dashboards in Tableau and RShiny to identify trends and opportunities, surface actionable insights, and help teams set goals, build forecasts, and prioritize initiatives
- Professional working experience writing Spark Streaming and Spark batch jobs using Spark MLlib
- Experience using multiple ETL tools, such as Ab Initio, Alteryx, and Informatica PowerCenter, for Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export
- Hands-on experience optimizing SQL queries and tuning database performance in Oracle, SQL Server, and Teradata databases
- Experienced in Data Modeling grounded in RDBMS concepts, including Logical and Physical Data Modeling up to Third Normal Form (3NF) and Multidimensional Data Modeling (Star schema, Snowflake schema, Facts and Dimensions)
TECHNICAL SKILLS:
Exploratory Data Analysis: Univariate/Multivariate Outlier detection, Missing value imputation, Histograms/Density estimation, EDA in Tableau
Supervised Learning: Linear/Logistic Regression, Lasso, Ridge, Elastic Nets, Decision Trees, Ensemble Methods, Random Forests, Support Vector Machines, Gradient Boosting, Deep Neural Networks, Bayesian Learning
Unsupervised Learning: Principal Component Analysis, Association Rules, Factor Analysis, K-Means, Hierarchical Clustering, Gaussian Mixture Models, Market Basket Analysis, Collaborative Filtering and Low Rank Matrix Factorization
Feature Selection: Stepwise, Recursive Feature Elimination, Relative Importance, Filter Methods, Wrapper Methods and Embedded Methods
Statistical Tests: T-Test, Chi-Square tests, Stationarity tests, Autocorrelation tests, Normality tests, Residual diagnostics, Partial dependence plots and ANOVA
Sampling Methods: Bootstrap sampling methods and Stratified sampling
Model Tuning/Selection: Cross Validation, Walk Forward Estimation, AIC/BIC Criteria, Grid Search and Regularization
Time Series: ARIMA, Holt-Winters, Exponential smoothing, Bayesian structural time series
Machine Learning / Deep Learning:
R: caret, glmnet, forecast, xgboost, rpart, survival, arules, sqldf, dplyr, nloptr, lpSolve, ggplot2
Python: pandas, numpy, scikit-learn, scipy, statsmodels, ggplot, TensorFlow, Caffe
SAS: Forecast Server, SAS Procedures and Data Steps
Spark: MLlib, GraphX
SQL: Subqueries, joins, DDL/DML statements
Databases/ETL/Query: Teradata, SQL Server, Postgres and Hadoop (MapReduce); SQL, Hive, Pig and Alteryx
Visualization: Tableau, ggplot2 and RShiny
Prototyping: RShiny, Tableau, Balsamiq and PowerPoint
WORK EXPERIENCE:
Confidential, Sunnyvale, CA
Senior Data Scientist
Responsibilities:
- Developed and refined complex marketing mix statistical models in a team environment, working with diverse functional groups responsible for over $100MM in annual marketing spend
- Responsible for all stages in the modeling process, from collecting, verifying, & cleaning data to visualizing model results, presenting results, and making client recommendations
- Developed 5 customer segments using unsupervised learning techniques such as KMeans and Gaussian mixture models. The clusters helped the business reduce complex patterns to a manageable set of 5, which informed strategic and tactical objectives for customer retention, acquisition, spend, and loyalty.
- Implemented various advanced forecasting algorithms that were effective in detecting seasonality and trends, improving sales/demand forecast accuracy by 20-25% and helping the business plan better for budgeting and for sales and operations planning
- Tuned model parameters (p, d, q for ARIMA) using walk-forward validation techniques.
- Predicted the likelihood of customer attrition by developing classification models based on customer attributes such as user demographics, historic clicks, and user acquisition channels. The models deployed in the production environment detected churn in advance and helped sales/marketing teams plan retention strategies such as tailored promotions and custom offers
- Implemented market basket algorithms on transactional data to identify ads that were frequently clicked together. Discovering frequent ad sets helped unearth cross-sell and upsell opportunities and led to better pricing, bundling, and promotion strategies for the sales and marketing team
- Developed machine learning models that predicted the ad click propensity of users based on attributes such as user demographics, historic click behavior, and other related attributes. Predicting user propensity to click helped select and place relevant ads
- Projected customer lifetime values based on historic customer usage and churn rates using survival models. Understanding customer lifetime values helped the business establish strategies to selectively attract customers who tend to be more profitable for Confidential, and to set appropriate marketing strategies based on customer value.
- Developed a machine learning system that predicted the purchase probability of a particular offer based on the customer’s real-time location data and past purchase behavior; these predictions are being used for mobile coupon pushes.
Confidential, Seattle, WA
Data Scientist
Responsibilities:
- Measured the price elasticity of products that experienced price cuts and promotions using regression methods; based on the elasticity, the company made selective and cautious price cuts for certain licensing categories.
- Developed algorithms to select the optimal set of stock keeping units (SKUs) to place in stores so as to maximize store sales, subject to business constraints; advised the retailer on gauging demand transfer due to SKU deletion/addition in its assortment.
- Developed a personalized coupon recommender system using recommender algorithms (collaborative filtering, low rank matrix factorization) that recommended the best offers to a user based on similar user profiles. The recommendations improved user engagement and helped improve overall user retention rates at Confidential
- Clustered the supply chain of Confidential stores based on volume, volatility in demand, and proximity to warehouses using hierarchical clustering models, and identified strategies for each cluster to better optimize the service level to stores
- Built Tableau dashboards that tracked changes in customer behavior before and after campaign launch; the ROI measurements helped the retailer strategically extend the campaigns to other potential markets
- Designed and deployed real-time Tableau dashboards that used key performance metrics to identify the items most and least liked by customers, which guided the retailer toward more customer-centric assortments and better ad placement and bundling strategies
Confidential, Charlotte, NC
Jr. Data Scientist
Responsibilities:
- Forecasted bank-wide loan balances under normal and stressed macroeconomic scenarios using R. Performed variable reduction using the stepwise, lasso, and elastic net algorithms and tuned the models for accuracy using cross validation and grid search techniques.
- Automated the scraping and cleaning of data from various data sources in R and Python. Developed the bank's loss forecasting process using relevant forecasting and regression algorithms in R.
- The projected losses under stress conditions helped the bank reserve sufficient funds per DFAST policies
- Built classification models using several features related to customer demographics, macroeconomic dynamics, historic loan payment behavior, type and size of loans, credit scores, and loan-to-value ratios; the model predicted the likelihood of default under various stressed conditions with 95% accuracy.
- Built credit risk scorecards and marketing response models using SQL and SAS. Translated complex technical analysis into easily digestible reports for top executives in the bank.
- Developed several interactive dashboards in Tableau to visualize nearly 2 Terabytes of credit data by designing a scalable data cube structure.
Confidential, Austin, CA
Data Modeler/Data Analyst
Responsibilities:
- Analyzed large datasets to provide strategic direction to the company.
- Performed quantitative analysis of ad sales trends to recommend pricing decisions.
- Conducted cost and benefit analysis on new ideas. Scrutinized and tracked customer behavior to identify trends and unmet needs.
- Developed statistical models to forecast inventory and procurement cycles. Assisted in developing internal tools for data analysis.
- Designed scalable processes to collect, manipulate, present, and analyze large datasets in a production-ready environment, using Akamai's big data platform
- Delivered a broad range of results by finding and interpreting rich data sources, merging data sources, ensuring consistency of datasets, creating visualizations to aid in understanding data, building mathematical models from the data, and presenting and communicating data insights and findings to specialists and scientists on the team
- Implemented the full data modeling lifecycle for data warehouses and data marts with star schemas, snowflake schemas, slowly changing dimensions (SCD), and dimensional modeling in Erwin. Performed data mining using very complex SQL queries to discover patterns, and used SQL extensively for data profiling and analysis to guide the building of the data model
Confidential
Data Analyst
Responsibilities:
- Managed product development and bulk production execution (annual volume of approx. Confidential units), including order procurement and fulfillment, vendor negotiations, and quality monitoring, for men's, women's, and children's apparel across various Confidential brands in their knits (tops and underwear) and sweaters product lines
- Expanded vendor base to increase competitive strength in sourcing cost, new product innovation and development to further the market share of knitted men’s and women’s wear of Banana Republic in India
- Successfully managed all facets of vendor management spanning samples and contract initiations and negotiations, cost effectiveness, training, production monitoring, delivery and quality control including Fast orders of basic styles
- Synchronized forecasted product demand with vendor production capacity and supply to promote the local office's role in global supply chain management
- Conducted orientation presentation and training on internal systems and processes for new hires
- Generated research reports for critical data analysis in support of financial goals and overall performance, including procurement cost savings reports, cost analyses of high-volume styles to identify year-over-year increases in component costs, seasonal trend analyses of business volumes to study business migration trends, changes, and achievements, and vendor performance reports
