Data Scientist Resume
SUMMARY:
- 7+ years of Data Science experience in architecting and building comprehensive analytical solutions in Marketing, Sales and Operations functions across Technology, Retail and Banking industries. Worked closely with functional team leaders (in Product, Operations, Marketing, etc.) to explain analysis, findings, and recommendations.
- Strong track record of contributing to successful end-to-end analytic solutions (clarifying business objectives and hypotheses, communicating project deliverables and timelines, and informing action based on findings).
- Expertise writing production-quality code in SQL, R, Python and Spark. Hands-on experience building regression and classification models, as well as unsupervised learning algorithms, with large datasets in distributed systems and resource-constrained environments.
- Expert knowledge of supervised and unsupervised learning algorithms such as Ensemble Methods (Random Forests), Logistic Regression, Regularized Linear Regression, SVMs, Deep Neural Networks, Extreme Gradient Boosting, Decision Trees, K-Means, Gaussian Mixture Models, Hierarchical models, and time series models (ARIMA, GARCH, VARCH, etc.).
- Led independent research and experimentation with new methodologies to discover insights and improve solutions to problems. Delivered findings and actionable results to the management team through data visualizations, presentations, and training sessions. Took a proactive role in roadmap discussions, data science initiatives, and decisions on how best to apply the underlying algorithms.
- Hands-on experience communicating business insights through Tableau dashboards. Developed automated Tableau dashboards that helped evaluate and evolve existing user data strategies, including user metrics, measurement frameworks, and measurement methods. Also developed and deployed dashboards in Tableau and RShiny to identify trends and opportunities, surface actionable insights, and help teams set goals, build forecasts, and prioritize initiatives.
- Experience building interpretable machine learning models and end-to-end data pipelines that extract, transform, and combine all incoming data to discover hidden insights, with the aim of improving business processes, addressing business problems, or reducing costs.
- Experience working with large data and metadata sources, and interpreting and communicating insights and findings from analyses and experiments to both technical and non-technical audiences in ad, service, and business functions.
- Experienced in Data Modeling grounded in RDBMS concepts, including Logical and Physical Data Modeling up to Third Normal Form (3NF) and Multidimensional Data Modeling (Star schema, Snowflake schema, Facts and Dimensions). Hands-on experience optimizing SQL queries and tuning database performance in Oracle, SQL Server and Teradata databases.
TECHNICAL SKILLS:
Exploratory Data Analysis: Univariate/Multivariate Outlier detection, Missing value imputation, Histograms/Density estimation, EDA in Tableau
Supervised Learning: Linear/Logistic Regression, Lasso, Ridge, Elastic Nets, Decision Trees, Ensemble Methods, Random Forests, Support Vector Machines, Gradient Boosting, XGB, Deep Neural Networks, Bayesian Learning
Unsupervised Learning: Principal Component Analysis, Association Rules, Factor Analysis, K-Means, Hierarchical Clustering, Gaussian Mixture Models, Market Basket Analysis, Collaborative Filtering and Low Rank Matrix Factorization
Feature Engineering: Stepwise, Recursive Feature Elimination, Relative Importance, Filter Methods, Wrapper Methods and Embedded Methods
Statistical Tests: t-tests, Chi-Square tests, Stationarity tests, Autocorrelation tests, Normality tests, Residual diagnostics, Partial dependence plots and ANOVA
Sampling Methods: Bootstrap sampling methods and Stratified sampling
Model Tuning/Selection: Cross Validation, AUC, Precision/Recall, Walk Forward Estimation, AIC/BIC Criteria, Grid Search and Regularization
Time Series: ARIMA, Holt-Winters, Exponential smoothing, Bayesian structural time series
Python: pandas, numpy, scikit-learn, scipy, statsmodels, matplotlib, tensorflow
SAS: Forecast Server, SAS Procedures and Data Steps
Spark: MLlib, GraphX
SQL: Subqueries, joins, DDL/DML statements
Databases/ETL/Query: Teradata, SQL Server, Postgres and Hadoop (MapReduce); SQL, Hive, Pig and Alteryx
Visualization: Tableau, ggplot2 and RShiny
Prototyping: PowerPoint, RShiny and Tableau
PROFESSIONAL EXPERIENCE:
Confidential, CA
Data Scientist
- Played a key role in developing and deploying DFAST Stress Test models across several bank portfolios. Provided architectural leadership on several high-priority initiatives including account prioritization, account prospecting, and opportunity scoring. Drove the creation of comprehensive datasets encompassing user profiles and behaviors, and incorporating a wide variety of signals and data types.
- Forecasted bank-wide loan balances under normal and stressed macroeconomic scenarios using R. Performed variable reduction using stepwise, lasso, and elastic net algorithms, and tuned the models for accuracy using cross-validation and grid search.
Confidential, Palo Alto, CA
Data Scientist
- Built machine-learning-based regression models using Python's scikit-learn framework to estimate customer propensity to purchase based on attributes such as the verticals customers operate in, revenue, historic purchases, and frequency and recency behaviors. These more accurate propensity estimates improved the overall productivity of sales teams by helping them target prospective clients.
Confidential, Palo Alto, CA
Jr. Data Scientist
- Played a key role in developing and maintaining statistical and machine learning models that mine, analyze and turn VMware customer and sales data into insights that helped the company make strategic decisions that led to growth in its user base and revenue.
Confidential
Data Modeler/Data Analyst
- Analyzed large datasets to provide strategic direction to the company. Performed quantitative analysis of ad sales trends to recommend pricing decisions.
- Conducted cost and benefit analysis on new ideas. Scrutinized and tracked customer behavior to identify trends and unmet needs.
- Developed statistical models to forecast inventory and procurement cycles. Assisted in developing internal tools for data analysis.
- Designed scalable processes to collect, manipulate, present, and analyze large datasets in a production-ready environment, using Akamai's big data platform.
- Delivered a broad range of results by finding and interpreting rich data sources, merging them, ensuring dataset consistency, creating visualizations to aid understanding of the data, building mathematical models, and presenting and communicating insights and findings to specialists and scientists on the team.
- Worked across the full lifecycle as Data Modeler/Data Analyst on data warehouses and data marts with Star Schemas, Snowflake Schemas, SCDs, and Dimensional Modeling in Erwin. Performed data mining using complex SQL queries to discover patterns, and used SQL extensively for data profiling and analysis to guide the design of the data model.
