- 7+ years of experience in Data Science with expertise in Descriptive, Inquisitive, Predictive and Prescriptive analytics.
- Experience working with senior stakeholders to understand business requirements and present actionable data insights to senior management.
- Experience working with large data sources (5Bn+ rows); interpret and communicate insights and findings from analyses and experiments to both technical and non-technical audiences.
- Strong computational background (complemented by Statistics/Math/Algorithmic expertise), a healthy portfolio of Big Data projects, a solid understanding of machine learning algorithms, and a love for finding meaning in multiple imperfect, mixed, varied, and inconsistent data sets.
Statistics/Machine Learning: Univariate/Multivariate regression, Lasso, Ridge, Decision trees, Ensemble methods - Random forests, Gradient Boosting, Deep neural networks, ANOVA, Supervised learning, Unsupervised learning, Principal component analysis, Factor analysis, Bootstrap sampling methods, K-Means, Hierarchical clustering, Gaussian mixture models, Bayesian learning, Market basket analysis, Time series forecasting (ARIMA, Holt Winters and Exponential smoothing), Survival analysis, Feature selection and Linear programming, Recommender systems - collaborative filtering (user based, item based), Low rank matrix factorization
Python: pandas, numpy, scikit-learn, scipy, statsmodels, ggplot
R: caret, glmnet, forecast, xgboost, rpart, survival, arules, sqldf, dplyr, nloptr, lpSolve, ggplot2
SAS: Forecast server, SAS Procedures and Data Steps
Other: SPSS, Alteryx, KNIME and Weka
Databases/ETL/Query: Teradata, SQL Server, Postgres and Hadoop; SQL, Hive, Pig and Alteryx
Visualization: Tableau, ggplot2 and RShiny
Prototyping/POC/POV: RShiny, Tableau, Balsamiq and PowerPoint
Confidential, Bohemia, NY
- Developed classification models in Python that predicted customers' purchase propensity from attributes such as demographics (education, income, age, geography), historic purchases and other related attributes. Predicting customer propensity helped marketing teams aggressively pursue prospective customers.
- Developed classification models to predict the likelihood of customer churn based on customer attributes such as customer size, revenue, type of industry, competitor products and growth rates.
- The models, deployed in a production environment, helped detect churn in advance and helped sales/marketing teams plan retention strategies such as price discounts and custom licensing plans.
- Projected customer lifetime values based on historic customer usage and churn rates using survival models.
- Understanding customer lifetime values helped the business establish strategies to selectively attract customers who tend to be more profitable for Confidential.
- It also helped the business establish appropriate marketing strategies based on customer values.
- Developed 11 customer segments using unsupervised learning techniques like KMeans and Gaussian mixture models.
- The clusters helped the business reduce complex patterns to a manageable set of 11, which helped set strategic and tactical objectives for customer retention, acquisition, spend and loyalty.
- Improved sales/demand forecast accuracy by 20-25% by implementing advanced forecasting algorithms that detected seasonality and trends and incorporated exogenous covariates. The increased accuracy helped the business plan better for budgeting and sales and operations planning.
- Implemented market basket algorithms from transactional data, which helped identify products ordered together frequently.
- Discovering frequent product sets helped unearth cross-sell and upsell opportunities and led to better pricing, bundling and promotion strategies for sales and marketing teams.
- Measured the price elasticity for products that experienced price cuts and promotions using regression methods; based on the elasticity, Confidential made selective and cautious price cuts for some of its product segments.
Confidential, Austin, TX
- Developed a personalized coupon recommender system in Python using recommender algorithms (collaborative filtering, low rank matrix factorization) that recommended the best offers to a user based on similar user profiles.
- The recommendations improved user engagement and helped improve overall user retention rates.
- Developed a lead scoring system by modeling the users based on company size, industry segment, job title or geographic location using supervised learning algorithms.
- Scoring leads led to increased sales efficiency and effectiveness, increased marketing effectiveness and tighter marketing and sales alignment.
- Designed the Data Warehouse and MDM hub Conceptual, Logical and Physical data models.
- Used Normalization methods up to 3NF and De-normalization techniques for effective performance in OLTP and OLAP systems.
- Generated DDL scripts using Forward Engineering techniques to create objects and deploy them into the database.
Confidential, Charlotte, NC
- Played a key role in developing and deploying Confidential Test models across several bank portfolios.
- Provided architectural leadership on several high priority initiatives including account prioritization, account prospecting, and opportunity scoring.
- Drove the creation of comprehensive datasets encompassing user profiles and behaviors, and incorporating a wide variety of signals and data types.
- Automated the scraping and cleaning of data from various data sources in R and Python.
- Developed the bank's loss forecasting process using relevant forecasting and regression algorithms in R.
- The projected losses under stress conditions helped the bank reserve enough funds per DFAST policies.
- Developed several interactive dashboards in Tableau to visualize 8 billion rows of credit data by designing a scalable data cube structure.
- Built credit risk scorecards and marketing response models using SQL and SAS.
- Distilled complex technical analysis into easily digestible reports for top executives in the bank.
Data Modeler/Data Analyst
- Designed scalable processes to collect, manipulate, present, and analyze large datasets in a production-ready environment, using Akamai's big data platform.
- Delivered a broad range of results: found and interpreted rich data sources, merged data sources, ensured consistency across datasets, created visualizations to aid in understanding the data, built mathematical models from the data, and presented the insights and findings to specialists and scientists on the team.
- Implemented the full lifecycle of data modeling and data analysis for data warehouses and data marts.
- Performed data mining using very complex SQL queries to discover patterns, and used SQL extensively for data profiling/analysis to guide the building of the data model.
Data Analyst/Data Modeler
- Worked with SMEs and other stakeholders to determine requirements and identify Entities and Attributes to build Conceptual, Logical and Physical data models.
- Used Star Schema methodologies extensively in designing and building the logical data model as Dimensional Models.
- Developed dimensional models based on Star and Snowflake schemas to build the data warehouse.
- Designed Context Flow Diagrams, Structure Charts and ER diagrams.