- A Confidential . in statistics and 10+ years of experience in statistical modeling, data mining, SAS/R programming, and management/project leadership across several industries, including healthcare insurance, P&C insurance, marketing, risk management, and Confidential .
- Proven track record of high-quality project delivery and accountability. Strong communication, time/priority management, and interpersonal skills.
- Offer innovative and actionable business solutions.
- Generalized Linear Model ( Confidential ), logistic regression, cluster analysis, CHAID (decision tree), survival (hazard) model/analysis, time-series ( Confidential ), discrete choice, factor analysis (principal component, common factor), market basket analysis, neural network, Bayesian method/network, supervised/unsupervised learning.
- R, Confidential, regression, neural network, RandomForest, GBM, clustering, SAS (Base, Stat, ETS, Miner, Macro, Proc SQL, IML graphic etc.), SQL (query, procedure), SPSS
- Data mining tools (R, SAS Enterprise Miner), Excel, PowerPoint. Systems: Unix (Solaris, HP), Linux, Windows, Hadoop, Hive, H2O, Teradata, Oracle, and Mainframe (MVS TSO)
Senior Data Scientist, Risk Management
- Used the models to cross-sell other health coverage products.
- Applied the models to the book of business through scoring and reporting. Performed data manipulation and data mining in SAS and R. Presented findings to senior management and business clients.
- For underwriting and pricing, developed predictive models of total medical claim cost at the individual and group level using demographics, prior medical history, conditions, health habit scores, etc. Identified risk factors beyond those used in pricing to improve underwriting profit.
- Evaluated the health management program’s impact on cost savings using propensity score matching.
- Acted as a thought leader on data mining and predictive modeling methods; reviewed and validated model results and assumptions. Evaluated third-party software, such as Bayesian network tools.
- Consulted with internal clients on forecasting drug sales/utilization under shock events such as patent expirations and generic drug launches.
Sr. Consultant/Statistical Modeler
- Led the development of predictive models for the non-standard auto line.
- Applied data mining methods, including decision trees and cluster analysis, to enhance existing model performance.
- Mentored new team members.
- Developed sales and acquisition models for marketing.
- The models were used to identify the major contributors (pricing changes, sales channel, advertising channel, demographic segmentation) to incremental insurance policy sales.
Sr. Predictive Modeler/Manager
- Developed a predictive model to identify fraudulent claims in bodily injury and physical damage auto coverage.
- Data sources included vehicle damage, driver/passenger injury, medical diagnosis, treatment and billing, prior claims, relationships with other claims under investigation, etc.
- Applied logistic regression, link analysis, principal component analysis, and decision trees to identify high-risk fraudulent claims and unusual patterns.
- Developed and tested holistic scores to detect abnormal claims/claimants when the target variable (i.e., fraudulent claim) was not clearly defined or available.
- Coordinated with the client (claims department) on data gathering, methodology, and implementation.
- Modeled ( Confidential ) pure premium, claim, severity, and loss using policy-level attributes such as tier, class, IS score, EGR, etc.
- Designed and validated the statistical modeling concept (i.e., the equivalence of different combinations of dependent variable, exposure weighting, and offset) and developed model evaluation criteria such as two-way lift curves, spread lift curves, mean squared error, and segment goodness-of-fit tests.
- Developed a survival model to predict policy lifetime value. Used decision trees (weighted and unweighted) to identify high/low loss ratio segments, the impact of rating plan changes, and interaction segments of major rating factors.
- Explored the association between loss and exposure to validate and modify the multiplicative modeling assumption.
- Developed territory clusters using cluster analysis for loss forecasting.
- Wrote SAS programs to retrieve, aggregate, and manipulate large datasets (30+ million records); developed SAS Proc Genmod code to implement Confidential modeling under various distribution assumptions (e.g., Tweedie, gamma, Poisson).
- Compared model performance via two-way lift charts, residual analysis, etc.
- Developed models for different coverages (property damage, bodily injury, collision, and comprehensive).
- Researched and made recommendations on factor interaction and correlation and on algorithm convergence.
- Presented a proof of modeling equivalence based on adjusted exposure under different distribution assumptions.
- Managed offshore consultants on Most Valuable Customer Analysis and Data Quality control.
- Acted as technical lead in the modeling team to resolve key issues.
- Worked with the pricing team to implement the model rating plan.
- Examined the impact of moving from indicated to selected rating factors.
- Presented modeling results to senior management.
- Participated in implementing the models in the IT system.
- Managed technical consultants/contractors: assigned projects, ensured project quality, monitored progress and timelines, and interpreted business requirements.
Manager, Marketing Information
- Developed a multi-dimensional Confidential model to forecast gross revenue/sales trends using historical data and related time series such as interest rates, prices, and volume.
- Forecast new product sales over short and long term.
- Helped verify PepsiCo International’s sales forecast ( Confidential ) model.
- Developed a predictive model to predict sales using panel data.
- Analyzed Confidential data and modeled out-of-stock events using generalized linear models and cluster analysis.
- Managed project design, objectives, deliverable scope, process flow, and timeline; identified the appropriate modeling methodology; and summarized findings and recommended actions.
Confidential, Chicago, IL
- Developed statistical models to predict the net benefit of credit line reduction for high-risk accounts.
- Developed a time-series model ( Confidential ) to forecast account balance trends using account utilization rate, credit score trend, etc.
- Presented the results to senior management and the user team.
- Managed the project end to end and delivered it on time with the expected quality.
- Developed portfolio risk models to predict short-term and long-term delinquency/charge-off.
- The models were developed using account status, transaction history and credit bureau attributes.
- Provided segmentation analysis by combining business requirements with an unsupervised learning approach.
- Segmentation was also conducted through decision tree and principal component analysis.
- Made actionable suggestions.
- Managed resources and timeline to deliver the project on time and meet expectations.
- Methodology: decision tree, cluster analysis, principal component analysis, and multinomial logistic regression. 1) account gross activity; 2) sale amount; 3) balance transfer; 4) good open balance; 5) delinquency 60 days+; 6) net income in years 1 & 2.
- Combined operations research (SAS Proc NLP) with statistical predictive modeling to derive optimal marketing Confidential offers under constraints (approval rate, minimum sale, FICO score, etc.). Each prospect received the offer with the maximum likelihood of response.