- I have full-time work experience in the US IT and data field and in the Shanghai commodity market.
- I am an expert-level SAS and SQL programmer focused on data preprocessing, regression, logistic regression, decision trees, and clustering. I also build visualizations in RStudio, SAS, and Tableau, and have domain knowledge in AML, fraud detection, CCAR, credit risk, and risk analysis.
- Detail-oriented and data-driven.
- Created a database modeled on Cabi (a service similar to Uber) to solve business problems in the existing business system
- Created an entity-relationship diagram with 12 entity types, collected data from different users, and inserted the data into the database
- Used SQL to select data under different conditions to improve the database model
- Analysis of the effect of weather conditions on airplane delays (using SAS Enterprise Miner)
- Used regression, stepwise regression, neural network, and decision tree models to analyze the dataset
- Evaluated the results using ROC analysis, misclassification rate, cumulative lift, and the confusion matrix
- Created a report on air pollutants such as sulfate and nitrate (using R)
- Wrote a function named 'pollutantmean' that calculates the mean of a pollutant (sulfate or nitrate) across a specified list of monitors. Wrote a second function that reads a directory full of files and reports the number of completely observed cases in each data file.
- Wrote a function that takes a directory of data files and a threshold for complete cases, and calculates the correlation between sulfate and nitrate for monitor locations where the number of completely observed cases (on all variables) is greater than the threshold. Presented the results in a PowerPoint file.
- Created a data visualization report using Tableau
- Created a scatterplot, bar chart, and histogram by placing the same values on the shelves and cards as in the graph
- Using the same data as the first three worksheets, created a fourth worksheet in the same Tableau workbook with a bar chart displaying the greenhouseGases2012 value for Australia, Canada, Ireland, and Japan, then enhanced the bar chart
- Using UNdata.xlsx, identified an opportunity to improve HDI based on its relationship with other factors in the worksheet, and wrote a Word document accompanying the Tableau worksheet that explains the choices made for the graph and what the graph conveys
- Used SAS programming to analyze a dataset of customer book purchases from Amazon and Barnes & Noble
- Used SAS/SQL to extract customer demographic and purchase information such as education, region, hhsz, age, income, child, race, country, domain, date, type of product, quality, price, etc.
- Checked whether variables have interaction effects such as age*income, testing the interactions we considered likely. Also constructed new variables from our own analysis (e.g., weekday/weekend, season, degree of loyalty to a certain vendor) and built the new dataset using SAS/BASE.
- Used SAS maximum likelihood estimation to fit Poisson regression and NBD regression models on the customer characteristics dataset we created, and compared model fit by checking the log-likelihood. Created a count table using SAS/SQL for the Poisson regression model, removed variables whose t-test significance value was higher than 0.05, and interpreted statistics such as reach, average frequency, and gross rating points (GRPs).
- Eliminated redundant variables from the dataset using the likelihood ratio test and the chi-square table
- Predicted whether a customer will purchase from a certain vendor using a logistic regression model from the SAS/STAT package: monitored 40K transactions in our company's records, partitioned the dataset into validation/test data, created the confusion matrix to evaluate the model's performance, and interpreted odds ratios and other indexes.
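The evaluation step in the last project can be illustrated with a minimal sketch. The original work was done in SAS/STAT; the Python below is only an illustration, and the function names and sample labels are hypothetical, not taken from the project. It shows how a confusion matrix, the misclassification rate, and an odds ratio (the exponential of a logistic regression coefficient) are computed.

```python
import math


def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def misclassification_rate(cm):
    """Share of observations the model labeled incorrectly."""
    total = sum(cm.values())
    return (cm["fp"] + cm["fn"]) / total


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


# Hypothetical holdout labels (1 = purchased from the vendor), for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)
rate = misclassification_rate(cm)
print(cm, rate)
```

An odds ratio above 1 means the variable raises the odds of a purchase; a coefficient of 0 gives an odds ratio of exactly 1 (no effect).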
Data Analytics & Statistical Analysis: SAS, SPSS, SAS Enterprise Miner, SAS Enterprise Guide, R, SSIS/SSRS, Excel (VLOOKUP, pivot tables)
Programming languages: SAS/BASE, SAS/SQL, SAS/STAT, SAS/Macro, SAS/ODS, SAS/GRAPH, SQL, Java (basics only)
Data visualization: Tableau, R (qplot, ggplot), SAS (ODS, SGPLOT), Python (seaborn)
Data mining techniques: data partitioning and sampling; imputation (mean, median, multiple imputation); dimension reduction (PCA, variable selection, VIF for multicollinearity); classifiers (regression, logistic regression, clustering, decision tree, random forest, bagging, boosting, neural network, support & confidence, SVM, HP model ensembles); ANOVA; Bayesian statistics; evaluation (confusion matrix, lift chart, ROC, log-likelihood, Kolmogorov-Smirnov statistic)
Environment: Unix, Windows, Teradata (architecture, data manipulation, queries, views, macros, OLAP functions, user management)
Others: Microsoft Excel, Microsoft Access, Microsoft Visio, VBA, pivot tables/charts
Confidential, Piscataway, NJ
CCAR and Credit Risk Analyst
- Main responsibility: creating the master dataset for CCAR models, such as PD model execution and forecasting
- Extracted variables from the macroeconomic dataset and prepared the dataset for CCAR model (PD, EAD, LGD) execution and forecasting using SAS/Macro
- Identified low-grade loans and researched the customers' backgrounds for the BA team's problem identification and risk analysis
- Identified loans with long collection periods, such as mortgages, auto loans, and corporate loans, and researched the customers' backgrounds for the BA team's problem identification analysis
- Based on variables in the dataset, created new variables that define loan quality using different logic and algorithms; for example, an account fraud rate for each account, calculated as the account's total number of frauds divided by its total number of transactions
- Inserted monthly and daily new retail loan records into the dataset using SAS/SQL, and updated customer information such as SSN and address to correct bank information errors or meet customer requests
- Merged two datasets by a common key for stress testing and CCAR model analysis
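The account fraud rate and the merge by common key described above can be sketched as follows. This is an illustration in Python rather than the SAS/SQL used on the job; the function names, field names (`account_id`, `balance`, `fraud_rate`), and sample records are all hypothetical.

```python
from collections import defaultdict


def account_fraud_rates(transactions):
    """Fraud rate per account: fraud count divided by transaction count.

    `transactions` is an iterable of (account_id, is_fraud) pairs.
    """
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for account_id, is_fraud in transactions:
        totals[account_id] += 1
        if is_fraud:
            frauds[account_id] += 1
    return {acct: frauds[acct] / totals[acct] for acct in totals}


def merge_by_key(left, right, key):
    """Inner-join two lists of dicts on a common key."""
    right_index = {row[key]: row for row in right}
    return [
        {**row, **right_index[row[key]]}
        for row in left
        if row[key] in right_index
    ]


# Hypothetical sample data, for illustration only.
transactions = [
    ("A1", False), ("A1", True), ("A2", False), ("A1", False), ("A2", False),
]
loans = [
    {"account_id": "A1", "balance": 1000},
    {"account_id": "A3", "balance": 500},
]

rates = account_fraud_rates(transactions)
rate_rows = [{"account_id": a, "fraud_rate": r} for a, r in rates.items()]
merged = merge_by_key(loans, rate_rows, "account_id")
print(merged)
```

Only accounts present in both datasets survive the inner join, so account A3 (no transactions) and account A2 (no loan record) are dropped from the merged result.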
- Macroeconomic analyst focused on crude oil and metals
- Analyzed economic data such as API, EIA, GDP, PPI, and FOMC releases and predicted commodity trends
- Provided investors with daily and weekly investment strategies
- Summarized other analysts' profits weekly using Excel
- Analyzed and summarized daily economic information and data, and published reports on the company's website
- Added Tableau graphs to the daily report to analyze the relationship between the US interest rate and the month, along with other economic indexes
- Attended daily morning meetings to discuss stock trends
- Wrote stock analysis reports using fundamental and technical analysis, and released stock market information to investors
- Attracted capital into the stock market
- Registered Caitong stock accounts for customers