Analyst Resume
Charlotte, NC
TECHNICAL SKILLS:
Domains: Banking, Insurance, Healthcare, Securities
Programming Languages: Java, C++, R, Scala (UDF), Spark, Spark MLlib, Hive, Python, PySpark
Big data API: Spark, SparkR, Hadoop, MapReduce, Oozie workflow
Data Warehouse Reporting & Integration Tools: SSIS, SSRS
Databases: HDFS, MySQL, SQL, DB2, Oracle
ML models worked on: Random Forest, Decision Tree, KNN, Word2Vec, Neural Network, SVM, Logistic Regression, time series regression model, Information Retrieval, deep learning with NLP, NumPy, pandas, scikit-learn
Data visualization: ggplot, seaborn, matplotlib
Scripting: Scala, Python, R
PROFESSIONAL EXPERIENCE:
Analyst
Confidential, Charlotte, NC
Responsibilities:
- Involved in creating machine learning models in Spark ML to find anomalies in numerical data
- Automated classification of data transformation types using a decision tree classifier. Created a Spark Scala script with UDFs that extract the required features from the data and feed them to the decision tree classifier to train the model, which is then used to automate data transformation classification when capturing data lineage for audits
- Created linear regression models in Spark ML that learn the trend in key amount fields of a given file and use it to predict future values. MSE, RMSE, and R-squared metrics were used to evaluate the prediction model
- Classified customers as good or bad based on features such as FICO score and balance, using KNN, decision tree classifier, one-vs-all, multilayer perceptron, logistic regression, and random forest in Spark ML. F1, precision, and recall were used to evaluate the models
- Created time series regression models to predict customer balance amounts using Holt-Winters and ARIMA in RStudio. The data used in the research came from HDFS
Graduate Research
Confidential, Clemson, South Carolina
Responsibilities:
- Involved in installation, deployment, and performance evaluation of the SparkR interface
- Successfully tested statistical computation in Spark. Fine-tuned the parameters set up in SparkR
