- Experienced Data Scientist
- Experience with Hadoop / Big Data environments such as Spark, Cloudera, and Hortonworks Sandbox, and with Data Mining, Machine Learning & Artificial Intelligence.
- Experience with the Scala programming language on Spark
- Experience with cloud computing on AWS, Azure, etc.
- Hold a B.S. in Mathematics, majoring in Actuarial & Statistical Science, from Rutgers University
- Excellent communication and problem-solving skills.
- US citizen, local and available for an in-person interview.
- I am a detail-oriented professional with over 5 years of experience as a Data Scientist. I am highly proficient in Hadoop / Big Data environments such as Spark, Cloudera, and Hortonworks Sandbox, with over 4 years on these platforms. I am well versed in the Scala programming language on Spark. I hold a B.S. in Mathematics, majoring in Actuarial & Statistical Science, from Rutgers University. I am local and available for an in-person interview with 24 hours' prior notice.
- Experienced, mid-level data scientist with newly acquired skills, an insatiable intellectual curiosity, and the ability to mine hidden gems within large sets of structured, semi-structured, and unstructured data. Able to leverage mathematics, high-level programming, and statistics with visualization, along with a healthy sense of explanation. All combined with:
- 5 years’ experience developing customer models (classification, scoring, ranking, probability estimation & clustering) with R, H2O & Python in Big Data environments.
- 5 years’ experience applying data mining techniques and data science algorithms on Hadoop ecosystem / Big Data environments to public health care and education problems, demonstrating potential resource savings through the capacity to predict upcoming events.
- 5 years’ experience developing public health risk management and stratification models relying on random forest, gradient boosting, logistic regression, and other machine learning algorithms and platforms.
- Robust R, Python and Scala skills
- Experience and expertise in machine-learning-based predictive modeling projects, using models such as Gradient Boosting, Collaborative Filtering, Bayesian Methods, Random Forest, SVM, and Markov Models
- Five years’ experience with R, SAS, WEKA, H2O and Python
- Three years’ experience with the Scala programming language on Spark
- Five years’ experience within Hadoop ecosystem / Big Data environments (Hive, Pig, NoSQL, etc.)
- Five years’ experience creating, building, and delivering reporting dashboards with Tableau, R, SAS and Alteryx
- Strong general programming skills and knowledge in C++, Java, MATLAB and Excel/VBA
- Spark, Cloudera, Hortonworks Sandbox experience
- Strong knowledge and expertise in applied statistics (Regression, Classification, Segmentation, Clustering, Association), Microeconomics (discrete choice modeling) and mathematical optimization
- Strong communication and marketing skills
- Overfitting, modeling, lasso and ridge regression, cross-validation, bootstrapping, bagging, randomization, boosting, multinomial logistic regression, k-NN, k-means, deep neural networks, TensorFlow, DoE, probit, hypothesis testing, ggplot2
- Strong knowledge and expertise with Neural Network frameworks such as Caffe, Theano, TensorFlow, H2O, etc.
- Ability to work with different Machine Learning libraries/frameworks for integration
BIG DATA PLATFORM/SQL
- Created AML POC for:
- Data Gathering Search and Evidence Automation
- Case Selection and Cognitive Automation/Micro-Decisioning
- Narrative Automation/Artificial Intelligence
- Design and implementation of Big Data platform
- Data creation and gathering automation
- Population classification and segmentation automation for better filtering and model creation
- Topic selection and Narrative automation
Team Member as Data Engineer and Data Scientist
- Created web-based applications for data entry and reports
- Converted data from HDFS into structured database form based on client requests (ETL).
- Converted data sets into actionable models to predict and/or analyze spending and purchase habits, budget, population segmentation, and population classification.
- Work stream stage automation
SQL ETL/In-house Illustration platform/Stata/Data Mining
- Provided estimates and predictions for custom portfolio returns and suggested implementations for profitability
- Developed tools for risk calculation based on past portfolio performance