Data Scientist Resume

Sunnyvale, CA

PROFESSIONAL SUMMARY:

  • 7 years of IT experience as an operations research analyst and data scientist,
  • Strong background and experience in data science/data analysis, machine learning, and probability and statistics theory,
  • Expert in mining, wrangling and analyzing complex, high-volume and high-dimensional data from varied sources,
  • Strong coding skills, performing data cleaning and data transformation activities using Python and R,
  • Experience with Big Data ecosystems,
  • Extensively involved in data preparation, exploratory data analysis, feature engineering using supervised/unsupervised modeling,
  • Solid grasp of Linear/Non-linear regression and classification modeling and predictive algorithms in machine learning,
  • Proficient in Regression, Classification and Clustering analysis by using machine learning algorithms,
  • Experienced in building machine learning systems with neural networks and convolutional neural networks using TensorFlow and Keras,
  • Highly motivated self-starter with effective communication and organizational skills.

TECHNICAL SKILLS:

Programming: Python, R, MySQL, PostgreSQL

Tools: NumPy, Pandas, SciPy, scikit-learn, statsmodels, TensorFlow, Keras, NLTK, Seaborn, Matplotlib, Plotly, Cufflinks, Choropleth, SQLAlchemy, PySpark, Hadoop, HDFS

Artificial Intelligence: Statistical Machine Learning Algorithms, Linear and Logistic Regression, Decision Tree, K-Means, k-NN, Support Vector Machine (SVM), Random Forest, XGBoost, CNN, OpenCV, YOLO, Faster R-CNN, RNN, LSTM, NLTK, spaCy

PROFESSIONAL EXPERIENCE:

Confidential

  • Built a CNN from scratch to classify images using Keras and TensorFlow,
  • Used batch normalization to train a more efficient model,
  • Applied data augmentation techniques to increase the training data and create a more robust CNN model,
  • Used dropout and L2 regularization to reduce overfitting,
  • Performed transfer learning with Inception, VGG19, MobileNet and ResNet50 models and compared their accuracy,
  • Developing YOLO and Mask R-CNN models for object detection.

Data Scientist

Confidential, Sunnyvale, CA

  • Performed data wrangling, cleaning and preprocessing to prepare data for analysis,
  • Created new features to build more robust models,
  • Analyzed iCloud upload and download performance for photos, drive and backup services across various countries using problem-specific metrics,
  • Developed K-Modes, Gaussian Mixture Model (GMM) and Agglomerative Hierarchical Clustering models to group countries with similar iCloud performance patterns,
  • Analyzed iCloud customers’ behavior patterns and the relationships between iCloud performance and customer engagement, and between iCloud performance and revenue.

Confidential

  • Analyzed Confidential iCloud data using big data tools such as Hadoop and PySpark,
  • Performed data wrangling, cleaning and preprocessing to prepare data for analysis,
  • Explored new features by using feature engineering techniques,
  • Applied Kruskal-Wallis, Confidential and chi-square statistical tests to identify features important to iCloud performance,
  • Prepared Python scripts for iCloud photos, drive and backup services for deployment in a Hadoop environment,
  • Coordinated with the DevOps team to automate the statistical analysis on a daily basis,
  • Triggered daily emails about iCloud performance in photos, backup, and drive services.

Data Scientist

Confidential, San Jose, CA

  • Developed machine learning and deep learning models to provide industrial solutions,
  • Organized and ran the machine learning pipeline, with a focus on data preparation, model training and optimization; used visualization techniques to inform stakeholders and build intuition,
  • Performed exploratory analysis and feature engineering to fit the best models in Python,
  • Discovered patterns, formulated and tested hypotheses, and translated results into strategies that drove growth, increasing revenue and improving customer satisfaction,
  • Consulted for companies and individuals on their data science problems.

Confidential

  • Explored the factors that influence the forecast of client deposit subscriptions,
  • Engineered features, handled missing values and created new features by combining existing ones,
  • Utilized the Python matplotlib and seaborn libraries for visualization and exploratory data analysis,
  • Applied Confidential and chi-square statistical tests to determine the predictive power of the features and the associations among them,
  • Selected the most significant features to decrease complexity and processing time and to increase the robustness of the machine learning models,
  • Built models including regression, random forest, XGBoost, SVM and neural networks; compared model performance and tuned hyperparameters to improve results.

Confidential

  • Preprocessed data by transforming features, handling missing values, engineering features and creating dummy variables,
  • Explored and visualized the data set using the seaborn and matplotlib libraries,
  • Developed models with algorithms such as Logistic Regression, Support Vector Machine, kNN, Naïve Bayes Classifier, Decision Trees, Random Forest and Neural Network,
  • Applied grid search to the parametric algorithms to find optimal parameters and compared the algorithms’ performance,
  • Ranked in the top 7% of the competition.

Confidential

  • Analyzed patterns between violence and other features such as gender, ethnicity, religion and age,
  • Used statistical techniques for categorical data, such as the chi-square test, Cramér's V, the Confidential test and the t-test, to uncover relationships between variables,
  • Applied advanced visualization techniques using R Shiny,
  • Performed spatial analysis on the data using map applications,
  • After finding no clear pattern, used permutation (resampling) tests to reassess the results.
