Sr. Data Scientist / Assistant Research Professor Resume

PROFESSIONAL SUMMARY:

  • Expert in quantitative data analysis, predictive modeling, data management and interpretation.
  • Proficient in many machine learning algorithms, statistics, physics, and mathematics.
  • Strong experience in data analytics, data engineering, data validation, and data mining.
  • Deep knowledge of big data technology, neural networks, TensorFlow, and NLP.
  • Critical thinker, fast learner, self-motivated, and able to work to deadlines.

COMPUTER SKILLS:

  • Proficient in Python (NumPy, pandas, scikit-learn), SQL, R, C/C++, Perl, and Excel.
  • Strong experience with Unix/Linux shell scripting and working in cloud environments.
  • Experience with Hadoop, Spark, MapReduce, Jupyter, GitHub, HTTP/Apache, SAS, and Tableau.

PROFESSIONAL EXPERIENCE:

Sr. Data Scientist / Assistant Research Professor

Confidential

Responsibilities:

  • Applied many machine learning algorithms (such as decision tree, random forest, GBoost, k-NN, naive Bayes, SVM, logistic regression, and neural networks) for predictive modeling.
  • Applied regression algorithms to accurately predict protein/DNA data quality at the Confidential.
  • Applied many classification algorithms to predict HR disease and banking behavior.
  • Applied Principal Component Analysis (PCA) to simplify visualization of biological data quality.
  • Applied non-linear regression to accurately predict data growth at the Confidential.
  • Applied multivariate linear regression for Marketing Mix Modeling (MMM).
  • Developed a Python module to automatically select the best machine learning algorithm and the best hyperparameters from the scikit-learn library for predictive modeling.
  • Designed and developed software tools for data cleaning, validation, and mining.
  • Developed a new algorithm to detect anomalous data with high accuracy and performance.
  • Created a relational database (MySQL) and developed Python scripts to query data for statistical analysis.
  • Designed and developed the PDB Distro, a web-based statistical tool that calculates univariate data distribution probability, multivariate data correlation, and outliers, providing insight into big data quality and usability.
  • Led a team to develop a Drug Design Data Resource (HR data) pipeline in support of computer-aided drug design and discovery (in collaboration with Novartis, Roche, Johnson & Johnson, and Genentech).
  • Trained biocurators to detect and correct various data errors.

Senior Software Developer / Research Associate

Confidential

Responsibilities:

  • Designed and developed the PDB extract, a user-friendly bioinformatics software tool for unstructured data extraction, collection, integration, and format standardization. This tool is currently used by over 40,000 researchers worldwide.
  • Developed Confidential software and applied a statistical approach to validate the model against the experimental data, thus solving the problem of inaccurate data being used for drug design.
  • Developed the RNAview software to classify nucleic acid base pairs and RNA motifs and to display secondary structure with full hydrogen bond interactions.