Sr. Data Scientist / Assistant Research Professor Resume

PROFESSIONAL SUMMARY:

Expert in quantitative data analysis, predictive modeling, data management and interpretation.
Proficiency with many machine learning algorithms, statistics, physics and mathematics.
Strong experience in data analytics, data engineering, data validation, and data mining.
Deep knowledge of big data technology, neural networks, TensorFlow, and NLP.
Critical thinker, fast learner, self-motivated, able to meet deadlines.

COMPUTER SKILLS:

Proficiency with Python (NumPy, pandas, scikit-learn), SQL, R, C/C++, Perl, Excel.
Strong experience with Unix/Linux shell scripts. Experience working in cloud environments.
Experience in Hadoop, Spark, MapReduce, Jupyter, GitHub, HTTP/Apache, SAS, Tableau.

PROFESSIONAL EXPERIENCE:

Sr. Data Scientist / Assistant Research Professor

Confidential

Responsibilities:

Applied many machine learning algorithms (such as decision trees, random forests, gradient boosting, k-NN, Naive Bayes, SVM, logistic regression, and neural networks) for predictive modeling.
Applied regression algorithms to accurately predict protein/DNA data quality at the Confidential.
Applied many classification algorithms to predict HR disease and banking behavior.
Applied Principal Component Analysis (PCA) to simplify visualization of biological data quality.
Applied non-linear regression to accurately predict data growth at the Confidential.
Applied multivariate linear regression for Marketing Mix Modeling (MMM).
Developed a Python module to automatically select the best machine learning algorithm and the best hyperparameters from the scikit-learn library for predictive modeling.
Designed and developed software tools for data cleaning, validation, and data mining.
Developed a new algorithm to detect anomalous data with high accuracy and performance.
Created a relational database (MySQL) and developed Python scripts to query data for statistical analysis.
Designed and developed the PDB Distro, a web-based statistical tool that calculates univariate data distribution probabilities, multivariate data correlations, and outliers, providing insight into big data quality and usability.
Led a team to develop a Drug Design Data Resource (HR data) pipeline in support of computer-aided drug design and discovery (collaborated with Novartis, Roche, Johnson & Johnson, Genentech).
Trained biocurators to detect and correct various data errors.

Senior Software Developer / Research Associate

Confidential

Responsibilities:

Designed and developed the PDB extract, a user-friendly bioinformatics software tool for unstructured data extraction, collection, integration, and format standardization. This tool is currently used by over 40,000 researchers worldwide.
Developed Confidential software and applied a statistical approach to validate the model against the experimental data, thus solving the problem of inaccurate data being used for drug design.
Developed the software RNAview to classify nucleic acid base pairs and RNA motifs and to display secondary structure with full hydrogen bond interactions.