Sr. Data Scientist Resume
Austin, TX
SUMMARY:
- Building scalable Big Data solutions with Python/Spark; expert in the TensorFlow framework
- Building intelligent applications with machine learning models, on console and in the cloud.
- Deploying Machine Learning models with Docker containers and Apache Server.
TECHNICAL SKILLS:
Machine Learning: Supervised/Unsupervised, Regression, Classification, Deep Learning, Ensemble
APIs: TensorFlow API, Keras, scikit-learn, PyTorch, Spark MLlib, MapReduce
Programming: R, Python, Scala
Database: MySQL, MS SQL, Hadoop File System, NoSQL (with MongoDB)
AWS Cloud: S3, EMR, SageMaker, EC2, Rekognition
Environment: Jupyter Notebook, RStudio, Spark Framework, Apache Airflow, Cron Scheduler
Web Development: .NET, ASP.NET MVC 5, C#, XML, CSS, AJAX, REST/SOAP Web Services
PROFESSIONAL EXPERIENCE:
Sr. Data Scientist
Confidential, Austin, TX
- Utilized TensorFlow APIs to implement feature engineering: discretization, binning, data wrangling, cleaning, transforming, merging and reshaping data frames prior to Deep Learning/ML predictions
- Leveraged pre-trained deep learning models and Natural Language Processing (NLP) to create sentiment analysis, text classification and recommendation-based applications with Spark/Python
- Built fraud detection predictive models (using PySpark MLlib) trained on a distributed system with an array of ensemble models (Random Forest, Gradient Boosting and XGBoost)
- Built Superior Plan recommendation systems for over 6 million committed users. Employed Neural Network Embeddings (and NLP Semantics) with the TensorFlow framework (and on Spark clusters) to recommend similar company products to members.
Sr. Data Scientist
Confidential, Denver, CO
- Collected and synthesized business requirements to create effective machine learning use cases
- Led a team developing highly scalable tools that use machine learning and deep learning algorithms for health-related and financial prediction use cases.
- Supervised deployment of machine learning models in a production-grade environment.
Data Scientist
Confidential, Marlborough, MA
- Built machine learning solutions (in the Microsoft Azure cloud environment) on clinical images to predict subjects at risk of cognitive impairment. Fine-tuned the trained model with a variety of algorithms including Naïve Bayes models, Logistic Regression, Boosted Decision Tree and Random Forest
- Deployed the trained machine learning solutions through Batch Execution Web Services in the Microsoft stack (C#-based ASP.NET MVC 5) and Python API Web Services.
Data Scientist
Confidential, Aurora, CO
- Leveraged machine learning and artificial intelligence tools to predict pediatric patients at risk of arterial thrombosis based on their extracted clinical and demographic markers.
- Performed ad hoc data preprocessing, including outlier detection/removal and elimination of multicollinearity with Principal Component Analysis (PCA), using Python and R scripting
- Evaluated a variety of algorithms including K-means clustering, Naïve Bayes, Logistic Regression, Principal Component Analysis, Decision Trees and Ensemble algorithms (Decision Trees) in scikit-learn (Python), R and SAS environments.
- Provided detailed reports with a variety of tools (Matplotlib, ggplot2 and SAS Reports) to explain classifications.
Sr. Analyst
Confidential
- Collaborated with internal talent acquisition managers to develop quantitative analyses that predict Key Performance Indicators of business processes based on available metrics
- Employed linear and multiple regression for financial forecasting
