
Data Scientist Resume


Auburn Hills, MI

SUMMARY

  • Data scientist with 7 years of experience transforming business requirements into actionable data models, prediction models, and informative reporting solutions.
  • Expert in the entire Data Science process life cycle including Data Acquisition, Data Preparation, Data Manipulation, Feature Engineering, Machine Learning Algorithms, Validation and Visualization.
  • Experience in Machine Learning, Mathematical Modeling and Operations Research. Comfortable with R, Python, MATLAB.
  • Hands-on experience with Machine Learning algorithms such as Linear Regression, Logistic Regression, GLM, CART, SVM, KNN, LDA/QDA, Naive Bayes, Random Forest, XGBoost, and Deep Learning.
  • Experienced in Artificial Neural Networks, CNN, and RNN (LSTM, GRU).
  • Experience with Natural language techniques such as Tokenization, Lemmatization, Stemming, Count Vectorization, TF-IDF Vectorization, and Word2Vec.
  • Proficient in Python and its libraries such as NumPy, Pandas, Scikit-learn, Matplotlib, and Seaborn.
  • Experience working with the deep learning frameworks TensorFlow, Keras, and PyTorch.
  • Experience working on Integrated Development Environments (IDE) like PyCharm, Sublime Text and Eclipse.
  • Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
  • Proficiency in R (e.g. ggplot2, cluster, dplyr, caret), Python (e.g. pandas, Keras, PyTorch, NumPy, scikit-learn, bokeh, NLTK), Spark MLlib, H2O, and other statistical tools.
  • Worked on integration of diverse mathematical and statistical procedures, pattern recognition, and model building.
  • Knowledge and experience in agile environments such as Scrum and version control tools such as GitHub/Git.
  • Collaborated with data engineers to implement ETL processes; wrote and optimized SQL queries to extract data from cloud sources and merge it with Oracle data.
  • Experienced in Big Data with Hadoop, HDFS, MapReduce, and Spark.
  • Hands-on experience importing and exporting data with relational databases, including MySQL and MS SQL Server, and NoSQL databases like MongoDB.
  • Good team player and quick learner; highly self-motivated with good communication and interpersonal skills.
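The NLP vectorization techniques listed above (TF-IDF in particular) can be sketched in a few lines with scikit-learn. This is an illustrative toy example; the corpus and settings are invented for demonstration and are not from any actual project.

```python
# Minimal TF-IDF sketch with scikit-learn (toy corpus, default settings).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "data science turns raw data into insight",
    "machine learning models learn patterns from data",
    "deep learning is a subset of machine learning",
]

# Fit the vectorizer and transform the corpus into a sparse TF-IDF matrix:
# one row per document, one column per vocabulary term.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(corpus)

print(tfidf.shape)
print(sorted(vectorizer.vocabulary_)[:5])
```

Count vectorization works the same way with `CountVectorizer`; Word2Vec would instead come from a library such as Gensim.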

TECHNICAL SKILLS

Programming Languages: Python, R, C, C++, MATLAB

ML Algorithms: Linear Regression, Logistic Regression, Principal Component Analysis (PCA), K-Means, Random Forest, Decision Trees, SVM, K-NN, Deep Learning (CNN, RNN), and Ensemble Methods

Python Packages: Scikit-learn, NumPy, Pandas, Keras, NLTK, Matplotlib, Seaborn, SciPy

Deep Learning Frameworks: TensorFlow, Keras, PyTorch

Big Data Ecosystems: Hadoop, Spark

Database Systems: SQL, MongoDB

Operating System: Linux, Windows, Unix

PROFESSIONAL EXPERIENCE

Confidential - Auburn Hills, MI

Data Scientist

Responsibilities:

  • Responsible for data identification, collection, exploration, and cleaning for modeling; participated in model development for a buyback program.
  • Developed models using Random Forests and XGBoost.
  • Used Pandas, NumPy, SciPy, Matplotlib, Scikit-learn, and NLTK in Python to develop various machine learning algorithms.
  • Tuned models built with Bayes Point, logistic regression, decision tree, and neural network algorithms for good accuracy; deployed prediction models and tested them on held-out test data.
  • Visualized, interpreted, and reported findings, and developed strategic uses of data with Python libraries such as NumPy, Scikit-learn, Matplotlib, and Seaborn.
  • Used Python 3.x (NumPy, SciPy, pandas, Scikit-learn, seaborn) and R (caret, trees, arules) to develop a variety of models and algorithms for analytic purposes.
  • Performed data cleaning, feature scaling, and feature engineering.
  • Performed missing-value treatment, outlier capping, and anomaly treatment using statistical methods, and derived customized key metrics.
  • Performed analysis using industry leading text mining, data mining, and analytical tools and open source software.
  • Applied natural language processing (NLP) methods to data to extract structured information.
  • Implemented deep learning algorithms such as Artificial Neural Networks (ANN) and Recurrent Neural Networks (RNN), tuned hyperparameters, and improved models using the TensorFlow Python package.
  • Evaluated models using Cross Validation, ROC curves and used AUC for feature selection.
  • Created dummy variables for certain datasets to feed into the regression.
  • Created data pipelines using big data technologies such as Hadoop and Spark.
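The evaluation workflow described above (cross validation with ROC/AUC scoring) can be sketched as follows. Synthetic data stands in here for the confidential dataset, and the model and parameters are illustrative choices, not the actual production configuration.

```python
# Hedged sketch: 5-fold cross-validated AUC for a random forest classifier
# on synthetic data, mirroring the cross validation / ROC / AUC evaluation
# described in the bullets above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real (confidential) modeling dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)

# Each of the 5 folds is held out once and scored by area under the ROC curve.
auc_scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(auc_scores.mean())
```

An informative classifier should score well above the 0.5 AUC of random guessing on this separable synthetic data.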

Environment: Python, R, Deep Learning, NLP, TensorFlow, Machine Learning, ROC, Hadoop, AUC, SQL, Spark, MongoDB

Confidential - Chicago, IL

Data Scientist

Responsibilities:

  • Participated in all phases of data mining, data collection, data cleaning, developing models, validation, and visualization to deliver data science solutions
  • Gathered requirements from the FIU (Financial Intelligence Unit) and regulators.
  • Developed advanced analytical models and computational solutions using large-scale data manipulation and transformation, statistical analysis, machine learning, visualization.
  • Generated reports to meet regulatory requirements.
  • Worked on data cleaning and ensured data quality, consistency, integrity using Pandas, NumPy.
  • Used deep learning frameworks such as MXNet, Caffe2, TensorFlow, Theano, CNTK, and Keras to help customers build DL models.
  • Created statistical models using distributed and standalone approaches to build diagnostic, predictive, and prescriptive solutions.
  • Performed data imputation using Scikit-learn package in Python.
  • Used RMSE/MSE to evaluate different models' performance.
  • Created charts such as heat maps, bar charts, and line charts.
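The imputation and RMSE-evaluation steps listed above can be illustrated briefly. The numbers below are toy values chosen only to show the mechanics, not data from the engagement.

```python
# Hedged sketch: mean imputation with scikit-learn's SimpleImputer, plus an
# RMSE computation in NumPy, as mentioned in the bullets above (toy data).
import numpy as np
from sklearn.impute import SimpleImputer

# Two-feature toy matrix with missing entries.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])

# Replace each NaN with its column mean (SimpleImputer's default strategy).
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

# RMSE between toy predictions and actual values.
y_true = np.array([3.0, 5.0, 4.0])
y_pred = np.array([2.5, 5.5, 4.0])
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(X_filled)
print(rmse)
```

Here the NaN in the first column becomes the column mean of 1.0 and 7.0, i.e. 4.0.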

Environment: Python, NumPy, Pandas, Predictive models, Scikit-learn

Confidential

Data Scientist

Responsibilities:

  • Collaborated with data engineers and operation team to implement ETL process, wrote and optimized SQL queries to perform data extraction to fit the analytical requirements.
  • Experimented with and built predictive models, including ensemble methods such as gradient-boosted trees and neural networks, to predict sales amounts.
  • Explored and analyzed the customer specific features by using Matplotlib, Seaborn in Python and dashboards in Tableau.
  • Participated in feature engineering, including feature generation, PCA, feature normalization, and label encoding with Scikit-learn preprocessing.
  • Used RMSE/MSE to evaluate different models' performance.
  • Worked closely with internal stakeholders such as business teams, product managers, engineering teams, and partner teams.
  • Designed database solution for applications, including all required database design components and artifacts.
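The feature-engineering steps named above (normalization, PCA, label encoding with Scikit-learn preprocessing) can be sketched on toy data. The values and the choice of two principal components are illustrative assumptions.

```python
# Hedged sketch: feature normalization + PCA in a pipeline, plus label
# encoding, using Scikit-learn preprocessing as listed in the bullets above.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy feature matrix with features on very different scales.
X = np.array([[1.0, 200.0, 3.0],
              [2.0, 180.0, 2.5],
              [3.0, 240.0, 3.5],
              [4.0, 210.0, 2.0]])

# Scale each feature to zero mean / unit variance, then project onto the
# top 2 principal components.
pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
X_reduced = pipeline.fit_transform(X)

# Encode string category labels as integers (alphabetical class order).
encoder = LabelEncoder()
labels = encoder.fit_transform(["low", "high", "high", "low"])

print(X_reduced.shape)
print(labels)
```

Scaling before PCA matters: without it, the large-magnitude second feature would dominate the principal components.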

Environment: Python, Matplotlib, Seaborn, Tableau, NumPy, Pandas, ETL, SQL, Predictive models, Scikit-learn

Confidential

Data Scientist

Responsibilities:

  • Responsible for reporting findings that use gathered metrics to infer and draw logical conclusions from past behavior.
  • Wrote SQL queries to pull historical data from Policy and Claim Center.
  • Collected data needs and requirements by interacting with other departments.
  • Created various types of data visualizations using Python.
  • Communicated the results with operations team for taking best decisions.
  • Performed data wrangling and analysis in Python for statistical models such as a retention model.
  • Used Logistic Regression, Random Forest, Decision Tree, and SVM to build predictive models.
  • Created reports and presentations to communicate insights to both technical and non-technical audiences.
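A logistic-regression predictive model of the kind named above can be sketched as follows; synthetic data stands in for the retention dataset, and the split ratio and solver settings are illustrative choices.

```python
# Hedged sketch: train/test split + logistic regression, the kind of
# predictive-modeling workflow described in the bullets above (synthetic data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a retention-style dataset.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Held-out accuracy; predict_proba would give the class probabilities
# that typically drive retention decisions.
accuracy = model.score(X_test, y_test)
print(accuracy)
```

Swapping in `RandomForestClassifier`, `DecisionTreeClassifier`, or `SVC` follows the same fit/score pattern.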

Environment: Python, SQL, Retention model, Logistic Regression, Random Forest, Decision Tree
