Machine Learning Engineer Resume
RTP, NC
SUMMARY:
Over 3 years of experience as a Data Scientist, along with 7 years in IT performing data-driven analysis.
TECHNICAL SKILLS:
Machine Learning: Regression analysis, Classification analysis, K-NN, Decision Tree, K-Means clustering, Support Vector Machine (SVM), Bagging Algorithms (Random Forest), Boosting Algorithms (AdaBoost, GradientBoost, XGBoost, and CatBoost), AutoML and genetic-programming tools (Auto ML, H2O, TPOT), Deep Learning (CNN, RNN, ANN), Transfer Learning (ULMFiT, BERT, XLNet).
Python Libraries: NumPy, Pandas, SciPy, scikit-learn, Seaborn, PySpark, statsmodels, TensorFlow, Keras, Matplotlib, BS4, Selenium, pymongo, etc.
Programming: Python, SQL (MySQL, PostgreSQL), MongoDB, Hadoop/Hive, PySpark, Git, Flask.
Cloud Services: AWS (EC2, S3), Docker.
Reporting & Visualization Tools: Seaborn, Matplotlib, Plotly, Tableau.
NLP Tools: Count-Vect, Hash-Vect, LDA modeling, TF-IDF, Word2Vec, Doc2Vec, BLEU.
Statistics: Hypothesis Testing, Chi-Square, Confidence Intervals, Principal Component Analysis (PCA), Cross-Validation, Correlation.
EXPERIENCE:
Machine Learning Engineer
Confidential, RTP, NC
Responsibilities:
- Applied advanced machine learning and deep learning algorithms and methods in 5 different projects.
- Implemented data cleaning, feature engineering, pattern recognition, recommender systems, and full-stack Python clustering code.
- Applied a wide range of machine learning methods: supervised and unsupervised learning, gradient boosting and AutoML tools (CatBoost, H2O, Auto ML, etc.), deep learning, PySpark ALS, transfer learning, and clustering algorithms.
- Used data visualization tools (Matplotlib, Seaborn, Plotly, and Tableau) to create dashboards and presentations.
Confidential
Responsibilities:
- Worked on the Confidential Classic Confidential project by applying machine learning algorithms.
- Ran advanced queries to fetch data from Oracle Database and Hive.
- Performed feature engineering, including handling missing values and creating new features.
- Used boosting and bagging algorithms to find patterns among the features.
- Applied genetic algorithms and PySpark to generate predictions.
Confidential
Responsibilities:
- Analyzed Confidential Business Critical Services and created an ML algorithm.
- Analyzed project profitability with Confidential customers.
- Performed feature engineering, including handling missing values and creating new features.
- Performed in-depth analysis of Confidential using Python visualization libraries and Tableau, and presented the results.
- Applied machine learning algorithms to create a model; CatBoost achieved an R² of 0.94.
- Applied multinomial classification to predict the manager based on project profitability, with an accuracy score of 0.75.
Confidential
Responsibilities:
- Recommended best-selling products to customers by applying collaborative filtering with the PySpark ALS method.
- Performed feature engineering, including handling missing values and creating new features.
- Performed in-depth analysis of Confidential products, customers, and sales patterns using Python visualization libraries and Tableau, and presented the results.
- Applied the PySpark Alternating Least Squares (ALS) method to create the model and achieved an RMSE of 0.16.
Confidential
Responsibilities:
- Performed feature engineering, including handling missing values and creating new features.
- Applied advanced NLP techniques for text normalization and vector representation.
- Applied deep learning and transfer learning techniques to develop the model.
- Created a Rasa chatbot as an alternative solution.
- As a proof of concept, compared the domain classifier’s performance against the Rasa chatbot.
- Created a Flask API and a Docker container for the domain classifier and pushed them to Bitbucket.
Confidential
Responsibilities:
- Created an end-to-end prototype that fetched data from MongoDB, filtered outliers, applied unconventional clustering techniques to create cluster fingerprints, calculated the semantic meaning of SRs and clusters, and stored the results back in MongoDB.
- Created Flask APIs and full-stack Python code to cluster the SR fingerprints.
- Analyzed a third-party product and presented the results to decision makers to inform whether to renew the $1M contract.
- Created evaluation metrics and methods for different NLP systems and analyzed their performance using semantic meaning as well as n-gram precision techniques.
Data Scientist
Confidential, Cedar Park, TX
Responsibilities:
- Developed machine learning models to provide industrial solutions in construction and marketing.
- Implemented data models and algorithms to develop machine learning solutions.
- Organized and ran an end-to-end machine learning pipeline covering data acquisition, data preparation, model training and optimization, visualization to inform stakeholders, and model deployment.
- Analyzed customer feedback for classification using NLTK, TF-IDF, Count-Vect, and Hash-Vect NLP techniques.
- Worked in big data ecosystems such as Hadoop/Spark for data import, cleaning, wrangling, and machine learning model implementation; experienced in data modeling, SQL, and databases.
- Consulted for companies and individuals on their data science problems.
Confidential
Responsibilities:
- Ran advanced queries to fetch data from the database.
- Performed feature engineering: handled missing values and created new features using APIs, web scraping, and combinations of existing features.
- Used the Python Matplotlib and Seaborn libraries for visualization.
- Applied statistical tests to determine the predictive power of, and associations among, the features.
- Selected the most significant features to decrease complexity and processing time and increase the robustness of machine learning models in R and Python.
- Created different models including Regression, Random Forest, GradientBoost, and XGBoost, as well as AutoML and boosting tools: Auto ML (0.92 R²), TPOT, H2O (0.90 R²), and CatBoost (0.88 R², model deployed).
Confidential
Responsibilities:
- Conducted sentiment analysis on customer reviews with Python, based on the rating scale.
- Performed feature engineering: handled missing values and created a few features by combining existing ones and applying NLTK, TF-IDF, and Count-Vect.
- Performed EDA, including word clouds, for better visualization.
- Created different models including Logistic Regression, Random Forest, Naïve Bayes, Gradient Boost, and XGBoost.
- Applied SMOTE oversampling to increase the number of minority-class data points.
- Applied PCA and SVD to reduce the dimensionality of the data set.
- Used CatBoost and TPOT (increased precision for the minority class by up to 2.5 times, reaching 0.69).
- Compared model performance in terms of precision, recall, and response time.
Confidential
Responsibilities:
- Performed feature engineering, handled missing values, and created advanced visualizations of imbalanced data.
- Created dummy variables for the non-numerical values in the data set.
- Applied machine learning algorithms (Logistic Regression and Random Forest Classifier) with hyperparameter tuning.
- Applied the Synthetic Minority Oversampling Technique (SMOTE) to the imbalanced data (0.86 recall with only 18 positive samples).
Confidential
Responsibilities:
- Acquired the data by scraping the biggest vehicle listing website in Turkey with Python Selenium (~200K vehicles).
- Performed feature engineering, feature reformatting, and scaling.
- Applied advanced visualization techniques, and one-hot encoded the non-numerical categorical features.
- Created different models such as Linear Regression (regularized with Elastic Net), Decision Tree, Random Forest, AdaBoost, and Gradient Boost (91% R-squared with Random Forest).
Confidential, Austin, TX
Responsibilities:
- Created different models such as Linear Regression (regularized with Lasso), statsmodels, Decision Tree, Random Forest, AdaBoost, and Gradient Boost (90% R-squared for Gradient Boost).
Maritime Data Analyst
Confidential
Responsibilities:
- Analyzed the options (multi-criteria decision optimization, etc.) for Confidential initiatives for North African countries, which led to the creation of Expeditionary Training Teams for North African countries' naval forces to enhance security in the Mediterranean Sea.
- Contributed to the planning team as syndicate leader and data analyst, participated in meetings at every level, and prepared and presented on-demand reports (Best Courses of Action) to decision makers to ensure appropriate measures were taken.
- Analyzed all dimensions of the Operational Area for Confidential Operations in Afghanistan and optimized 80% of the planning data for the planning staff.
Data Analyst/Operations Research Analyst
Confidential
Responsibilities:
- Held roles at different stages, including Modelling and Simulation Software Engineer, Operations Research Analyst, and Chief Data Analyst.
- Took on numerous projects in data analysis, optimization, and simulation & modelling.
- Helped establish a new Operations/Training Assessment and Evaluation system for Naval Forces to analyze the operations/training cycles of more than 100 warships, contributing to increasing the effectiveness of Naval Forces up to 91%.
- Used geospatial data to visualize warships’ sailing patterns to optimize radar/sonar coverage.
- Contributed to the Battle Readiness Evaluation System by creating a new reporting mechanism for each platfor