Senior Associate Analyst/Machine Learning Engineer Resume
New York, NY
SUMMARY
- Professional Data Scientist with 2+ years of experience in Data Science and Analytics, including Machine Learning, Deep Learning, Data Mining, and Statistical Analysis.
- Involved in all phases of the data science project life cycle, including data extraction, data cleaning, statistical modeling, and data visualization, with large structured and unstructured data sets.
- Experienced with machine learning algorithms such as logistic regression, ensemble methods, XGBoost, KNN, SVM, neural networks, and linear regression, and with clustering algorithms such as k-means and DBSCAN.
- Worked on recommendation systems using collaborative filtering and content-based filtering.
- Experienced with Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs).
- Experienced in applying machine learning and deep learning techniques to build models and analyze large-scale data.
- Experienced in Big Data with Hadoop, HDFS, MapReduce, Spark, Hive etc.
- Strong skills in linear algebra, probability theory, and calculus.
- Strong skills in statistical methodologies such as A/B testing, experiment design, hypothesis testing, and ANOVA.
- Solid ability to write and optimize diverse SQL queries; working knowledge of RDBMSs such as MySQL and NoSQL databases such as MongoDB, HBase, and Cassandra.
- Experience in implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.X, R 3.0 (ggplot2, caret), and matplotlib.
- Experienced in data structures and algorithms, such as linked lists, trees, and graphs.
- Developed API libraries and coded business logic using Node.js and Python; designed web pages using PHP, Python, Django, Flask, HTML, AJAX, AngularJS, and Node.js.
- Experience in building models with deep learning frameworks such as TensorFlow, Keras, and PyTorch.
- Experience in using optimization techniques such as Gradient Descent, Stochastic Gradient Descent, Adam, Adadelta, RMSprop, and Adagrad.
- Experience with visualization tools such as Tableau for creating dashboards.
- Experienced with AWS cloud services: EC2, EMR, RDS, Redshift, SageMaker, etc.
TECHNICAL SKILLS
Languages: Python, C, C++, Java, Node.js, AngularJS, etc.
Databases: MySQL, MongoDB, Cassandra, Oracle
ML Toolbox: Scikit-learn, NumPy, Pandas, XGBoost, SciPy, NLTK, Matplotlib, Seaborn, Tableau, Keras, TensorFlow, PyTorch etc.
Big Data technologies: Spark, Hadoop, Hive, MapReduce, PySpark etc.
Techniques: Logistic Regression, Linear Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors, Random Forests, Gradient Boosted Decision Trees, Stacking Classifiers, Cascading Models, Naive Bayes, K-Means Clustering, Hierarchical and DBSCAN Clustering, PCA, t-SNE, Data Standardization, L1 and L2 Regularization, Loss Minimization, Hyperparameter Tuning, Performance Measurements, Feature Engineering, Matrix Factorization, Model Calibration, Productionizing Models, A/B Testing, Hypothesis Testing, Cross Validation.
Deep Learning: Artificial Neural Networks, CNNs, Multi-Layer Perceptrons, RNNs, LSTM, GRU, Softmax Classifier, Backpropagation, Activations, Dropout, Optimization, Vanishing and Exploding Gradients, Optimized Weight Initializations, Gradient Monitoring and Clipping, Batch Normalization.
PROFESSIONAL EXPERIENCE
Confidential, New York, NY
Senior Associate Analyst/Machine Learning Engineer
Responsibilities:
- Worked on imbalanced datasets, using appropriate metrics such as precision and recall.
- Participated in all phases of the machine learning model life cycle, including data collection, data cleaning, data analysis, model development, validation, visualization, and deployment.
- Implemented classification using supervised algorithms such as logistic regression, SVM, decision trees, KNN, and Naive Bayes.
- Applied deep learning algorithms to knowledge graphs to classify record types.
- Performed feature engineering, including feature intersection generation, feature normalization, and label encoding, with Scikit-learn preprocessing.
- Used pandas, NumPy, Seaborn, SciPy, matplotlib, Scikit-learn, GloVe, and NLTK in Python to develop various machine learning algorithms.
- Worked with the MarkLogic database to query the knowledge graphs.
- Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce concepts.
- Developed user interfaces for information management with AngularJS and Node.js.
- Used grid search and random search to find the best hyperparameters, and K-fold cross-validation to train and validate the models.
- Addressed overfitting with regularization methods such as L1 and L2, and with dropout in neural networks.
- Worked with different performance metrics such as F1 score, precision, recall, log-loss, accuracy, and AUC.
- Developed low-latency applications and interpretable models using machine learning algorithms.
- Experienced with computer vision applications using Convolutional Neural Networks (CNNs) with frameworks such as Keras and TensorFlow.
- Interacted with business analysts and data architects to understand business needs and functionality for various project solutions.
Confidential
Data Scientist
Responsibilities:
- Developed production recommender systems for personalized push notifications using classical machine learning algorithms.
- Performed data cleaning, feature selection, and modeling using the MLlib package in PySpark.
- Worked extensively with the PySpark structured API, PySpark SQL, and aggregations in PySpark.
- Developed text mining algorithms in Spark to analyze large structured and unstructured data sets using Latent Dirichlet Allocation, text classification, and sentiment analysis.
- Used cosine similarity and Pearson correlation to find similar items and recommend the top-n items to users.
- Developed a sentiment analysis model to determine user sentiment from user comments using machine learning algorithms and deep learning RNNs.
- Worked with text feature engineering techniques such as n-grams, TF-IDF, and word2vec.
- Performed univariate and multivariate analysis on the data to identify underlying patterns and associations between variables.
- Addressed overfitting with regularization methods such as L1 and L2, and with dropout in neural networks.
- Used Principal Component Analysis (PCA) and t-SNE in feature engineering to analyze high-dimensional data.
- Experienced in Big Data with Hadoop, HDFS, MapReduce, and Spark.
- Extracted data from HDFS using Hive; performed data analysis and feature selection using Spark with Scala, PySpark, and Redshift; and created nonparametric models in Spark.
- Used clustering techniques such as DBSCAN, K-means, K-means++, and hierarchical clustering for customer segmentation.