Machine Learning Engineer Resume
Montreal, QC
SUMMARY
- 4+ years of IT experience as a Machine Learning Engineer/Data Scientist, with a Bachelor’s degree in Electronics and Communication Engineering and hands-on experience applying machine learning models and statistical techniques to uncover insights and drive business solutions.
- Proficient in statistical inference, predictive model selection and algorithm evaluation.
- Experience working with statistical distributions such as the Binomial, Normal, Poisson, t, and Bernoulli distributions, as well as sampling and frequency distributions.
- Experience in statistical data analysis methods such as hypothesis testing, Chi-square tests, and t-tests; dimensionality reduction methods such as PCA and LDA; and feature selection methods.
- Performed preliminary data analysis using descriptive statistics and handled anomalies such as removing duplicates, imputing missing values and dealing with outliers.
- Experience in Marketing Analytics, Credit Risk Analytics, Fraud Analytics, Customer Segmentation and Profiling, customer churn modelling and Recommender Systems.
- Hands-on experience implementing Naïve Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering, Neural Networks, Principal Component Analysis, and Recommender Systems; good knowledge of Association Rule Mining.
- Analyzed patterns in customer shopping habits across different locations, categories, and months using time series modelling techniques.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems, and generating data visualizations using R, Python, and Tableau.
- Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-learn, and NLTK in Python for developing various machine learning algorithms.
- Hands-on experience with R packages and libraries such as caret, ggplot2, dplyr, magrittr, Hmisc, e1071, ROSE, epiR, and ggvis.
- Experience extracting data to create value-added datasets using Python, R, and SQL, analyzing the behaviour of specific customer segments to uncover hidden insights in the data and meet the project objectives.
- Imported data from various data sources, performed transformations using Hive, MapReduce, and HBase, and loaded the data into HDFS.
- Experienced in SQL and PL/SQL packages, functions, stored procedures, triggers, and materialized views for implementing business logic in Oracle databases.
- Expert in querying structured and unstructured data, with the ability to gather insights from data.
TECHNICAL SKILLS
Statistical Software: R, Python
Databases: Oracle 12c/11g.
Programming and Scripting Languages: R, Python (NumPy, Pandas, SciPy, Seaborn, Scikit-learn, NLTK, Matplotlib), C, C++, Basic Java, HTML, PHP using XAMPP, SQL, SQL*Plus, Linux commands
Statistical Methods: Hypothesis Testing, Confidence Intervals, Correlation and Covariance, Paired and Unpaired Sample Tests, PCA and LDA
Machine Learning Algorithms: Regression and Classification models, Decision Trees, Random Forest, KNN, Clustering (K-means), SVM, Bayesian Algorithms, Social Media Analytics, Sentiment Analysis, Association Rule Mining, Time Series Analysis, Market Basket Analysis, Bagging, Boosting, Artificial Neural Networks.
PROFESSIONAL EXPERIENCE
Machine Learning Engineer
Confidential, Montreal, QC
Responsibilities:
- Performed data cleaning and imputation of missing values using data pre-processing techniques.
- Worked on customer segmentation of the customer database to improve personalized marketing using an unsupervised learning technique, K-means clustering.
- Identified Type I and Type II error rates to improve a logistic regression model for predicting customer churn.
- Optimized customer retention strategies using the results from churn models and identified which factors attract current customers.
- Developed different machine learning algorithms for customer insight, targeted marketing, risk management (e.g. predicting equipment failure), and revenue management.
- Researched and tackled open-ended data science questions in R using statistical methods and machine learning algorithms such as Regression Analysis, Clustering, Support Vector Machines, Decision Trees, and Neural Networks.
- Performed A/B testing on online survey data to compare different products.
- Computed accuracy, precision, sensitivity, specificity, and odds ratio from the confusion matrix.
- Worked closely with the marketing team to deliver actionable insights from large volumes of data coming from different marketing campaigns and customer interaction metrics such as web portal usage, email campaign responses, public site interaction, and customer-specific parameters.
- Hands-on experience creating pie charts, bar charts, box plots, histograms, and scatter plots for statistical analysis using R and the R package ggplot2.
- Applied sampling and resampling techniques to improve the accuracy of prediction models.
- Basic understanding of Apache Spark concepts; loaded data into HDFS using PySpark and performed RDD actions and transformations on the data.
Environment: R, Clustering, Linear and Logistic Regression Analysis, Singular Value Decomposition, Support Vector Machine, Neural Networks, Oracle, Tableau, and Apache Spark.
Machine Learning Engineer
Confidential
Responsibilities:
- Performed data analysis, visualization, feature extraction, feature selection, and feature engineering using Python and R.
- Developed personalized product recommendations with machine learning algorithms, including Collaborative Filtering and Gradient Boosting Trees, to better meet the needs of existing customers and acquire new customers.
- Performed Exploratory Data Analysis and Principal Component Analysis on noisy information scraped from the internet, and presented findings by developing an interactive application for data visualization.
- Evaluated models using Cross Validation, bootstrapping and ROC curves.
- Experience in normalization, regularization, classification, model optimization, and hyper-parameter tuning.
- Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
- Experience using Hadoop (Pig and Hive) for basic analysis and extraction of data in the infrastructure to provide data summarization.
- Experience in using visualization tools like Tableau, ggplot2 and matplotlib for creating dashboards.
- Good understanding of AWS (Amazon Web Services) S3, EC2, VPC, load balancing, auto scaling, and DynamoDB, and of AWS machine learning concepts.
- Good knowledge of NLP techniques such as text preprocessing, topic modelling (LDA, LSI), and text generation using Markov Chain Models.
Environment: R, Python, Decision Tree, Random Forest, Logistic Regression, Naïve Bayes, Recommendation Systems, Gradient Boosting Tree, Principal Component Analysis, Linux, SQL, Hive, Pig, AWS, ggplot2, and Tableau.
Data Analyst
Confidential
Responsibilities:
- Understanding of non-traditional systems of Big Data and NoSQL and of various data source/dataset types.
- Experimented with, built, evaluated, and optimized models, contributing to building Data Science and Analytics
- Worked with the Python SciPy and NumPy libraries to perform statistical analysis.
- Applied data transformation methods for rescaling and normalizing variables.
- Extracted the source data from Oracle tables, MS SQL Server, sequential files, and Excel sheets.
- Good knowledge of Hadoop architecture and its components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and Secondary Name Node, and of MapReduce concepts.
- Identified and executed process improvements; hands-on with various technologies such as Oracle, Informatica, and Business Objects.
- Designed the prototype of the data mart, documented its possible outcomes for end users, and was involved in business process modelling using UML.
- Identified outliers and analysed and interpreted trends and patterns in complex data sets.
- Extracted data from various database sources such as Oracle, SQL Server, and DB2, and regularly used JIRA and other internal issue trackers for project development.
- Performed customer segmentation and characterization to predict behaviour using clustering techniques, analyzing promoters and detractors (defined using Net Promoter Score).
- Utilized domain knowledge and application portfolio knowledge to play a key role in defining the future state of large business technology programs.
- Worked on fraud types such as Superimposed Fraud, Subscription Fraud, Technical Fraud, Internal Fraud, and Social Engineering.
Environment: R, Python, Clustering, Regression Analysis, Singular Value Decomposition (SVD), HDFS, HBase, Hive, Pig, Oracle, Informatica, MDM, Business Objects, and Tableau.