Machine Learning / Data Scientist Resume
Parsippany, NJ
PROFESSIONAL SUMMARY:
- Over 5 years’ experience in machine learning / data science for the Automobile, Electronics, Retail, and Manufacturing industries. Developed deep learning / machine learning software using Artificial Neural Networks (ANN), Support Vector Machines (SVM), Linear and Logistic Regression methods, etc. for various applications, including speech recognition with deep learning Long Short-Term Memory (LSTM) models. Also working on benchmarking various deep learning code for image processing, voice recognition, mechanical equipment sensor data processing, etc.
- Experience with large datasets of structured and unstructured data, data visualization, data acquisition, predictive modeling, NLP / NLU / NLG / AI / Machine Learning / Deep Learning / Computer Vision / Probabilistic Graphical Models / Inferential Statistics / Graph / Apache Spark / Data Validation.
- Experience in deep learning Convolutional Neural Network (CNN) based image processing.
- Experience with deep learning LSTM and RNN based speech recognition using TensorFlow.
- Experience in data mining algorithms and approaches with good design techniques.
- Experience in data preprocessing, developing different statistical machine learning models and data mining solutions for various business problems, generating data visualizations using Python, R, Tableau, and Microsoft Power BI, and version control with Git.
- Involved in all phases of the project life cycle, including data acquisition (sampling methods: SRS/stratified/cluster/systematic/multistage), power analysis, A/B testing, hypothesis testing, EDA (univariate & multivariate analysis), data cleaning, data imputation (outlier detection via chi-square detection, residual analysis, multivariate outlier detection), data transformation, feature scaling, feature engineering, statistical modeling both linear and nonlinear, dimensionality reduction using Factor Analysis, testing and validation using ROC plots, K-fold cross validation, statistical significance testing, and data visualization.
- Experience with Python, NumPy, Scikit-Learn, gensim, NLTK, TensorFlow, Keras.
- Experience in Machine Learning, Statistics, and Regression (Linear, Logistic, Poisson, Binomial).
- Highly skilled in designing visualizations using Tableau software and Storyline on web and desktop platforms, publishing and presenting dashboards.
- Proficient in Machine Learning techniques (LDA, Decision Trees, Linear Regression, Logistic Regression, Random Forest, SVM, Bayesian, XGBoost, K-Nearest Neighbors, Clustering), Deep Learning techniques (CNNs, RNNs), and Statistical Modeling in Forecasting / Predictive Analytics, Segmentation methodologies, Regression-based models, and Ensembles.
- Analyzed data using R and Hadoop, and queried structured and unstructured databases.
- Strong programming expertise in Python and strong SQL database skills.
- Extracted data from various database sources such as Oracle and SQL Server.
- Solid coding and engineering skills in Machine Learning.
- Experience with file systems, server architectures, databases, SQL, and data movement (ETL).
- Proficient in Python; experience building and productionizing end-to-end systems.
- Knowledge of Information Extraction, NLP algorithms coupled with Deep Learning.
TECHNICAL SKILLS:
Machine Learning, Deep Learning, Computer Vision, Reinforcement Learning, Data Science, Splunk ML Toolkit, Alteryx Machine Learning, Python, TensorFlow, pandas, Keras, Theano, Octave, PyTorch, NumPy, SciPy, OpenCV, Point Cloud Library, YOLO, FaceNet, SQL, Matplotlib, Seaborn, OpenAI Gym, AWS, R, Azure ML, Driverless AI, MATLAB, Scikit-learn, project engineering, Spark, Linux, Pthreads, OpenMP
EXPERIENCE:
Confidential, Parsippany, NJ
Machine Learning / Data Scientist
Responsibilities:
- Recommended workflow, equipment, and process changes based on 6S and lean principles. Improved vendor performance by creating a scorecard, accelerated material flow by proposing a statistical sampling plan and SOP, increased storage & processing capacity by re-slotting, and predicted demand volume using SQL queries and deep learning algorithms. Together, these changes boosted productivity by 13.2%.
- Worked with different data formats such as CSV, JSON, and XML, and applied Machine Learning algorithms in Python and Deep Learning techniques such as LSTM.
- Set up storage and data analysis tools in Amazon Web Services cloud computing infrastructure.
- Used pandas, NumPy, seaborn, SciPy, matplotlib, scikit-learn, NLTK in Python for developing various machine learning algorithms.
- Used techniques in NLP like Noise Removal, Lemmatization, Stemming, POS Tagging, Bag of Words, Topic Modelling, TF-IDF, word2vec.
- Developed and deployed applications using Flask.
- Worked with Data Architects and IT Architects to understand the movement of data and its storage.
- Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python, and built models using deep learning frameworks.
- Implemented various machine learning algorithms and statistical modeling techniques such as Decision Tree, Text Analytics, Sentiment Analysis, Naive Bayes, Logistic Regression, and Linear Regression using Python, and determined the accuracy rate of each model.
- Implemented Agile Methodology for building an internal application.
- Extracted the source data from Oracle tables, MS SQL Server, sequential files, and Excel sheets.
- Developed a Data Dictionary to create metadata reports for technical and business purposes.
- Developed MapReduce/Spark Python modules for machine learning & predictive analytics in Hadoop on AWS.
- Extracted data from HDFS and prepared it for exploratory analysis through data munging.
- Developed and established the near miss incident management process and the control tools for domestic and international merchandise discrepancy management. Conducted comprehensive reviews to identify root causes and analyze risks. Created a near miss incident management KPI dashboard to identify opportunities for improvement and provided value-added recommendations.
- Managed all aspects of the audit in accordance with compliance and regulatory requirements. Resolved the audit findings with relevant stakeholders in a timely manner.
Confidential, Piscataway, NJ
Machine Learning / Data Scientist
Responsibilities:
- Applied lean manufacturing theory to the workshop and warehouse. Recreated the production plan, material requirements plan, capacity requirements plan, and research and development documentation for the packaging process, and scheduled the bottlenecks.
- Involved in defining the source-to-target data mappings, business rules, and data definitions.
- Involved in defining the business/transformation rules applied for sales and service data.
- Worked with project team representatives to ensure that logical and physical ER/Studio data models were developed in line with corporate standards and guidelines.
- Defined the list codes and code conversions between the source systems and the data mart.
- Responsible for defining the key identifiers for each mapping/interface.
- Used Python, R, SQL to create statistical algorithms involving Multivariate Regression, Linear Regression, Logistic Regression, PCA, Random Forest models, Decision trees, Support Vector Machine for estimating the risks of welfare dependency.
- Implemented machine learning projects using deep learning, predictive analytics, and image recognition.
- Worked on a project using deep learning for object and facial recognition using YOLO and FaceNet.
- Used Python development skills to assist in building front end AI applications as well as for running machine learning models.
- Tested the open-source Gatling framework for a client. Performed load and performance testing.
- Used Swagger to manage REST services for a client application.
- Gained some experience with Apache Maven for deploying Java applications.
- Used PostgreSQL to store data.
- Implemented a Metadata Repository and maintained Data Quality, Data Cleanup procedures, Transformations, Data Standards, the Data Governance program, Scripts, Stored Procedures, and triggers, and executed test plans.
- Remained knowledgeable in all areas of business operations to identify system needs and requirements.
- Documented the complete process flow to describe program development, logic, testing, implementation, application integration, and coding.
- Targeted potential customers with the sales team, forecasted future demand, controlled inventory, designed the logistics distribution, and improved the yield rate by 7.8%.
Confidential, Detroit, MI
Machine Learning / Data Scientist
Responsibilities:
- Pre-processed the 1.52G parallel time-series categorical sequence data, comprehensively evaluated existing classification techniques, and demonstrated the need for a matrix representation of categorical sequences to take advantage of image processing techniques. This novel idea improved accuracy from 45% to 95%.
- Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python, and built models using SAP Predictive Analytics.
- Used R, Python, PySpark, HQL, and AWS Redshift for exploratory data analysis, A/B testing, ANOVA tests, and hypothesis tests to compare and identify the effectiveness of creative campaigns.
- Gathered data in different formats, such as XML and SQL, from different platforms.
- Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python 3.x / R.
- Replaced missing data and performed EDA with univariate and bivariate analysis to understand individual and combined effects.
- Used Python 3.x / R for different data transformation and validation techniques, such as dimensionality reduction using PCA and Factor Analysis.
- Encoded text documents as feature vectors and used algorithms to classify the text by polarity; also verified incremental learning for training and used topic modelling to classify documents into different categories.
- Used Python 3.x / R to develop many other machine learning algorithms, such as decision trees, linear/logistic regression, multivariate regression, natural language processing, naive Bayes, random forests, gradient boosting, XGBoost, K-means, and KNN, as supervised and unsupervised models that help in decision making, using Keras, TensorFlow, and Scikit-learn.
- Performed model validation using test and validation sets via K-fold cross validation and statistical significance testing.
- Performed metric evaluation for regression (RMSE, R2, MSE, etc.) and classification (accuracy, precision, recall, concordance, discordance, etc.), and threshold calculations using ROC plots.
- Used predictive analytics and machine learning algorithms to forecast key metrics in the form of designed dashboards on Tableau.
- Provided data and analytical support for the company’s highest-priority initiatives.
- Created impact documents specifying changes introduced as part of the program and led the business process team.
- Worked with big data consultants to analyze, extract, normalize, and label relevant data using statistical modeling techniques such as support vector machines and neural networks.
Confidential
Data Analyst
Responsibilities:
- Performed ETL and business intelligence techniques to transform company practices into fresh, cost-effective solutions leading to more efficient operations.
- Led cross-functional teams in the development, documentation, and delivery of process innovations driving the attainment of business goals.
- Worked with UX/UI designers to change the layout of the landing pages to increase conversions by 34%.
- Redefined procurement strategies by deconstructing supply chain complexity.
- Conducted operational research techniques and dynamic programming strategies saving purchasing cost by 24.6%.
- In O2O mode: Compiled marketing content through media platforms and built H5 pages. Planned activities and cooperated with NGOs such as Greenpeace.
- Defined performance metrics for online channels by the marketing conversion funnel: number of visitors, time on page, inquiries, and sales. Analyzed all on-site activities (click-through rates, conversion rates, cost-per-click, and cost per acquisition).
- In B2B mode: Planned activities and coordinated with different companies.