- Domain knowledge and experience in Aerospace, Supply chain, Retail, Telecom, CPG, Pharmaceuticals, Healthcare, and Sports Analytics industries.
- Proficient in managing the entire data science project life cycle, with active involvement in all phases including data acquisition, data cleaning, data engineering, feature scaling, feature engineering, statistical modeling, testing and validation, and data visualization.
- Experience using various Python and R packages such as ggplot2, twitteR, NLP, pandas, numpy, seaborn, scipy, matplotlib, scikit-learn, Beautiful Soup, NLTK, and Gensim
- Knowledge of Natural Language Processing (NLP) algorithms and text mining.
- Experienced in using Python (Jupyter/IPython Notebook) and R to perform statistical analysis and to implement machine learning algorithms.
- Adept in exploratory data analysis (EDA) and predictive modeling.
- Highly skilled in using visualization tools like Tableau, ggplot2, matplotlib, and Power BI for creating dashboards.
- Excellent understanding of Agile and Scrum development methodologies
- Managed teams and led regular discussions on landscape optimization using quantitative and qualitative analysis
- Effective team player with strong communication and interpersonal skills and a strong ability to adapt and to learn new technologies and business lines rapidly
- Competent in performing Root Cause Analysis (RCA) to minimize risks associated with processes in a manufacturing environment
- Demonstrated ability to use business (non-technical) language to deliver subject knowledge, explain technical concepts and requirements to various project teams, and influence others
Machine Learning: Classification, Regression, Random Forest, Clustering (k-means, Hierarchical), Deep Learning, NLP, NLTK, Sentiment Analysis, Decision Trees, Neural Networks
Programming Languages: Python, R, SQL, Java
Data Visualization and ML libraries: Tableau, Matplotlib, Power BI, seaborn, ggplot2, SciPy, NumPy, Pandas, Scikit-Learn, BeautifulSoup, twitter, scikit-surprise
Certifications: SAP SuccessFactors ONB, RCM, LMS, Workforce Analytics Q4 2018, Enterprise Design Thinking, Google Analytics
Tools Used: Jupyter Notebook, RStudio, SQLite 3, Microsoft SSMS, JIRA, ServiceNow, HP ALM, Remedy, Asana, Citrix, Certido
Familiarity: Apache Hadoop, Hive, M, DAX, AWS S3, AWS Lambda, Azure Cloud, Firebase
OS/ Environment/ IDE: Windows, Mac, Linux, Anaconda, Oracle VirtualBox, IntelliJ IDEA
Mobile: Android Studio, Xcode
Methodology: Agile, Scrum, Enterprise Design Thinking, Design Thinking Essentials for AI
Confidential, Atlanta GA
Data Scientist/ Analytics Consultant
- Oversaw all data needs of the scoutSMART app at a sports analytics company, helping college football coaches find their next recruits through data analytics
- Created a recommendation system in Python with MS SQL, hosted on Azure Cloud, that used cosine similarity on coaches' preferences and reduced churn rate by over 10%
- Segmented players into groups using K-means clustering
- Built classification models, including Logistic Regression, SVM, Decision Tree, and Random Forest, to predict customer churn.
- Owned and optimized the database using exploratory data analysis with R, Python, and SQL, improving data accuracy by 21%
- Created a Power BI dashboard that delivered business-ready insights, increasing consumer retention by 15%
- Re-engineered the analytical scoring algorithm for college athletes using regression analysis for over 5,000 junior college players
- Performed demand forecasting on coach search data to support marketing and targeted mailing, increasing coach interaction by 12%
- Automated the upload of over 118,000 player stats, saving over $55,000
- Scraped over 100,000 player statistics and over 10,000 items of Twitter data using the twitteR library and Beautiful Soup
- Served as liaison among business stakeholders, providing regular progress updates and milestones
- Led weekly brainstorming sessions for product development and business strategy in tandem with industry trends
Environment: Python 3.x (Scikit-Learn/SciPy/NumPy/Pandas/NLTK/Matplotlib/Seaborn), Tableau, Azure Cloud, SQL, T-SQL, SSMS 18, Machine Learning (Regressions, KNN, SVM, Decision Tree, Random Forest, Collaborative filtering, Ensemble), NLP, Recommendation systems, Agile/Scrum, R (twitteR, ggplot2)
Graduate Teaching Assistant / Graduate Research Assistant
- Achieved prediction accuracy of over 67% for multivariate time series analysis of weather data using the PageRank algorithm
- Conducted a comparative study of clustering methods such as k-means, hierarchical clustering, and DDC clustering
- Showed experimentally that the DDC method performed best, with 71.5% accuracy
- Ran 10 iterations of k-means on each large time series dataset to smooth out outliers
- Conducted extensive experiments on large UCR time-series datasets
- Helped improve the grades of 30 undergraduate students in subjects such as PCS, Algorithms, Database, Basics of C, C++, and Python
Data Science Consultant
- Implemented word clouds and sentiment analysis to understand additional medicine side effects, improving customer satisfaction.
- Applied clustering algorithms (hierarchical and K-means) using scikit-learn and SciPy.
- Performed complex pattern recognition on automotive time series data and forecast demand using ARMA and ARIMA models and exponential smoothing for multivariate time series data.
- Delivered and communicated research results, recommendations, and opportunities to the managerial and executive teams, and implemented the techniques for priority projects.
- Designed, developed, and maintained daily and monthly summary, trending, and benchmark reports repository in Tableau Desktop.
- Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each supplementary tablet being used with medication
- Designed and implemented end-to-end systems for Data Analytics and Automation, integrating custom visualization tools using R, Tableau, and Power BI
- Provided actionable insights using Tableau for SAP Time Management and SAP ECM to achieve 11% incident volume reduction
- Recommended the design of an L1 support manual to reduce ticket volume and improve resource utilization, based on exploratory data analysis in Excel of over 125,000 historical incident tickets from the past 5 years
- Implemented over 46,000 custom configurations for over 65 countries, providing 24/5 support with zero major issues for 2 years