Data Scientist Resume
Plano, TX
PROFESSIONAL SUMMARY:
- Data Scientist with 5+ years of experience across the banking, technology, and marketing industries.
- Highly skilled in machine learning, data analysis, data visualization and data science methodology.
- Experience using technology to work efficiently with datasets, including scripting, data cleaning tools, and statistical software packages.
- Extensive working experience in data science projects, machine learning algorithms and tools like R, Python, SQL, Hadoop, Tableau and Power BI.
- Experienced with deep learning architectures such as ANN, CNN, RNN, and LSTM, using Keras and TensorFlow.
- Experienced with NLP and machine learning algorithms such as logistic regression, random forest, XGBoost, KNN, SVM, neural networks, linear regression, and k-means.
- Proficiency in developing story boards and advanced visualizations using Tableau, Python and Power BI.
- Practical knowledge of the data analysis process in Python, including importing datasets, data wrangling, exploratory data analysis, model development, and model evaluation.
- Hands-on experience with Python and libraries such as NumPy, Pandas, Matplotlib, Seaborn, NLTK, scikit-learn, and SciPy.
- Experience with agile methodology and the ability to manage all phases of the SDLC, from requirement analysis, design, development, and testing to deployment of a data science project.
- Achievements include creating regression models to forecast company stock prices with 30% more accuracy than the historical average and achieving a 25% improvement.
- Analyzed a pre-existing predictive model, developed by the Advanced Analytics team, for predicting the customer conversion rate, and rebuilt it using machine learning algorithms that considered factors with a stronger influence on the conversion rate. An increased conversion rate benefits both customers and the company.
- Skilled in implementing natural language processing and neural networks through libraries like PyTorch, Keras.
- Skilled in Neural Networks and Deep Learning Models like ANN, CNN, RNN, LSTM etc.
- Improved the accuracy of the predictive models from 65% to 86% using the support vector machine algorithm.
- Highly qualified in using Microsoft Azure ML Studio and ServiceNow tools for feature engineering, creation, and deployment of models across cloud architecture.
- Experience with and knowledge of the TensorFlow package for machine learning and deep learning in Python.
- Experience in Descriptive Analysis Problems like Frequent Pattern Mining, Clustering, Outlier Detection.
TECHNICAL SKILLS:
Programming Languages: Python, SQL, R
RDBMS: Teradata, Oracle 11g, SQL*Plus, MS Access, SQL developer
Frameworks: Hadoop Ecosystem
Machine Learning algorithms: NLP, Linear Regression, Logistic Regression, Decision Trees, SVM, Random Forest, K Means Clustering
Tools: RStudio, Jupyter notebooks, Informatica PowerCenter, Teradata, SQL, Toad for Oracle, Power BI, Tableau, UNIX, Shell scripting, SAS e-miner, Microsoft Azure ML, MicroStrategy, HP Quality Center, MS Visio, Erwin Data Modeler, ServiceNow
WORKING EXPERIENCE:
Confidential, Plano TX
Data Scientist
Responsibilities:
- Analyzed the pre-existing predictive model developed by the advanced analytics team and the factors considered during model development.
- Participated in all phases of data mining: data collection, data cleaning, model development, validation, and visualization.
- Performed sentiment analysis on multiple products as requested by Line of Business using NLP/NLTK library.
- Analyzed metadata and processed data to gain better insight into the data.
- Created initial data visualizations in Tableau to provide basic insights into the data to the project stakeholders.
- Applied various machine learning algorithms and statistical modeling techniques such as decision trees, regression models, clustering, and SVM to identify Volume using the scikit-learn package in Python.
- Communicated regularly with leaders of other teams to gain a better understanding of the data at a deeper level.
- Performed extensive exploratory data analysis using Teradata to improve the quality of the dataset, developed machine learning algorithms using Python for predicting the model quality, and created data visualizations using Tableau.
- Performed parameter tuning procedures to achieve optimal performance of the model.
- Worked on machine learning algorithms such as logistic regression, decision trees, support vector machine, and random forest to achieve the best accuracy for the propensity model.
- Strong practical experience with Python libraries such as Pandas and NumPy, including one-dimensional and two-dimensional NumPy arrays.
- Developed data visualizations in Tableau to display the day-to-day accuracy of the model with newly incoming data.
- Hold a point of view on the strengths and limitations of statistical models and analyses in various business contexts, and can evaluate and effectively communicate the uncertainty in the results.
- Used the TensorFlow/Keras library to build and train deep learning models, with good results.
- Published Power BI reports in the required organizations and made Power BI dashboards available in web clients.
- Developed a propensity model that delivered a greater ROI than other models, achieving 0.95 million dollars ROI per cycle, with a cycle duration of one quarter.
- Implemented a complete data science project involving data acquisition, data wrangling, exploratory data analysis, model development, and model evaluation.
Environment: TensorFlow, Advanced SQL, RStudio (ggplot2, choroplethr, dplyr, caret), Python (Pandas, NumPy), Machine Learning (Logistic Regression, Decision trees, SVM, Random forest), PyTorch, Keras, Tableau, ServiceNow, Excel
Confidential, Plano TX
Data Scientist
Responsibilities:
- Collaborated with business leaders on data initiatives, focusing on the use of data to optimize business KPIs such as revenue and circulation, alongside a team of data professionals with specific focus on Analytics & Insight, Data Engineering, and Data Science.
- Designed and developed Tableau graphical and visualization solutions with business requirement documents and plans for creating interactive dashboards.
- Used regularization techniques to address overfitting, either by adding a penalty term (LASSO or Ridge) to the loss function or by performing cross-validation.
- Used advanced analytical tools and programming languages such as Python (NumPy, pandas, SciPy) for data analysis
- Extensively used PyTorch and Keras to build and train deep learning models.
- Worked with the data science team to build and deploy machine learning based models to predict customer churn and optimize customer acquisition using Teradata, Oracle, and SQL.
- Created story boards in Tableau and Power BI for each application usage report, categorized by country, region, and state.
- Created quick start guides and designed product pages for Microsoft applications in the company portal.
- Built and evaluated machine learning models on various types of datasets using algorithms and statistical modeling techniques such as clustering, classification, regression, decision trees, support vector machines, anomaly detection, sequential pattern discovery, and text mining, with Python libraries (scikit-learn).
- Visualized graphs and reports using the matplotlib, seaborn, and pandas packages in Python on datasets for analytical models to identify missing values, outliers, and correlation between the features.
Environment: SQL*Plus, Hadoop framework, RStudio (dplyr, caret packages), Python (NumPy, Pandas), Machine learning algorithms (Logistic Regression, Decision trees, SVM), SharePoint Online, PyTorch, Keras, Tableau, Excel.
Confidential
Data Analyst
Responsibilities:
- Performed sentiment analysis on survey data to understand the satisfaction levels of customers who were provided services. Built a Tableau dashboard to monitor sentiment in real time.
- Designed and automated web APIs to pull data from ServiceNow tables and update cases using Python scripts.
- Deployed an ML model using Python and a Flask API to predict the resolution time and find similar ServiceNow cases based on the description of a case.
- Used various Python libraries such as seaborn, scikit-learn, and SciPy to visualize and analyze the data for machine learning.
- Applied deep learning models using Python to correctly classify the category of a new case in ServiceNow based on its short description.
- Designed a business solution to automate the image scraping process using Selenium and Python, which downloads the images of requested invoices from Lexmark Perceptive Content Explorer and converts them into PDF.
- Optimized the code to improve performance, reducing the estimated total run time from 18 days to 10 hours.
- Developed and automated Python scripts to fetch geocode-formatted employee addresses and the respective congressional representative details, such as photo.
Environment: Spark, Hadoop, AWS, SAS Enterprise Guide, SAS/STAT, SAS/SQL, Oracle, MS Office, Python (scikit-learn, pandas, NumPy), Machine Learning (logistic regression, XGBoost).
