- Data scientist with experience in data collection, data manipulation, visualization, and building statistical and machine learning models.
- Participated in all project phases, including data collection, data mining, data cleaning, data visualization, developing models, improving models and presenting results.
- Over 7 years of experience in data science, data analytics, data visualization, business intelligence and data mining.
- Hands-on experience building time series models such as ARMA and ARIMA for predictive analytics and forecasting.
- Proficient with statistical techniques such as linear regression, multiple regression and ANOVA.
- Experience in developing machine learning models using decision trees, deep learning, support vector machines, clustering, Bayesian networks and reinforcement learning.
- Experienced in using SSIS for ETL processes, including implementation of event handlers and checkpoints.
- Experience working with RDBMSs such as Oracle and MySQL.
- Expertise in Python, R and their respective libraries.
- Experienced in natural language processing (NLP) and related concepts.
- Designed and developed dashboards in Tableau.
- Certified supply chain professional from IIT Delhi.
- Hands-on experience working as a DBA, including user administration and maintenance, using Oracle 10g and 11g.
- Analyzed slot machine data and user data in WEKA to provide insight into slot machine complexity vs. usability.
Primary Languages: Python, R
Database: MySQL, Microsoft SQL Server, Oracle DB
ETL Tool: SSIS
BI Tools: Tableau, Oracle BI
Data analysis tools: WEKA, MATLAB
Python/R libraries: NumPy, SciPy, Shiny, pandas, SQLAlchemy, matplotlib, scikit-learn, TensorFlow, Scrapy, Seaborn, Basemap
ML and Statistical Algorithms: Decision trees, ANN, Deep learning, Bayesian networks, reinforcement learning, support vector machines, logistic regression, data mining methods and cluster analysis.
Other: Natural language processing (NLP), Hidden Markov Models, MapReduce, data structures and algorithms, Git, Jupyter Notebook and MS Excel
Confidential, Omaha, NE
- Built time series predictive models for solar and load forecasting using geospatial and energy data.
- Applied Moran’s Index to measure spatial autocorrelation in multivariate, multi-dimensional GIS data.
- Applied Hidden Markov Models to reinforcement learning and spatial pattern recognition.
- Tested the data set for stationarity by visualizing rolling statistics and applying the Dickey-Fuller test using the Python statistical package statsmodels.
- Developed forecast models using statistical methods such as Autoregressive Integrated Moving Average (ARIMA) and the autocorrelation function (ACF).
- Evaluated time series forecast models using statistical evaluation metrics such as mean absolute percentage error (MAPE) and root mean squared error (RMSE).
- Applied stochastic processes to model observed time series data.
- Worked with statistical methods such as correlation (Pearson, Spearman), data distributions and descriptive statistics.
- Built linear regression models for prediction and used linear methods for statistical significance tests and correlations in R.
- Built a machine learning model for market segmentation using k-means clustering analysis in Python with 95.5% accuracy.
- Built a predictive model to improve the sales hit ratio using decision trees (ID3) in WEKA.
- Predicted the target variable on test data and created confusion matrices and AUROC curves.
- Worked on relational, graph and document stores.
- In-depth knowledge of SQL databases, tables, stored procedures, triggers, views, user-defined data types and functions.
- Experience querying relationships using the graph database Neo4j.
- Designed and visualized interactive results in Tableau to publish dashboards.
Confidential, Las Vegas, NV
- Define the problem, evaluate available data to develop and implement descriptive, predictive and prescriptive solutions using statistical and machine learning models.
- Build hypotheses and strategically participate in accurate data collection from various sources to support analytical needs.
- Summarize, aggregate and group data using Python Pandas to find the data structure and data distributions.
- Decide on sampling using NumPy and remove features with low variance using scikit-learn.
- Visualize data using matplotlib, seaborn, ggplot and geoplotlib to generate different graph types like attribute histograms, pairwise scatter plots, bubble charts and combination plots.
- Worked on text parsing, NLP and stemming using the Python package NLTK.
- Handle all preprocessing tasks such as data cleaning, integration, transformation, reduction and discretization using data mining tools such as RapidMiner, WEKA, R, NLTK, KNIME and pandas.
- Assisted ETL team with mapping, sessions and workflows using SSIS.
- Experience in developing time series, regression, decision trees, artificial neural networks, Bayesian networks, deep learning, support vector machines and rule-based machine learning algorithms using Python and R.
- Evaluate and test statistical and machine learning models using residual graphical analysis, a test harness and k-fold cross-validation.
- Design and visualize interactive results in Tableau to publish dashboards.
Confidential, Las Vegas, NV
Security Data Scientist
- Developed a security model based on cyber-security analytics.
- Collected and modeled data to detect malware, its root causes and security incidents.
- Proposed statistical analysis and visualizations using R.
- Visualized network behavior and patterns to understand the source of threats.
- Interpreted data, analyzed results using statistical techniques, and provided ongoing reports and feedback.
- Worked on an ETL package that included data conversions, dynamic variable expressions, sequence containers and conditional data flow using SSIS.
- Hands-on experience with PL/SQL, SSIS and business intelligence in a SQL Server environment.
- Designed and shared R and SQL code for extraction from relational databases, data cleaning, data analysis, predictive analytics and graphical visualization.
- Developed a Shiny app using information security data with in-depth analytics to visualize the intensity of a hack.
- Performed security data analytics in R to predict cyber-attacks by location, with visualization in Tableau.
- Used the data blending feature in Tableau and worked on Tableau Server installation and administration.
- Experience with preparing dashboards using calculations, parameters, calculated fields and sets in Tableau.
Senior Sourcing Engineer
- Analyzed historical cost, service, material and other data to provide key insights for reviewing financial goals, identifying cost variances and savings opportunities, and reporting on initiatives.
- Used Oracle tools and applications to develop and present metrics/reports to leaders and stakeholders of varying levels of technical and business knowledge.
- Implemented Six Sigma to optimize supply chain processes.
- Developed cross-functional expertise by working in the areas of purchase planning, contracts and procurement, sourcing and product life cycle management.
- Excellent working knowledge of Microsoft Excel, including functions such as VLOOKUP, CONCATENATE, LOWER, UPPER, PROPER and IF.
- Performed data cleaning in spreadsheets by removing duplicates and text columns and using mark check boxes.
- Performed data analysis by drawing inferences from data using pivot tables and charts in Microsoft Excel.
- In-depth knowledge of Oracle Database and Oracle E-Business Suite.