Data Analyst Resume
Bradenton, FL
SUMMARY
- Skillful Data Scientist with over 10 years of experience in Information Technology, Python, R, Looker, and Tableau, including 4.5 years dedicated to data science, machine learning, natural language processing (NLP), and predictive modeling
- Proficient with structured and unstructured data, data modeling, data mining, and data profiling
- Successfully completed 5 data science projects
- Expert in data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python (a minimal sketch follows this list)
- Extensive experience developing statistical, machine learning, and data mining solutions to various business problems, and generating data visualizations in R and Python
- Proficient in machine learning algorithms and statistical modeling, including decision trees, text analytics, natural language processing (NLP), supervised/unsupervised learning, regression models, social network analysis, neural networks, deep learning, SVM, and clustering, using the scikit-learn package in Python and R
- Well-versed in Predictive Modeling with SAS, R, and Python
- Expertise in MLlib, Spark's machine learning library, for building and evaluating different models
- Automated data import scripts using shell scripting, PHP, MySQL, and regular expressions
- Skilled in natural language processing (NLP) for speech recognition; used word2vec to understand word associations
- Worked in Jupyter notebooks to visualize stores using maps
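For illustration, a minimal sketch of the pandas/NumPy cleaning, feature-engineering, and feature-scaling workflow referenced above; the column names and values are hypothetical placeholders, not data from any actual engagement.

```python
import numpy as np
import pandas as pd

# Illustrative raw data; columns are hypothetical
df = pd.DataFrame({
    "age": [34, 41, np.nan, 29],
    "income": [52000, 61000, 48000, np.nan],
    "signup_date": pd.to_datetime(["2019-01-05", "2019-03-17", "2019-02-01", "2019-04-22"]),
})

# Data cleaning: fill missing numeric values with the column median
for col in ["age", "income"]:
    df[col] = df[col].fillna(df[col].median())

# Feature engineering: derive tenure (days since signup) from the date column
df["tenure_days"] = (pd.Timestamp("2019-06-01") - df["signup_date"]).dt.days

# Feature scaling: standardize numeric features (z-score) with pandas/NumPy
numeric = ["age", "income", "tenure_days"]
df[numeric] = (df[numeric] - df[numeric].mean()) / df[numeric].std()

print(df)
```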
TECHNICAL SKILLS
- Data Science
- Python
- R
- MySQL
- Web analytics
- Clustering & Segmentation
- SQL
- RStudio
- PostgreSQL
- Databricks
- PySpark
- GIS
- PostGIS
- Flask
- SQL Server Integration Services (SSIS)
- Oracle 9i OLAP
- MS Office Web Components (OWC11)
- JDBC
- HTML5
- DHTML
- XML
- CSS3
- Web Services
- WSDL
- Erwin 9.6/9.5/9.1/8.x
- Rational Rose
- MS Visio
- Spark
- Hive
- HDFS
- MapReduce
- Pig
- Kafka
- Impala
- Spark SQL
- SQL Server
- MS Access
- HBase
- Teradata
- Netezza
- MongoDB
- Cassandra
- SAP HANA
- MS Office (Word/Excel/PowerPoint/Visio)
- Tableau
- R Markdown
- Business Intelligence
- SSRS
- SVM
- GitHub
- Azure Data Warehouse
- Windows
- Linux
- Unix
PROFESSIONAL EXPERIENCE
Confidential
Data Analyst
Responsibilities:
- Analyzed data to tell a story (what happened, why it happened, and what drove the numbers) and recommended actions for leadership to improve results
- Performed data analysis to support and implement business processes and decisions
- Performed analytical deep dives into problems and opportunities, formed hypotheses, designed and executed experiments, and provided recommendations on key performance metrics and goals
- Designed, created, and maintained dashboards for KPI monitoring and identifying trends & discrepancies
- Developed a deep understanding of customer journey phases, key business metrics, and how and why customers engage with the product
- Enhanced UX by using the Segment debugger and analyzing API calls
- Utilized Redshift, SQL, Looker, and Tableau daily to provide analytics support across the business, including sales, product, marketing, and executive teams (a minimal query sketch follows this list)
- Created efficient PDTs (persistent derived tables) and improved the run time of Looker dashboards
- Evaluated success of product initiatives by setting success criteria and analyzing results
- Provided analytics perspective to aid in product planning
- Developed a good understanding of the data architecture process
- Troubleshot backend developer tools to resolve BI issues
- Defined milestones, deliverables and communicated project scope, requirements
- Acted as liaison between clients and technical teams
- Developed communication plan and improved team collaboration between onsite and offshore teams
- Worked on ad hoc requests from internal clients.
- Documented functional and business requirements in collaboration with the product development team
- Worked with project or functional leads in developing use cases/ scenarios, flow diagrams and performing workflow analysis
- Managed cross-functional projects by providing project leadership and daily management throughout the project from inception to delivery
- Managed the feature development and defect backlog on a daily basis to ensure that priorities adhere to the strategic direction outlined by the Product Manager
- Worked closely with the Product Management group to provide data solutions for issues impacting customers
- Assisted with building training material content for new functionality developed for Collections Management
- Translated complex concepts into implications for the business via excellent communication skills, both verbal and written
- Extracted meaningful insights by analyzing large, complex, multi-dimensional customer behavior datasets
- Identified key trends and built executive-facing dashboards to track the progress of acquisition, monetization, and engagement trends
- Informed future experiment design and roadmaps by performing exploratory analysis to understand user engagement behavior and derive insights
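A minimal sketch of the kind of Redshift/SQL pull and KPI trend check that fed the dashboards described above; the connection details, table, and column names are hypothetical examples rather than production SQL.

```python
import pandas as pd
import psycopg2  # Redshift speaks the PostgreSQL wire protocol

# Hypothetical connection details and table/column names
conn = psycopg2.connect(
    host="example-cluster.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="analyst", password="..."
)

# KPI pull: weekly signups and revenue by marketing channel
query = """
    SELECT date_trunc('week', created_at) AS week,
           channel,
           COUNT(DISTINCT user_id)        AS signups,
           SUM(revenue)                   AS revenue
    FROM   events
    GROUP  BY 1, 2
    ORDER  BY 1, 2
"""
kpis = pd.read_sql(query, conn)

# Flag week-over-week discrepancies (>20% drop in signups) per channel
kpis["wow_change"] = kpis.groupby("channel")["signups"].pct_change()
flagged = kpis[kpis["wow_change"] < -0.20]
print(flagged)
```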
Confidential, Bradenton, FL
Data Scientist
Responsibilities:
- Performed data cleaning and feature selection using MLlib package in PySpark on Databricks
- Performed data cleaning, feature scaling, and feature engineering using the pandas and NumPy packages in Python
- Used NLP for sentiment analysis on insurance and medical records
- Worked in Jupyter notebooks to visualize stores using maps
- Built Factor Analysis and Cluster Analysis models using Python SciPy to classify customers into different target groups
- Used clustering techniques such as DBSCAN, K-means, and hierarchical clustering in R and Python for customer profiling, to design discount plans matched to behavior patterns (see the clustering sketch after this list)
- Performed spatial analysis in QGIS, manipulating geometry types to visualize data from shapefiles
- Used GIS (geographic information system) framework for gathering, managing, and analyzing data
- Generated visual maps using map projections in QGIS and used Confidential analytics
- Performed data preparation, loading, and management in a QGIS/PostgreSQL server environment
- Developed good understanding of SRID, CRS, SRS, Vector Geometry, and Geodata (GeoJSON / WKB / WKT formats)
- Automated manual MS Excel tasks using macros which helped boost productivity
- Used VLOOKUP and Solver tools on raw Excel files for better efficiency
- Created and designed reports that use gathered metrics to infer and draw logical conclusions about past and future behavior
- Used containers like Docker for version control and seamless accessibility for clients
- Built Docker image to preprocess data and deploy model as an API
- Used R packages such as bsts (Bayesian structural time series), Boom, and BoomSpikeSlab in RStudio to infer causal impact and compare control vs. test groups in the post-intervention period
- Generated graphs and reports for analytical models using the ggplot2 package in RStudio
- Used R Markdown for reporting and storytelling
- Applied various machine learning algorithms and statistical models, including decision trees, text analytics, natural language processing (NLP), supervised/unsupervised learning, regression models, social network analysis, neural networks, deep learning, SVM, and clustering, to identify volume using the scikit-learn package in Python and R
- Used Principal Component Analysis in feature engineering to analyze high dimensional data
- Used the NMF algorithm to produce latent features from an ACS dataset and plotted them on a map to understand similarities between states for use in optimized targeting
- Used Agile methodology and the Scrum process for project development
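A minimal sketch of the customer-profiling clustering step described above, using scikit-learn's KMeans and DBSCAN on scaled behavior features; the feature names and values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN

# Hypothetical customer-behavior features
customers = pd.DataFrame({
    "monthly_spend":    [120, 80, 430, 60, 510, 95],
    "visits_per_month": [4, 2, 12, 1, 15, 3],
    "avg_basket_size":  [30, 40, 36, 60, 34, 32],
})

# Scale features so no single attribute dominates the distance metric
X = StandardScaler().fit_transform(customers)

# K-means profiling into behavior segments
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
customers["segment"] = kmeans.fit_predict(X)

# DBSCAN as a density-based alternative that also flags outliers (label -1)
customers["dbscan_label"] = DBSCAN(eps=1.0, min_samples=2).fit_predict(X)

print(customers)
```

In practice the number of K-means segments would be chosen with an elbow or silhouette analysis rather than fixed in advance.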
Confidential
Data Scientist - I
Responsibilities:
- Worked in a team of Data Scientists/ML Engineers to build and deliver multiple machine learning applications and data products, with most prototypes done in Python
- Developed multiple recommender systems that help customers build customized vehicles and recommend warranty packages, parts, and services, using clickstream data analysis
- Used natural language processing (NLP) for speech recognition and used word2vec to understand word associations
- Used Jupyter notebook for Python coding
- Defined the end-user experience and benefits of each individual recommender system
- Evangelized the needs and benefits of recommender systems to executive management
- Presented business insights about the user behavior and product behavior
- Used macros on raw Excel data files to make them usable for analytics
- Worked with the team to develop a state-of-the-art recommender system using customer and product history, and used Kubernetes for scaling and development
- Evaluated multiple models based on affinity analysis, Wide & Deep neural networks, ALS matrix factorization, and hybrid collaborative filtering with user-based and item-based models (an ALS sketch follows this list)
- Formulated A/B testing metric and designed automated scheduled workflow to perform continuous A/B testing
- Deployed recommender models through web APIs to allow downstream applications for each new version of recommender system to easily consume recommendations
- Reduced custom built configuration time by 30% using the model
- Increased product (parts, warranty package) sales by 12% using the model
- Developed & deployed AI model that performed NLU (natural language understanding) on customer complaint and used vehicle configuration to predict repair hours, failed component/part, and rank procedures
- Performed data manipulation, data visualization, and feature engineering
- Performed feature engineering on text using the word2vec model and handled the imbalanced class problem
- Built multiple ML models like Fully Connected Multi-Layer Neural Networks, CNN, ANN, RNN-GRU, Memory Network, Conv1D, and GLM
- Used libraries such as TensorFlow and PyTorch (rTorch with R)
- Evaluated models on multiple evaluation metrics such as Top-K Mean Average Precision, kappa score, precision, and recall
- Predicted from more than 4.8K unique vehicle components/parts using the model and achieved 82% accuracy on the test set
- Deployed the model successfully in multiple downstream applications used by customers, dealerships, technicians, and fleet managers
- Reduced average vehicle issue diagnosis time by 40% using the model
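A minimal sketch of an ALS matrix-factorization recommender of the kind evaluated above, using Spark's DataFrame-based ALS API; the interaction data and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

spark = SparkSession.builder.appName("recommender-sketch").getOrCreate()

# Hypothetical interaction data: (customer, part, rating)
ratings = spark.createDataFrame(
    [(0, 10, 4.0), (0, 11, 1.0), (1, 10, 5.0),
     (1, 12, 3.0), (2, 11, 2.0), (2, 12, 4.0)],
    ["user_id", "item_id", "rating"],
)
train, test = ratings.randomSplit([0.8, 0.2], seed=42)

# ALS matrix factorization; coldStartStrategy="drop" avoids NaN predictions in evaluation
als = ALS(rank=10, maxIter=10, regParam=0.1,
          userCol="user_id", itemCol="item_id", ratingCol="rating",
          coldStartStrategy="drop")
model = als.fit(train)

# Evaluate held-out predictions with RMSE, then produce top-5 recommendations per user
rmse = RegressionEvaluator(metricName="rmse", labelCol="rating",
                           predictionCol="prediction").evaluate(model.transform(test))
print("RMSE:", rmse)
model.recommendForAllUsers(5).show(truncate=False)
```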
Confidential
Junior Data Scientist
Responsibilities:
- Built dynamic forecasting models across multiple dimensions, served via real-time and batch processes, e.g., weekly vehicle retail forecasts at the model, category, region, and dealer levels, and monthly/weekly warranty and parts forecasts
- Identified the use case, laid out quantifiable benefits, and clearly defined risks
- Developed and implemented an R Shiny application that showcases machine learning for business forecasting
- Performed validation on machine learning output from R
- Coordinated with Senior Director and Executives to present analytics workflow
- Co-led the team to identify machine learning approaches that could deliver quick results
- Evaluated merits of individual ML approaches and established evaluation metrics
- Built forecasting models such as forecastHybrid, STL ARIMAX, RNN-LSTM, RNN-GRU, hierarchical and grouped time series, dynamic regression, and vector autoregression (see the forecasting sketch after this list)
- Formulated end-to-end solution and monitored team’s progress
- Delivered a successful forecasting model with 85% accuracy over 12-month forecast horizons
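The R-based forecasting stack above has a close Python analogue; as an illustration, a minimal seasonal-ARIMA sketch with a 12-month holdout and a MAPE-based accuracy figure, run on synthetic data (all values are hypothetical).

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical monthly retail series with trend plus yearly seasonality
idx = pd.date_range("2015-01-01", periods=60, freq="MS")
rng = np.random.default_rng(0)
y = pd.Series(100 + np.arange(60)
              + 10 * np.sin(2 * np.pi * np.arange(60) / 12)
              + rng.normal(0, 2, 60), index=idx)

# Hold out the last 12 months as the forecast horizon
train, test = y[:-12], y[-12:]

# Seasonal ARIMA as a stand-in for the STL/ARIMAX-style models described above
results = SARIMAX(train, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12)).fit(disp=False)
forecast = results.forecast(steps=12)

# Report accuracy as 100% minus MAPE over the 12-month horizon
mape = np.mean(np.abs((test.values - forecast.values) / test.values)) * 100
print(f"Forecast accuracy: {100 - mape:.1f}%")
```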
Confidential
Ad-Ops Engineer
Responsibilities:
- Performed trafficking and delivery of all advertising related projects
- Managed digital advertising campaigns, including ad trafficking, pacing, optimization, reporting, and monitoring campaign progress
- Handled various tracking problems and ad tag implementations effectively
- Developed solutions / documented internal procedures, policies, and tutorials regarding ad tracking & targeting for DoubleClick, AdMob, AdSense / AdWords
- Worked with Product Management, Marketing, Engineering, and other teams to inform, test, and implement product and operational improvements
- Optimized campaign performance by recommending adjustments to inventory and creative components of the campaign
Confidential
Data Analyst
Responsibilities:
- Worked with data governance, data quality, data lineage, and data architects to design various models
- Delivered MS Excel tasks using macros
- Analyzed large and complex datasets to investigate and identify fraud scheme trends and improve the efficiency of the fraud detection process
- Classified fraud accounts to deconstruct emerging patterns (see the classification sketch after this list)
- Built and managed reports and dashboards to monitor KPIs.
- Prepared technical design documents and test cases
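A minimal sketch of an imbalanced-class fraud classifier of the kind described above, using scikit-learn on synthetic data; the features and fraud rate are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced account dataset (~3% fraud)
X, y = make_classification(n_samples=5000, n_features=12,
                           weights=[0.97, 0.03], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# class_weight="balanced" compensates for the rarity of fraud cases
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=42)
clf.fit(X_train, y_train)

# Precision and recall on the fraud class matter more than raw accuracy here
print(classification_report(y_test, clf.predict(X_test),
                            target_names=["legit", "fraud"]))
```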