Manager - Data Science Resume
Seattle, WA
SUMMARY:
- Results-oriented, visionary Sr. Data Scientist with ~6 years of experience in machine learning and deep learning, holding a Master's in Business Analytics with a focus on statistics and machine learning.
- Implemented complex cloud-based AI/ML pipelines that are scalable, easy to maintain, and cost- and performance-effective.
- Led teams in implementing cutting-edge solutions, provided thought leadership, and prototyped enterprise data science solutions.
- Experience working in the marketing, financial, insurance, media, and healthcare industries.
- Led teams of 3-10 people (onsite and offshore), providing thought leadership, modeling, and mentorship across more than 20 projects.
- Experience architecting real-time Big Data systems using on-premises and cloud solutions such as S3, Hadoop, EMR, Spark, Lambda, QuickSight, Aurora, Glacier, MongoDB, Cassandra, PrestoDB, Hive, Kafka, Sqoop, and Elasticsearch.
- Evaluated 3rd party data vendors and acquired data to increase model accuracy.
- Worked closely with business, data governance, SMEs and vendors to define data requirements.
- Built models such as LSTMs (Keras on TensorFlow), min-max, HMMs, logistic regression, random forests, SVMs, k-NN, and time series models using packages such as ggplot2, dplyr, NumPy, scikit-learn, pandas, and matplotlib.
- Experience building NLP models using word embeddings, bag-of-n-grams, gensim, and word2vec.
- Automated workflows to extract data from various REST APIs and databases, process responses, and perform data transformations in Python and R.
- Established feedback loops, automated processes, platform integrations, and optimization models to improve user experience.
- Reported analytical findings to C-level executives using dashboards built in Tableau, QlikView, and R Shiny.
- Managed teams performing data analysis on classification and forecast models, statistical models, and risk analysis, solving data-driven problems using SPSS, SAS E-Miner, R, SAS, Python, EViews, Tableau, and Qlik.
- Published Tableau reports to clients on a weekly basis and presented monthly graphical summaries.
TECHNICAL SKILLS:
PROGRAMMING LANGUAGES: Python, R, SAS, C, MATLAB, Java, SQL, Hive, Linux, VBA macros, HTML, CSS, JavaScript, and Bootstrap.
TOOLS: RStudio, Python, Spark, AWS, SPSS, SAS, Hadoop, Hive, MongoDB, Cassandra, Zeppelin, S3, Aurora, Glacier, Elasticsearch, EC2, Lambda, QuickSight, Tableau, Qlik, Adobe Site Catalyst, Google Analytics, MS Visual Studio, Excel, MS PowerPoint.
EXPERIENCE:
Confidential, Seattle, WA
Manager - Data Science
Responsibilities:
- Manage ad-hoc analytical requests from flights, hotels, and cars stakeholders.
- Deliver ML model improvements and new features for Krazyglue (Confidential's LOB recommendation engine).
- Collaborate with UI and engineering teams on new ML feature roll-outs and orchestrate A/B testing to track and report performance.
- Answer key business questions by building complex queries in Teradata and Hive.
- Analyze website traffic data for Brand Confidential, Hotels, Hotwire, Travelocity, Orbitz, and trivago websites and generate dashboards in Adobe Analytics.
Confidential, Sunnyvale, CA
Senior Data Scientist
Responsibilities:
- Extracted inventory flow and stock level data across various nodes (hubs, stores, etc.) by joining tables from more than 10 databases.
- Assessed the scope and scale of the project based on current and projected requirements.
- Implemented a hybrid architecture using Hive, PrestoDB, MariaDB, AWS Aurora, S3, Glacier, Spark, and EMR.
- Forecasted average weekly demand using historical demand data and calculated safety stock, cycle stock, and max stock across all nodes (hubs, stores, etc.) based on the predictions.
- Built min-max, LSTM, ARIMA, and naive machine learning and deep learning models and wrote results back to local databases.
- Built Tableau dashboards for ad-hoc analysis and results comparison.
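The stock-level calculation described above can be sketched as follows; this is an illustrative min-max policy derived from weekly demand history, where the function name, lead time, and service level are assumptions for the sketch, not details from the project:

```python
from math import sqrt
from statistics import mean, stdev

def min_max_levels(weekly_demand, lead_time_weeks=2, service_z=1.65):
    """Illustrative min-max stock levels from a weekly demand history.

    service_z=1.65 corresponds to roughly a 95% service level
    (assumed parameter, as is the 2-week lead time).
    """
    d_bar = mean(weekly_demand)            # average weekly demand
    sigma = stdev(weekly_demand)           # weekly demand variability
    safety = service_z * sigma * sqrt(lead_time_weeks)
    cycle = d_bar * lead_time_weeks        # expected demand over the lead time
    min_stock = cycle + safety             # reorder point
    max_stock = min_stock + d_bar          # assumes one week's worth ordered on top
    return round(min_stock), round(max_stock)

levels = min_max_levels([100, 120, 80, 110, 90])  # → (237, 337)
```

The safety-stock term scales demand variability by the square root of the lead time, the standard assumption when weekly demands are independent.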
Tools: AWS Aurora, S3, Glacier, EMR, Python, PySpark, Rest API, Linux, Hive, PrestoDB, MariaDB, Tableau.
Confidential
Independent Architect
Responsibilities:
- Architected infrastructure after gathering the scope and scale of the project.
- Generated a faulty-device definition based on analytical and business discussions (no standard definition existed).
- Extracted data to identify device flows across various nodes and defined flow direction by creating business rules; computed parameters based on customer flows for each device.
- Built machine learning models using SVM, logistic regression, and random forest algorithms to predict devices likely to be Confidential.
- Established a modeling strategy to validate our definition of a "lemon" device.
Tools: Python, PySpark, MariaDB, MongoDB, SQL Server, Hive, Tableau
Confidential, San Jose, CA
Senior Data Scientist
Responsibilities:
- Data Pipeline - Established a data pipeline by merging data from REST APIs, Oracle, Hadoop, and MongoDB using PySpark.
- Dashboard & Reporting - Built a central Tableau dashboard to visualize analytical findings and a model monitoring system for developers and business users.
- Deep Learning tool - Built an artificial neural network tool to predict server hardware failures using the Keras API on TensorFlow, plus a supporting tool using clustering and HMMs.
- Proposed Architecture - Established infrastructure for faster computation through parallel and distributed computing using Hue, Zeppelin, Hadoop, Python, pandas, and Linux.
Tools: Python, MongoDB, Spark, Tableau, Linux, Hive, REST API, Hue, Hadoop, Elasticsearch.
Confidential, Madison, WI
Lead Data Scientist
Responsibilities:
- Modeling - Applied machine learning to advisor performance data and formulated business rules for high- and medium-performing advisors.
- Model interpretations - Deduced treatment plans for advisors, drawing on the interactions of high-performing advisors, and provided best-practice recommendations to the business.
- Implementation plan - Prioritized leads and created nurturing journeys based on learnings from previous modeling practices.
- Automated the data enrichment process by building web crawlers to extract data from advisors' websites.
- Reporting - Coordinated with the visualization team to create a dashboard to compare and monitor model performance.
- Vendor partnerships - Identified data vendors to optimize the performance of existing models and pave the way for new models.
- Project Management - Successfully built and refined models at scale using an Agile framework (Scrum).
Tools: R, Python, SAS, SQL, Google Analytics, Tableau, Hadoop, Hive.
Confidential
Big Data Scientist/Analyst
Responsibilities:
- Launched predictive leads for sales and established a predictive feedback loop to increase the efficiency of predictive models across the US, Asia-Pacific, and Greater China regions.
- Built a central Tableau dashboard to compare the performance of predictive models across all regions.
- Worked with campaign managers to understand their campaign targets, determine the data volume necessary to meet those targets, and provide analysis on various conversion metrics.
- Automated the digital marketing campaign creation process, reducing manual hours by 80%.
- Processed close to 15 million rows of data, discovered trends, and analyzed ways to utilize the data to solve complex data-driven problems across various use cases by extracting data from SQL.
- Tracked the customer journey from AQL (Auto Qualified Lead) to SQL (Sales Qualified Lead); analyzed reasons for accepting or rejecting leads as MQLs and strategized ways to utilize rejected data.
Confidential
Data Scientist Intern
Responsibilities:
- Built an automated process to extract large corpora of text from blogs, news feeds, tweets, and other data sources.
- Used the ggplot2, dplyr, lm, e1071, rpart, randomForest, nnet, and tree packages in R to build predictive models for insurance and healthcare clients and successfully incorporated the models into workflows.
- Built predictive models for customer churn and "next product to buy" using logistic regression and neural nets, respectively, in R, Python, SPSS, and SAS.
- Managed many analytical projects in parallel, such as building predictive models, optimization models, unstructured data analysis, and data graphs.
- Extracted social media data and built word clouds, data graphs, and storyboards using SAS E-Miner; provided in-depth story analysis and recommendations.
- Helped companies with social media analysis; performed text mining, NLP, and sentiment analysis, and presented the results using link graphs in SAS E-Miner.
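A minimal sketch of the kind of churn model described above, using logistic regression in scikit-learn on synthetic data; the feature names and the churn rule are invented purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic customer features (names and the labeling rule are assumptions)
rng = np.random.default_rng(0)
n = 1000
tenure_months = rng.uniform(1, 60, n)
monthly_spend = rng.uniform(10, 200, n)
# Toy label: short-tenure, low-spend customers are marked as churners
churn = ((tenure_months < 12) & (monthly_spend < 80)).astype(int)

X = np.column_stack([tenure_months, monthly_spend])
X_tr, X_te, y_tr, y_te = train_test_split(X, churn, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
churn_prob = model.predict_proba(X_te)[:, 1]  # per-customer churn probability
```

Scoring probabilities rather than hard labels is what lets churn work feed directly into lead prioritization and promotion targeting.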
Tools: Python, R, SAS, SAS E-Miner, Tableau, Google Analytics, Linux, Hive, Hadoop, Elastic Search.
Confidential
Data Scientist Intern
Responsibilities:
- Increased rate by 20% and saved 30% of the promotional fund by building a high-customer-intent model.
- Assisted the GIS strategic initiatives team in building predictive and prescriptive models using Excel, R, and Python.
- Analyzed customer LTV and created decile groups of the most profitable customers in each insurance premium group; designed promotions geared toward the target groups.
- Built internal business intelligence metrics such as retention rate, Life Time Value (LTV), and CTRs with popular data visualization tools (Tableau dashboards, R charts, R Shiny dashboards, and ggplot2).
- Extracted data from SQL and NoSQL databases such as MySQL, Oracle SQL, and Hadoop using Hive queries; analyzed the data, provided actionable insights, and created managerial reports using Tableau.