
Data Science Analyst Resume


Denver, CO

SUMMARY

  • Data Analyst with 5 years of experience in data mining, machine learning & predictive analytics for driving business solutions.
  • Professional working experience in Data Analytics using Microsoft SQL Server, Spark SQL, Power BI, Data Lake, Stream Analytics, Application Insights Analytics, Regression, and Excel.
  • Advanced experience in Business Objects, SQL, and Microsoft Applications, including Excel and Access.
  • Experienced in writing complex, business-centric stored procedures and analytics queries (especially in Power BI).
  • Experience in data mining using Spark and Hive SQL.
  • Skilled in converting telemetry data into meaningful business insights using Power BI.
  • Experience analyzing online user behavior, conversion data (A/B testing), customer journeys, and funnel analysis.
  • Deep understanding of Software Development Life Cycle (SDLC) as well as Agile/Scrum methodology to accelerate Software Development iteration.
  • Hands on experience in writing queries in SQL and R to extract, transform and load (ETL) data from large datasets using Data Staging.
  • Extensive experience creating rich visualizations and dashboards in Tableau, and preparing user stories to deliver actionable insights through compelling dashboards.
  • Experience in design techniques such as Snowflake and Star schemas, fact and dimension tables, and logical and physical modeling.
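As an illustrative sketch of the ETL-with-staging workflow described above (the data, table name, and column names here are invented for illustration; Python's stdlib sqlite3 stands in for a production SQL Server environment):

```python
import sqlite3

# Hypothetical raw extract: untrimmed names, string amounts.
raw_rows = [("  Alice ", "2023-01-05", "120.50"), ("Bob", "2023-01-06", "80.00")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stage_orders (customer TEXT, order_date TEXT, amount REAL)")

# Transform: trim whitespace, cast amounts to float before loading.
clean = [(name.strip(), date, float(amount)) for name, date, amount in raw_rows]
conn.executemany("INSERT INTO stage_orders VALUES (?, ?, ?)", clean)

# Query the staging table as a downstream report would.
total = conn.execute("SELECT SUM(amount) FROM stage_orders").fetchone()[0]
print(total)  # → 200.5
```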

TECHNICAL SKILLS

Tools & Technologies: Python (SciPy, NumPy, Pandas, Scikit-learn, Jupyter Notebook), C, R Programming, Shell Scripting, SQL (Advanced), Object Oriented Programming (C++/Java), JSON, Web Scraping, SAS (E-Miner & E-Guide), SSIS, SSRS, Apache Spark, Hadoop, Oracle DB, Excel (Advanced), Tableau, JMP Pro, Power BI, Rally, Jira, Adobe Analytics, Google Analytics.

Data Science: Supervised learning, unsupervised learning, Feature Engineering, Text Analytics

Math & Statistics: Binomial and Multinomial distributions, Univariate and Multivariate Statistics, Sampling, Hypothesis Testing, Confidence Intervals, Naïve Bayes, Likelihood functions, Probabilistic Classifiers etc.

Selected coursework: Statistics for Data Science, Computing & Statistics, Applied Machine Learning, Advanced Business Analytics, Cloud Computing, Optimization Techniques, Project & Operations Management, Supply Chain Management, Social Media Marketing and Analysis.

PROFESSIONAL EXPERIENCE

Confidential, Denver, CO

DATA SCIENCE ANALYST

Responsibilities:

  • Used SQL to extract and transform data from the Freud environment and load the structured data into the Smart Care environment, reducing service calls from 6,000 to 4,000.
  • Conducted queries via the Partners EHR/EMR system and output results to a SQL Server database as part of the Readmission Project.
  • Developed an algorithm in Python to convert insurance-oriented ICD-9 codes into clinically meaningful disease classifications.
  • Conducted data analysis using logistic regression, KNN, and random forest methods to identify high readmission-risk patients, improving accuracy (C-scores) by 30 percent.
  • Extracted Twitter data using Python and performed text mining with BeautifulSoup (Python) and SAS E-Miner to improve AHN facilities, which increased the occupancy rate by 12%.
  • Built complex SQL reports to audit $2.5 million of pay and insurance benefits for over 150 individual records.
  • Designed and developed various analytical reports from multiple data sources by blending data on a single Worksheet in Tableau Desktop.
  • Involved in the planning phase of internal & external table schemas in Hive with appropriate static and dynamic portions for efficiency.
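A minimal sketch of the KNN classification approach listed above for flagging high readmission-risk patients (pure Python; the feature names and toy data below are invented, and the actual work would have used far richer clinical variables):

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical patient features: [age_in_decades, prior_admissions]
train_X = [[5, 0], [6, 1], [7, 4], [8, 5], [4, 0], [9, 6]]
train_y = [0, 0, 1, 1, 0, 1]   # 1 = high readmission risk

print(knn_predict(train_X, train_y, [8, 4]))  # → 1 (high risk)
print(knn_predict(train_X, train_y, [5, 1]))  # → 0 (low risk)
```

In practice this is the same voting logic that scikit-learn's KNeighborsClassifier applies, with distance metrics and k tuned by cross-validation.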

Confidential, Lake Shore, MN

DATA ANALYST

Responsibilities:

  • Responsible for analytical data needs, handling of complex data requests, reports, and predictive data modeling.
  • Designed ad-hoc queries with SQL in Cognos ReportNet. Examined reports and presented findings in PowerPoint and Excel.
  • Used Anomaly & Fraud detection techniques with SAS E-Miner for the Confidential client resulting in reduction of 22% of fraudulent cases.
  • Reported fraud, missed transactions, forecasts, and user behavior using Tableau in weekly cross-functional team meetings to drive continuous process improvement.
  • Implemented Agile Scrum practices for project implementation, which reduced project touch time by 300 man-hours and cut costs by $30,000/year.
  • Conducted statistical analysis to drive brand decision-making and survey development, resulting in 4 new projects from business partners.
  • Evaluated performance of 300+ stores for Nielsen clients based on key metrics and identified opportunities to enable stores to meet and exceed their financial targets through increased sales.
  • Created a Data Lake by extracting customer data from various sources into HDFS, including Teradata, Mainframes, RDBMS, CSV, and Excel.
  • Optimized SQL scripts and designed efficient queries for data retrieval.
  • Developed the SQL table schema for the effective storage of the customer data.
  • Prepared design documents and unit- and integration-test documents.
  • Developed an internal web-scraper tool to inspect ad hosting on websites using the google, urllib, and BeautifulSoup packages in Python.
  • Two Sigma Financial Modeling: Worked on a Kaggle challenge leveraging data science and business analytics tools in financial markets as a semester project; coding was done in Python, with visualizations in Power BI.
  • Fraud Detection: Applied anomaly detection to flag outliers in financial transactions for Confidential clients on the Spark framework, which allows large-scale data processing; detected anomalies were reported to downstream teams for further action on the client accounts.
  • Hospital Readmission Project: Visualized patterns in a hospital readmission dataset in Tableau, then built a predictive model in R to predict readmission risk.
  • Behavioral Analysis: Analyzed more than 100K patient records for early readmission risk using PySpark and Spark's Machine Learning Library (MLlib).
  • Surgical Schedule Optimization: Designed optimal surgical scheduling and staff planning for a medical college by building a generalized linear model with the AMPL optimization tool, yielding a 10% reduction in under-allocated operating hours.
  • Revenue Analysis: Worked on movie revenue datasets and devised a dynamic forecasting model using stepwise regression, KNN, and ensemble techniques; the average ensemble model results are 92.3% accurate.
  • Twitter Sentiment Analysis using Python and NLTK: Implemented sentiment analysis of tweets about mobile carriers using NLTK sentiment analysis and the Twitter API.
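A toy illustration of the lexicon-based tweet scoring in the spirit of the NLTK sentiment project above (the word lists and sample tweets are invented; NLTK's VADER analyzer uses a much larger weighted lexicon with negation and intensifier handling):

```python
# Tiny hand-built lexicon; a simplified stand-in for NLTK's VADER lexicon.
POSITIVE = {"great", "good", "love", "fast", "reliable"}
NEGATIVE = {"bad", "slow", "dropped", "terrible", "hate"}

def sentiment(tweet):
    """Score a tweet as positive/negative/neutral by counting lexicon hits."""
    words = tweet.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("Love the fast reliable coverage"))        # → positive
print(sentiment("Dropped calls and slow data, terrible"))  # → negative
```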
