Data Scientist/Data Analyst Resume

Oakland, CA


  • Data scientist who takes pride in building models that translate data points into business insights. Experienced in executing data-driven solutions that increase the efficiency, accuracy, and utility of data processing. Experienced in building regression models, developing predictive models, and analyzing data mining algorithms to deliver insights and implement performance-oriented solutions to complex business problems.
  • Data and Quantitative Analysis
  • Big Data Queries and Interpretation
  • Decision Analytics
  • Data Mining and Visualization Tools
  • Predictive Modeling
  • Machine Learning Algorithms
  • Data Governance and Management
  • Business Intelligence


Languages: C, Python, SQL, PL/SQL, R/RStudio

Systems: Microsoft Windows, Linux

Databases: Oracle 10g/11g/12c, DB2, Pig, Hive, MySQL, MongoDB

Big Data Technologies: Spark, Kafka, Hadoop

Other Tools: SAS, Tableau, Collibra

Productivity Software: Microsoft Access, Word, Excel, PowerPoint


Data Scientist/Data Analyst

Confidential, Oakland, CA

  • Extracted and cleaned large volumes of healthcare data to understand the quality, completeness, and appropriate use of data across dozens of sources.
  • Worked with statisticians, business managers, and software engineers to formulate and scope questions and translate knowledge into care transformation.
  • Established links across existing data sources and processed the large datasets needed for complex research and operational studies.
  • Developed algorithms and predictive models to solve critical health service problems.
  • Created pipelines for data ingestion from various channels using Kafka.

Data Scientist/Data Analyst

Confidential, San Rafael, CA

  • Designed the Data Marts in dimensional data modeling using star and snowflake schemas.
  • Participated in all phases of data mining: data collection, data cleaning, model development, validation, visualization, and gap analysis of data coming from multiple sources, using Spark, Python, Hive, and Pig.
  • Created logical and physical dimensional models for Hadoop using data schema methodologies.
  • Created Datasets, DataFrames, and RDDs, and transferred warehouse data into HDFS.
  • Migrated projects to the Hadoop platform and took part in Hadoop cluster migration activities.

Data Scientist/Data Analyst

Confidential, San Rafael, CA

  • Provided statistical programming and analysis for healthcare policy and claims research.
  • Conducted quantitative and qualitative studies, including case-control, cross-sectional, meta-analysis, and retrospective studies.
  • Developed SAS macros, templates and utilities for data cleaning and reporting.
  • Performed statistical analysis of data using SAS procedures and R packages.
  • Created SAS programs to generate tables, listings, figures, and analysis datasets.

Data Governance Analyst

Confidential, San Jose, CA

  • Participated in the Data Governance working group sessions to create Data Governance Policies.
  • Gathered requirements by working with the business users on Business Glossary, Data Dictionary, Reference data, and Policy Management.
  • Performed an end-to-end data lineage assessment and documented it for select lines of business (LOBs).
  • Linked data lineage to data quality and business glossary work within the overall data governance program.
  • Oversaw the configuration of Collibra for reference and master data domains.

Data Analyst - Intern

Confidential, Sacramento, CA

  • Used Excel, SAS, and SQL extensively to collect raw data; procured and formatted data from data warehouses according to the in-house standard conventions for data processing.
  • Built Tableau dashboards that presented financial data across state departments and made it easier to draw insights from that data.
  • Cleaned and transformed data to be fed into the data analytics engine using Python along with MongoDB and MySQL databases.
  • Participated in requirements meetings and data mapping sessions to interpret business strategy needs.
  • Procured data from document retention systems and analyzed it with inferential statistics in R and Python.