Data Scientist Resume
San Diego, CA
EXPERTISE AREAS:
Data science, hands-on coding, temporal and log analysis, SQL, NoSQL, visualization, time-motion studies, operations research, UX, econometrics, algorithms, machine learning, computational statistics, Mathematica, AWS.
EXPERIENCE:
Data Scientist
Confidential, San Diego, CA
Responsibilities:
- Authoring book/videos: "Functional Data Workflow: Practical Methods with a Time-Motion UX Case Study"
- Defining a business model for a project-oriented coding camp/data incubator in collaboration with Wolfram Research
- Developing content for "Machine Learning in Finance" course for Fitch Learning
Consultant
Confidential, Franklin, TN
Responsibilities:
- Full-time consultant for a healthcare fintech startup: data analytics, econometric forecasting, business reporting
- Worked with COO on data analysis and machine learning development and application to operational data
- Implemented software pipelines for client-facing business reports (e.g., projections, financial reserves)
- Responsible (with treasurer) for maintaining code, report integrity, financial reconciliation
- Responsible (with COO) for predictive model evaluation and optimization
- Data extracted from transactional healthcare claims and clinical electronic records
- Regression and time-series models trained on ~50M historical transactions (10-100GB scale)
Tools: Mathematica desktop, Mathematica Online on Amazon AWS; SQL
Statistician
Confidential, La Jolla, CA
Responsibilities:
- Co-led (with PI) scientific research project planning & development: time-motion/UX/OR studies in clinical settings
- Responsible for securing a total of ~$10M in competitive federal funding through 3 multiyear (R01) studies
- Lead analyst: responsible for data management, security, pilot, dictionaries, methods, scientific communication
- Responsible (with project manager) for IRB protocol compliance
- Responsible for quarterly progress and final reports to funding agencies
- Supervised analysis team; trained and mentored assistants and analysts (dictionaries, code review)
- Developed and organized workshops to disseminate findings to technical and clinical staff (Confidential, VA)
- Participated as SME in Veterans Affairs national initiatives (Hi2, Vista Evolution/eHMP) to redesign enterprise IT
- Defined requirements for commercial proposals (IBM, others) for VA Usability Analytics platform
- Helped prototype the Active Notes app combining medication order entry & documentation (VE/eHMP)
- Led "Clinical Event Browser" app prototype and visualization study funded by VA Human Factors Engineering
- Developed a "Functional Data API" (application programming interface) based on these studies/methods
Tools: Mathematica, Google and Dropbox cloud sharing, Excel, misc storage formats
Senior Statistician
Confidential
Responsibilities:
- Participated in I2B2 NLP (Natural Language Processing) project for processing temporal events in healthcare
- Developed graph-theoretical methods and visualizations to help guide project participants
Tools: Mathematica, Dropbox file sharing, misc formats
Statistician
Confidential
Responsibilities:
- Data analyst/scientist on a preliminary time-motion/UX study of doctor-patient communication in real-world outpatient settings, based on video capture of visit activity
- Developed data analysis methods leading to scaled-up studies
Tools: Mathematica, misc file formats
Postdoctoral Research
Confidential, La Jolla, CA
Responsibilities:
- Developed survival models in oncology for select cancer types
- Responsible for model evaluation and selection, scientific communication
- Data in CSV/Excel spreadsheets extracted based on expert-graded imaging and biomarkers
- Feature modeling, dimensionality reduction (tSVD), similarity search, k-means, graph-based clustering, regression using resampling (bootstrap analysis)
Tools: Mathematica, various data storage formats (Excel, tabular, SQL)
Statistician
Confidential
Responsibilities:
- Forensic fault detection in experimental communication systems
- Developed network analysis methods, temporal clustering, sequence analysis
- Data extracted from logs of large-scale disaster response drills
