



BIOEN 6900-003: Data Science for Bioengineers
- For postdoctoral fellows, graduate and advanced undergraduate students in Engineering, Sciences, and Medicine, and for professionals in industry.
- Fall 2019, Mondays and Wednesdays 11:50am–1:10pm, LCB 115.
Prerequisites: Some programming experience and instructor approval.
Grading: 30% labs, 30% presentation, 30% class project, 10% class participation. Class attendance is required. Late assignments are not accepted.
Topics:
We will cover concepts in data science and machine learning, and their applications to the discovery of principles from biomedical data.
- Databases, from the Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC) to the Utah Population Database (UPDB).
- Data types, from omics, imaging, and patient clinical information to biomedical samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications, from the Luria-Delbrück experiment to personalized cancer diagnostics, prognostics, and therapeutics.
Skills:
- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.
Activities:
- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.
- Syllabus
- Fall 2019 Calendar
- Safety
- Health, Wellness, and Counseling
- Student Code
August 26:
Numerical Linear Algebra, Trefethen and Bau, III (1997).
August 28:
Mathematical properties of the SVD
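For reference, the standard statement of the (thin) SVD and a few of its well-known properties can be written as below. The notation is an assumption made here for illustration and may differ from the notation used in class.

```latex
\begin{align}
A &= U \Sigma V^{T}, \qquad A \in \mathbb{R}^{m \times n},\quad
     U^{T}U = I,\quad V^{T}V = I,\quad
     \Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_r),\quad r = \min(m, n),\\
\sigma_1 &\ge \sigma_2 \ge \dots \ge \sigma_r \ge 0,\\
A^{T}A &= V \Sigma^{2} V^{T}, \qquad AA^{T} = U \Sigma^{2} U^{T},\\
\|A\|_F^{2} &= \sum_{k=1}^{r} \sigma_k^{2}, \qquad \|A\|_2 = \sigma_1 .
\end{align}
```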
In-Class Work on Lab 1:
Code the SVD of synthetic data and its visualization. Test and debug your code.
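As a starting point, a minimal sketch of this kind of exercise might look like the following. It assumes Python with NumPy, which the course does not prescribe, and the synthetic data set (two sinusoidal patterns plus noise, with made-up sizes) is purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Hypothetical synthetic data: 200 rows x 12 columns, built from two
# known column patterns plus a small amount of Gaussian noise.
n_rows, n_cols = 200, 12
t = np.linspace(0.0, 2.0 * np.pi, n_cols)
patterns = np.vstack([np.sin(t), np.cos(t)])       # shape (2, n_cols)
loadings = rng.normal(size=(n_rows, 2))            # shape (n_rows, 2)
data = loadings @ patterns + 0.01 * rng.normal(size=(n_rows, n_cols))

# Thin SVD: data = U @ diag(s) @ Vt, with s sorted in decreasing order.
U, s, Vt = np.linalg.svd(data, full_matrices=False)

# Basic checks that double as tests of the code.
reconstruction = (U * s) @ Vt
assert np.allclose(reconstruction, data)
assert np.allclose(U.T @ U, np.eye(n_cols))

# Fraction of the total squared signal captured by each component.
fractions = s**2 / np.sum(s**2)
print(np.round(fractions[:4], 4))
```

For the visualization part, the `fractions` array could be shown as a bar chart and the rows of `Vt` as line plots, for example with a plotting library such as Matplotlib. That choice is likewise an assumption, not a course requirement.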
August 29, Thursday, 10:00–11:00am, WEB 3780, in lieu of any one Lab:
September 2:
September 4:
September 9:
Composition and decomposition of synthetic data:
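One way to illustrate the idea, as a sketch only, is to compose a matrix from chosen orthonormal vectors and singular values and then check that the SVD recovers them. The sizes, seed, and singular values below are arbitrary assumptions, and Python with NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Composition: choose orthonormal left and right vectors (via QR of
# random matrices) and a set of known, distinct singular values.
m, n, rank = 6, 4, 3
U_true, _ = np.linalg.qr(rng.normal(size=(m, rank)))
V_true, _ = np.linalg.qr(rng.normal(size=(n, rank)))
s_true = np.array([5.0, 2.0, 0.5])

# The matrix is a weighted sum of rank-1 outer products.
A = sum(s_true[k] * np.outer(U_true[:, k], V_true[:, k]) for k in range(rank))

# Decomposition: the SVD should return the same building blocks.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular vectors are only defined up to a sign, so compare the
# absolute value of the overlap between recovered and true vectors.
overlap_U = np.abs(np.sum(U[:, :rank] * U_true, axis=0))
overlap_V = np.abs(np.sum(Vt[:rank].T * V_true, axis=0))

print(np.round(s, 6))
```

Because the chosen singular values are distinct, each recovered vector should match its true counterpart up to sign, and the remaining singular value should be zero up to round-off.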

September 11:
September 13, Friday, 8:00am–6:15pm, SMBB 2650, in lieu of any one Lab:
September 16:
Los Alamos National Laboratory, Sandia National Laboratories, NSF, and University of California San Diego Workshop on Artificial Intelligence and Tensor Factorizations for Physical, Chemical, and Biological Systems (Santa Fe, NM, September 17–20, 2019).
September 18:
National Cancer Institute (NCI) Physical Sciences in Oncology Symposium (Minneapolis, MN, September 18–20, 2019).
September 23:
September 25:
In-Class Work on Lab 2:
Compute and visualize the SVD of your data. Test and debug your code. Interpret your data based upon its SVD. Use at least two different approaches each for preprocessing and sorting your data and for assessing the statistical significance of your interpretation.
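As one possible illustration, and not the lab's required method, the sketch below combines a single preprocessing choice (centering each row) with a single significance check (a permutation test on the fraction captured by the largest singular value). The helper names `top_fraction` and `permutation_p_value`, the stand-in data, and the use of Python with NumPy are all assumptions made here.

```python
import numpy as np

def top_fraction(matrix):
    """Fraction of the total squared signal captured by the largest singular value."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)

def permutation_p_value(matrix, n_permutations=200, seed=0):
    """Compare the observed top fraction with row-wise shuffled versions of the data."""
    rng = np.random.default_rng(seed)
    observed = top_fraction(matrix)
    count = 0
    for _ in range(n_permutations):
        # Shuffle the entries of each row independently, which destroys
        # any column pattern shared across rows.
        shuffled = np.array([rng.permutation(row) for row in matrix])
        if top_fraction(shuffled) >= observed:
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (count + 1) / (n_permutations + 1)

# Hypothetical stand-in for measured data: one shared column pattern plus noise.
rng = np.random.default_rng(2)
pattern = np.linspace(-1.0, 1.0, 10)
data = np.outer(rng.normal(size=50), pattern) + 0.1 * rng.normal(size=(50, 10))

# Preprocessing choice: center each row at its mean.
centered = data - data.mean(axis=1, keepdims=True)

observed, p_value = permutation_p_value(centered)
print(round(observed, 3), round(p_value, 4))
```

A second preprocessing approach (for instance, centering columns instead of rows) and a second significance measure would be needed to meet the lab's "at least two" requirement. This sketch covers only one of each.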
September 30:
October 2:
October 7 and 9:
October 21, Monday, 4:00–6:00pm, WEB 3780:
October 23:
More examples of HOSVD of measured data:
Paper 12: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al., Proc Natl Acad Sci USA (2007).
Paper 13: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al., Philos Trans R Soc Lond B Biol Sci (2009).
Paper 14: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al., PLoS Comp Bio (2011).
Paper 15: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al., BMC Genomics (2013).
Paper 16: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al., J Am Med Inform Assoc (2015).
October 24, Thursday, 8:30–11:00am, HSEB, in lieu of any one Lab:
October 28:
Computation of the HOSVD:
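A common textbook way to compute the HOSVD is to take the SVD of each mode unfolding of the tensor and then project the tensor onto the resulting left singular vectors. The sketch below follows that route under the assumption of Python with NumPy. The helper names `unfold`, `mode_product`, and `hosvd` are chosen here for illustration, and the lecture may present the computation differently.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize: bring axis `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` (shape J x I_mode) along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # shape (..., I_mode)
    result = moved @ matrix.T               # shape (..., J)
    return np.moveaxis(result, -1, mode)

def hosvd(tensor):
    """Return the core tensor and one orthogonal factor matrix per mode."""
    factors = [
        np.linalg.svd(unfold(tensor, mode), full_matrices=False)[0]
        for mode in range(tensor.ndim)
    ]
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors

# Hypothetical third-order tensor with arbitrary sizes.
rng = np.random.default_rng(4)
tensor = rng.normal(size=(4, 5, 6))

core, factors = hosvd(tensor)

# Reconstruct by multiplying the core with each factor along its mode.
reconstruction = core
for mode, U in enumerate(factors):
    reconstruction = mode_product(reconstruction, U, mode)

print(np.allclose(reconstruction, tensor))
```

One property worth checking on the result is the all-orthogonality of the core: for each mode, the rows of the unfolded core are mutually orthogonal.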

October 28, Monday, 4:00–6:00pm, WEB 3780:
October 30, Wednesday, 12:15–1:20pm, Intermountain Medical Center, 5121 S. Cottonwood St., Murray, in lieu of any one Lab:
Keynote of the 12th Annual Huntsman-Intermountain Cancer Care Program Conference
The Origins of Cancer
Robert A. Weinberg, Ph.D.
Ludwig Professor for Cancer Research and Member of the Whitehead Institute
Massachusetts Institute of Technology (MIT)
November 4:
From the SVD to PCA:
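The standard link between the two can be sketched as follows: PCA of a samples-by-features matrix is the SVD of that matrix after each column has been centered, and the squared singular values divided by (number of samples minus 1) equal the eigenvalues of the sample covariance matrix. The data below are made up, and Python with NumPy is assumed.

```python
import numpy as np

# Hypothetical data: 100 samples x 3 correlated features.
rng = np.random.default_rng(3)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(100, 3)) @ mixing

# Step 1: center each feature (column) at its mean.
Xc = X - X.mean(axis=0)

# Step 2: thin SVD of the centered data.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Principal axes are the rows of Vt. The scores are the projections
# of the centered data onto those axes.
scores = U * s
explained_variance = s**2 / (X.shape[0] - 1)

# Cross-check against the eigendecomposition of the covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)          # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

print(np.round(explained_variance, 4))
print(np.round(eigvals, 4))
```

The two printed lines should agree, and the principal axes should match the covariance eigenvectors up to sign.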

November 6:
November 11:
"State of the project" presentations
November 13:
"State of the project" presentations
November 18:
Tensor SVD of measured data:

Tensor SVD of synthetic data:

November 20:
The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:
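For orientation, a minimal sketch of the classic perceptron learning rule is shown below in plain Python. It illustrates only the single-layer model itself and does not attempt to reproduce the lecture's framing of the perceptron as a variation on the SVD. The function names and the toy logical-AND data set are assumptions made for this example.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, learning_rate=1.0, max_epochs=100):
    """Perceptron rule: adjust weights only when a sample is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            update = learning_rate * (y - predict(weights, bias, x))
            if update != 0:
                weights = [w + update * xi for w, xi in zip(weights, x)]
                bias += update
                errors += 1
        if errors == 0:   # every sample classified correctly
            break
    return weights, bias

# Toy, linearly separable problem: logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

The learning rule is guaranteed to converge only when the two classes are linearly separable, as they are in this toy example.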

November 25:
e-Guest Lecture:
Careers in Data Science: Making the World a Better Place
Sri Priya Ponnapalli, Ph.D.
Principal Data Scientist and Sports ML Manager, Amazon AI (Palo Alto, CA),
Faculty, Rutgers Business School (Newark, NJ), and
CEO and Co-Founder, Eigengene, Inc. (Palo Alto, CA)
November 27:
Mathematical variations on the SVD and PCA:

November 28:
December 2:
The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:

December 4:
End-of-class celebration!
Project update presentations
Happy Winter Break!