



BME 6780: Data Science for Bioengineers
- For postdoctoral, graduate, and advanced undergraduate students in Engineering, Sciences, and Medicine, and professionals in industry.
- Fall 2022, Mondays and Wednesdays 11:50am–1:10pm, LCB 115 and Zoom; office hours by request or Wednesdays 11:00am, WEB 3803.
Prerequisites: Some programming experience and instructor approval.
100% grade = 30% labs, 30% class project, 30% presentation, 10% class participation; late assignments are not accepted; class attendance is required.
Topics:
We will cover concepts in data science and machine learning, and their applications to discovery of principles from biomedical data.
- Databases, e.g., the Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC).
- Data types, from, e.g., omics, imaging, and patient clinical information to, e.g., tissue samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications toward a better understanding of biology, e.g., the Luria-Delbrück experiment, and a better practice of medicine, e.g., personalized cancer diagnostics, prognostics, and therapeutics.
Skills:
- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.
Activities:
- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.
Readings on the SVD and deep learning:
- Syllabus
- COVID-19
- Fall 2022 Calendar
- Safety
- Health, Wellness, and Counseling
- Student Code
August 22:
August 24:
Lab 1:
Code the SVD or the tensor SVD of synthetic data and its visualization. Test and debug your code.
August 29:
Numerical Linear Algebra, Trefethen and Bau, III (1997).
Matrix Computations, Golub and Van Loan (1996).
August 31:
Composition and decomposition of synthetic data:
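As an illustration only, and not the course's own lab solution, the following minimal sketch composes a synthetic matrix from singular vectors and singular values chosen in advance, then decomposes it and checks that the SVD recovers what was put in. The sizes, the seed, the singular values, and the helper name `compose` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def compose(u, s, vt):
    """Compose a matrix as the sum of rank-1 terms s[k] * outer(u[:, k], vt[k])."""
    return (u * s) @ vt


# Assumed sizes: 8 rows, 5 columns, rank 3.
m, n, r = 8, 5, 3

# Orthonormal columns obtained from the QR factorization of random matrices.
u, _ = np.linalg.qr(rng.standard_normal((m, r)))
v, _ = np.linalg.qr(rng.standard_normal((n, r)))
s = np.array([5.0, 2.0, 0.5])  # chosen singular values, in decreasing order

data = compose(u, s, v.T)

# Decompose the synthetic data and compare with the known inputs.
u_hat, s_hat, vt_hat = np.linalg.svd(data, full_matrices=False)
reconstruction_error = np.linalg.norm(data - compose(u_hat, s_hat, vt_hat))

print("recovered singular values:", np.round(s_hat, 6))
print("reconstruction error:", reconstruction_error)
```

The recovered singular vectors can differ from the inputs by a sign, so a test of the vectors should compare absolute values of inner products rather than the entries themselves.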


September 5:
September 7:
Testing and debugging your SVD code
September 12:
More examples of the SVD of measured data:
Paper 5: Singular Value Decomposition for Genome-Wide Expression Data Processing and Modeling, Alter et al., Proceedings of the National Academy of Sciences (PNAS) USA (2000).
Patent 1: Method for Node Ranking in a Linked Database, Page, United States Patent (2001).
Paper 6: A Rapid Genome-Scale Response of the Transcriptional Oscillator to Perturbation Reveals a Period-Doubling Path to Phenotypic Change, Li and Klevecz, Proceedings of the National Academy of Sciences (PNAS) USA (2006).
Paper 7: Coordinated Metabolic Transitions During Drosophila Embryogenesis and the Onset of Aerobic Glycolysis, Tennessen, Bertagnolli et al., G3: Genes, Genomes, Genetics (2014).
September 14:
National Cancer Institute (NCI) Physical Sciences in Oncology Network (PS-ON) Virtual Symposium
September 19:
Mathematics of a tensor SVD, the higher-order SVD (HOSVD):

Computation of the HOSVD:
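One common way to compute the HOSVD, sketched here under assumptions rather than as the method presented in class, is to take the SVD of each mode unfolding of the tensor and then project the tensor onto the resulting bases to obtain the core tensor. The function names `unfold`, `mode_product`, `hosvd`, and `reconstruct`, and the random test tensor, are hypothetical choices for this example.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize: bring `mode` to the front and flatten the remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along the given mode."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of the mode unfolding form that mode's basis.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Multiply the core tensor by every factor matrix along its mode."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


rng = np.random.default_rng(0)
tensor = rng.standard_normal((4, 5, 6))  # assumed synthetic third-order tensor
core, factors = hosvd(tensor)
error = np.linalg.norm(reconstruct(core, factors) - tensor)
print("core shape:", core.shape, "reconstruction error:", error)
```

Two checks are useful when debugging such code: the factor matrices should have orthonormal columns, and the rows of each unfolding of the core tensor should be mutually orthogonal.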


September 21:
September 26:
2022 Society for Industrial and Applied Mathematics (SIAM) Conference on Mathematics of Data Science (MDS22)
September 28:
Lab 2:
Compute and visualize the SVD or the tensor SVD of your data. Interpret your data based upon its SVD or its tensor SVD. Use at least two different approaches each for preprocessing and sorting your data and for assessing the statistical significance of your interpretation.
From the SVD to PCA:

October 3:
The SVD is used for the stable computation of PCA:
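A minimal sketch of this relationship, assuming rows are samples and columns are variables: the principal components are the right singular vectors of the centered data matrix, and the variances are the squared singular values divided by the number of samples minus one. Working from the centered data directly avoids forming the covariance matrix, which squares the condition number. The function names and the synthetic data below are assumptions for this example.

```python
import numpy as np


def pca_via_svd(data):
    """PCA from the SVD of the centered data (rows = samples, columns = variables)."""
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s**2 / (data.shape[0] - 1)
    scores = u * s  # coordinates of each sample along the principal components
    return vt, variances, scores


def pca_via_covariance(data):
    """PCA from the eigendecomposition of the sample covariance, for comparison."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / (data.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # sort by decreasing variance
    return eigvecs[:, order].T, eigvals[order]


rng = np.random.default_rng(0)
# Assumed synthetic data: 50 samples of 4 variables with well-separated scales.
data = rng.standard_normal((50, 4)) * np.array([4.0, 2.0, 1.0, 0.5])

components, variances, scores = pca_via_svd(data)
cov_components, cov_variances = pca_via_covariance(data)
print("variances from the SVD:       ", np.round(variances, 4))
print("variances from the covariance:", np.round(cov_variances, 4))
```

The two routes agree on the variances, and on the components up to a sign per component.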

October 5:
October 10:
October 12:
October 17:
October 19:
October 24:
October 26:
Mathematical variations on the SVD and PCA for blind source separation (BSS):

October 31:
The tensor SVD of measured data:

November 2:
November 7:
November 9:
More examples of the HOSVD of measured data:
Paper 24: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al., Proceedings of the National Academy of Sciences (PNAS) USA (2007).
Paper 25: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al., Philosophical Transactions of the Royal Society B Biological Sciences (2009).
Paper 26: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al., Public Library of Science (PLoS) Computational Biology (2011).
Paper 27: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al., BMC Genomics (2013).
Paper 28: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al., Journal of the American Medical Informatics Association (2015).
November 14:
Supercomputing 2022 (SC22) 8th NCI Computational Approaches for Cancer Workshop (CAFCW)
November 16:
November 19:
2022 University of Utah Engineering Day
November 21:
Verification and validation
November 23:
The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:

November 24:
November 28:
"State of the project" presentations
November 30:
"State of the project" presentations
December 5:
"State of the project" presentations
December 7:
"State of the project" presentations
End-of-class celebration!
Happy Winter Break!
May be pivotal to your career: