BME 6780: Data Science for Bioengineers
- For postdoctoral, graduate, and advanced undergraduate students in Engineering, Sciences, and Medicine, and professionals in industry.
- Fall 2023, Mondays and Wednesdays 11:50am–1:10pm, LCB 115 and Zoom; office hours by request or Wednesdays 11:00am, WEB 3803.
Prerequisites: Some programming experience and instructor approval.
Grading: 30% labs, 30% class project, 30% presentation, 10% class participation. Late assignments are not accepted, and class attendance is required.
Topics:
We will cover concepts in data science and machine learning, and their applications to discovery of principles from biomedical data.
- Databases, e.g., The Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC).
- Data types, ranging from omics, imaging, and patient clinical information to tissue samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications toward a better understanding of biology, e.g., the Luria-Delbrück experiment, and a better practice of medicine, e.g., personalized cancer diagnostics, prognostics, and therapeutics.
Skills:
- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.
Activities:
- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.
Readings on the SVD and deep learning:
- Syllabus
- COVID-19
- Fall 2023 Calendar
- Safety
- Health, Wellness, and Counseling
- Student Code
August 21:
August 23:
Lab 1:
Code the SVD or the tensor SVD of synthetic data and its visualization. Test and debug your code.
August 28:
Numerical Linear Algebra, Trefethen and Bau, III (1997).
Notebook 1: Computation and Visualization of the SVD
February 29:
Matrix Computations, Golub and Van Loan (1996).
August 30:
Composition and decomposition of synthetic data:
Notebook 2: The SVD of Synthetic Data
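As a minimal sketch of composing and then decomposing synthetic data (not the contents of Notebook 2), the example below builds a rank-2 matrix from two known patterns, adds a little noise, and recovers the structure with NumPy's `numpy.linalg.svd`. The sizes (50 rows, 8 columns), the cosine and sine patterns, the noise level, and the random seed are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) so the run is repeatable

# Compose: two hypothetical patterns across 8 "samples" (columns).
t = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
patterns = np.vstack([np.cos(t), np.sin(t)])            # shape (2, 8)

# Hypothetical loadings of 50 "genes" (rows) on the two patterns;
# the first pattern is given about five times the weight of the second.
loadings = rng.normal(size=(50, 2)) * np.array([5.0, 1.0])

data = loadings @ patterns                               # exact rank-2 matrix
noisy = data + 0.01 * rng.normal(size=data.shape)        # small additive noise

# Decompose: thin SVD, noisy = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)

# Fraction of the overall squared signal captured by each component.
fraction = s**2 / np.sum(s**2)

# Rank-2 reconstruction, compared with the noise-free data.
rank2 = (U[:, :2] * s[:2]) @ Vt[:2, :]
rel_error = np.linalg.norm(rank2 - data) / np.linalg.norm(data)

print("fractions:", np.round(fraction, 4))
print("relative error of rank-2 reconstruction:", rel_error)
```

Because the data were composed from two patterns, nearly all of the squared signal should fall in the first two components, and the remaining six should reflect only the added noise.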
September 4:
September 6:
Testing and debugging your SVD code
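One possible way to test SVD code is to check the defining properties of the decomposition rather than compare against a stored answer. The sketch below assumes a thin SVD returned as `U`, `s`, `Vt` (the convention of `numpy.linalg.svd`); the helper name `check_svd` and the tolerance are hypothetical choices.

```python
import numpy as np

def check_svd(A, U, s, Vt, tol=1e-10):
    """Return named pass/fail checks for a thin SVD, A = U @ diag(s) @ Vt."""
    k = len(s)
    return {
        # The factors multiply back to the original matrix.
        "reconstructs": bool(np.allclose((U * s) @ Vt, A, atol=tol)),
        # Columns of U and rows of Vt are orthonormal.
        "U_orthonormal": bool(np.allclose(U.T @ U, np.eye(k), atol=tol)),
        "V_orthonormal": bool(np.allclose(Vt @ Vt.T, np.eye(k), atol=tol)),
        # Singular values are nonnegative and sorted in decreasing order.
        "nonnegative": bool(np.all(s >= 0)),
        "sorted": bool(np.all(np.diff(s) <= 0)),
    }

rng = np.random.default_rng(1)       # fixed seed, an assumption
A = rng.normal(size=(6, 4))          # hypothetical test matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

results = check_svd(A, U, s, Vt)

# A deliberately broken decomposition (singular values reversed)
# should fail the reconstruction and ordering checks.
broken = check_svd(A, U, s[::-1], Vt)

print(results)
print(broken)
```

Running the same checks on a deliberately corrupted decomposition, as in the last step, is a quick way to confirm that the tests themselves can fail.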
September 11:
Mathematics of a tensor SVD, the higher-order SVD (HOSVD):
September 13:
Computation of the HOSVD:
Notebook 3: The tensor SVD of Synthetic Data
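The following is a minimal sketch of one common way to compute an HOSVD, assuming the standard construction in which each mode's factor matrix is the left singular basis of the tensor unfolded along that mode. It is not the contents of Notebook 3; the function names `unfold`, `mode_multiply`, `hosvd`, and `reconstruct`, the tensor size, and the seed are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Unfold tensor T into a matrix whose rows index the given mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_multiply(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    moved = np.moveaxis(T, mode, 0)
    out = np.tensordot(M, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(T):
    """Return (core, factors) such that T equals core multiplied by each factor."""
    # One matrix SVD per mode: keep the left singular vectors of each unfolding.
    factors = [
        np.linalg.svd(unfold(T, n), full_matrices=False)[0]
        for n in range(T.ndim)
    ]
    # Project the tensor onto the factor bases to obtain the core tensor.
    core = T
    for n, U in enumerate(factors):
        core = mode_multiply(core, U.T, n)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core tensor by every factor matrix to rebuild the tensor."""
    T = core
    for n, U in enumerate(factors):
        T = mode_multiply(T, U, n)
    return T

rng = np.random.default_rng(3)           # fixed seed, an assumption
T = rng.normal(size=(4, 5, 6))           # hypothetical third-order tensor
core, factors = hosvd(T)
T_rebuilt = reconstruct(core, factors)

print("max reconstruction error:", np.max(np.abs(T_rebuilt - T)))
```

In this construction the slices of the core tensor along any one mode are mutually orthogonal, which can serve as an additional check when debugging.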
September 18:
More examples of the SVD of measured data:
Paper 5: Singular Value Decomposition for Genome-Wide Expression Data Processing and Modeling, Alter et al., Proceedings of the National Academy of Sciences (PNAS) USA (2000).
Patent 1: Method for Node Ranking in a Linked Database, Page, United States Patent (2001).
Paper 6: A Rapid Genome-Scale Response of the Transcriptional Oscillator to Perturbation Reveals a Period-Doubling Path to Phenotypic Change, Li and Klevecz, Proceedings of the National Academy of Sciences (PNAS) USA (2006).
Paper 7: Coordinated Metabolic Transitions During Drosophila Embryogenesis and the Onset of Aerobic Glycolysis, Tennessen, Bertagnolli et al., G3: Genes, Genomes, Genetics (2014).
September 20:
September 25:
September 27:
Lab 2:
Compute and visualize the SVD or the tensor SVD of your data. Interpret your data based upon its SVD or its tensor SVD. Use at least two different approaches each for preprocessing and sorting your data and for assessing the statistical significance of your interpretation.
From the SVD to PCA:
October 2:
The SVD is used for the stable computation of PCA:
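As an illustration of this point, the sketch below computes PCA from the SVD of the mean-centered data matrix and compares the resulting variances with the eigenvalues of the covariance matrix. Working from the SVD avoids forming the covariance matrix explicitly, whose condition number is the square of that of the centered data. The data, the mixing matrix, and the seed are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)    # fixed seed, an assumption

# Hypothetical data: 100 observations (rows) of 3 correlated variables (columns).
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(100, 3)) @ mixing

# Center each variable, then take the thin SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                                   # principal axes, one per row
explained_variance = s**2 / (X.shape[0] - 1)      # variance along each axis
scores = U * s                                    # data in principal-axis coordinates

# Reference route: eigenvalues of the sample covariance matrix, largest first.
cov = np.cov(Xc, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]

print("from SVD:       ", explained_variance)
print("from covariance:", eigvals)
```

The two routes agree on well-conditioned data such as this; the difference is expected to matter mainly when some variances are very small relative to others.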
October 4:
Notebook 4: The Hypergeometric Probability Distribution and P-Value
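A minimal sketch of a hypergeometric P-value, assuming the usual enrichment setting: of `N` items in total, `K` carry an annotation; a subset of `n` items is selected, and `k` of them carry the annotation. The P-value is the probability of observing at least `k` annotated items by chance. The function names and the small numerical example are hypothetical and are not taken from Notebook 4.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k annotated items when n are drawn
    without replacement from N items, K of which are annotated."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypergeom_pvalue(k, N, K, n):
    """Probability of at least k annotated items in the draw."""
    upper = min(K, n)  # cannot observe more annotated items than exist or are drawn
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))

# Hypothetical example: 10 items, 4 annotated, 3 drawn, 2 of the drawn are annotated.
p = hypergeom_pvalue(2, N=10, K=4, n=3)
print(p)
```

In this small example the terms are 36/120 for exactly two annotated items and 4/120 for exactly three, so the P-value is 40/120, or one third.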
October 9:
October 11:
October 16:
October 18:
American Association for Cancer Research (AACR) Special Conference in Cancer Research: Brain Cancer
October 23:
October 25:
October 30:
Mathematical variations on the SVD and PCA for blind source separation (BSS):
November 1:
Notebook 5: The tensor SVD of Measured Data
November 6:
National Cancer Institute (NCI) Joint Meeting of the Cancer Systems Biology Consortium (CSBC) and the Physical Sciences in Oncology Network (PS-ON)
November 8:
National Cancer Institute (NCI) Joint Meeting of the Cancer Systems Biology Consortium (CSBC) and the Physical Sciences in Oncology Network (PS-ON)
November 11:
2023 University of Utah Engineering Day
November 13:
Supercomputing 2023 (SC23) 9th National Cancer Institute (NCI) Computational Approaches for Cancer Workshop (CAFCW)
November 15:
More examples of the HOSVD of measured data:
Paper 24: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al., Proceedings of the National Academy of Sciences (PNAS) USA (2007).
Paper 25: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al., Philosophical Transactions of the Royal Society B Biological Sciences (2009).
Paper 26: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al., Public Library of Science (PLoS) Computational Biology (2011).
Paper 27: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al., BMC Genomics (2013).
Paper 28: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al., Journal of the American Medical Informatics Association (2015).
November 20:
November 22:
Verification and validation
November 23:
November 27:
The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:
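For orientation, here is a minimal sketch of a single-layer perceptron trained with the classic perceptron learning rule. It only illustrates what the model computes and does not attempt to show its relation to the SVD. The training data (the logical AND of two inputs), the function name `train_perceptron`, the learning rate, and the epoch limit are assumptions made for the example.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Train weights w and bias b so that (x @ w + b > 0) predicts the 0/1 label."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            update = lr * (yi - pred)      # zero when the prediction is correct
            if update != 0:
                w += update * xi           # move the boundary toward the example
                b += update
                errors += 1
        if errors == 0:                    # every example classified correctly
            break
    return w, b

# Hypothetical, linearly separable data: the logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y)
predictions = (X @ w + b > 0).astype(int)
print(w, b, predictions)
```

Because the AND data are linearly separable, the learning rule stops after a few passes with all four examples classified correctly; on data that are not linearly separable it would simply run until the epoch limit.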
November 29:
"State of the project" presentations
December 4:
"State of the project" presentations
December 6:
"State of the project" presentations
End-of-class celebration!
Happy Winter Break!
May be pivotal for your career: