- For postdoctoral, graduate, and advanced undergraduate students in Engineering, Sciences, and Medicine, and professionals in industry.
- Fall 2019, Mondays and Wednesdays 11:50am–1:10pm, LCB 115. Prerequisites: some programming experience and instructor approval.
- Syllabus
- Fall 2019 Calendar
- Safety
- Health, Wellness, and Counseling
- Student Code

Grading: 30% labs, 30% presentation, 30% class project, 10% class participation. Class attendance is required. Late assignments are not accepted.

Topics:

We will cover concepts in data science and machine learning, and their applications to the discovery of principles from biomedical data.

- Databases, from the Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC) to the Utah Population Database (UPDB).
- Data types, from omics, imaging, and patient clinical information to biomedical samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications, from the Luria-Delbrück experiment to personalized cancer diagnostics, prognostics, and therapeutics.

Skills:

- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.

Activities:

- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.

August 26:

- Welcome!

- How Bright Promise in Cancer Testing Fell Apart

- The SVD in the news:

If You Liked This, You're Sure to Love That

- PCA for face recognition:

Paper 1: Low-Dimensional Procedure for the Characterization of Human Faces, Sirovich and Kirby

Paper 2: Eigenfaces for Recognition, Turk and Pentland

- Mathematics of the SVD:

- Notebook 1: Computation and Visualization of the SVD

Mathematica Code: Notebook_1.nb

February 29:

- Gene H. Golub's Birthday!

Paper 3: Calculating the Singular Values and Pseudo-Inverse of a Matrix, Golub and Kahan

August 28:

- Mathematical properties of the SVD

- In-Class Work on Lab 1:

Code the SVD of synthetic data and its visualization. Test and debug your code.
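
As a rough illustration of the Lab 1 task, the sketch below builds a small synthetic matrix of known rank, computes its SVD, and checks the result. It is written in Python with NumPy as an assumption, since the course notebooks are in Mathematica; the matrix sizes, the two sinusoidal patterns, and all variable names are hypothetical.

```python
import numpy as np

# Synthetic data: two hypothetical patterns over 50 "time points" mixed into
# 200 "genes". Sizes and names are illustrative, not from the course notebooks.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 50)
patterns = np.vstack([np.sin(t), np.cos(t)])   # 2 x 50
weights = rng.normal(size=(200, 2))            # 200 x 2
data = weights @ patterns                      # 200 x 50, rank 2 by construction

# Thin SVD: data = U @ diag(s) @ Vt, with singular values in decreasing order.
U, s, Vt = np.linalg.svd(data, full_matrices=False)

# Test the factorization and the orthonormality of the singular vectors.
reconstruction = U @ np.diag(s) @ Vt
reconstruction_ok = np.allclose(reconstruction, data)
left_orthonormal = np.allclose(U.T @ U, np.eye(len(s)))
right_orthonormal = np.allclose(Vt @ Vt.T, np.eye(len(s)))

# Only two singular values should be numerically nonzero for rank-2 data.
numerical_rank = int(np.sum(s > 1e-10 * s[0]))
print("numerical rank:", numerical_rank)
```

For the visualization part of the lab, the singular values `s` and the rows of `Vt` could be passed to any plotting library, for example as a bar chart and as line plots.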

August 29, Thursday, 10:00–11:00am, WEB 3780, in lieu of any one Lab:

- Screening of the Amazon Web Services (AWS) Public Sector Webinar

NIH’s Strategic Vision for Data Science: Enabling a FAIR-Data Ecosystem

Susan Gregurick, Ph.D.

Director, Division of Biophysics, Biomedical Technology, and Computational Biosciences

National Institute of General Medical Sciences (NIGMS)

September 2:

- Happy Labor Day!

September 4:

- SVD of Synthetic Data:

- Notebook 2: SVD of Synthetic Data

Mathematica Code: Notebook_2.nb

September 9:

- Composition and decomposition of synthetic data:

September 11:

- Slides 1: Examples of SVD of measured data

- More examples of SVD of measured data:

Paper 4: Singular Value Decomposition for Genome-Wide Expression Data Processing and Modeling, Alter et al.

Patent 1: Method for Node Ranking in a Linked Database, Page

Paper 5: A Rapid Genome-Scale Response of the Transcriptional Oscillator to Perturbation Reveals a Period-Doubling Path to Phenotypic Change, Li and Klevecz

Paper 6: Coordinated Metabolic Transitions During

September 16:

September 18:

September 23:

- Lab 1 Due In-Class

September 25:

- In-Class Work on Lab 2:

Compute and visualize the SVD of your data. Test and debug your code. Interpret your data based upon its SVD. Use at least two different approaches each for preprocessing and sorting your data and for assessing the statistical significance of your interpretation.
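
One possible shape for the Lab 2 steps is sketched below: a preprocessing choice (row centering), a summary of how much of the data each singular value captures, and a permutation test of the leading component. This is a minimal sketch in Python with NumPy, assumed here in place of the course's Mathematica; the data are simulated, the helper names `svd_summary` and `permutation_p_value` are hypothetical, and it shows only one approach per step where the lab asks for at least two.

```python
import numpy as np

def svd_summary(data):
    """Return singular values, their fractions of the total squared
    magnitude, and a normalized Shannon entropy between 0 and 1."""
    s = np.linalg.svd(data, compute_uv=False)
    fractions = s**2 / np.sum(s**2)
    positive = fractions[fractions > 0]
    entropy = float(-np.sum(positive * np.log(positive)) / np.log(len(s)))
    return s, fractions, entropy

def permutation_p_value(data, n_permutations=200, seed=0):
    """Estimate how often shuffling each row independently gives a leading
    fraction at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = svd_summary(data)[1][0]
    exceed = 0
    for _ in range(n_permutations):
        shuffled = np.array([rng.permutation(row) for row in data])
        if svd_summary(shuffled)[1][0] >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_permutations)

# Hypothetical data: a constant offset, one shared sinusoidal pattern, and noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0 * np.pi, 20)
data = (5.0
        + np.outer(rng.normal(size=100), np.sin(t))
        + 0.1 * rng.normal(size=(100, 20)))

# Preprocessing: center each row so the constant offset does not dominate.
centered = data - data.mean(axis=1, keepdims=True)

s, fractions, entropy = svd_summary(centered)
p_value = permutation_p_value(centered)
print("leading fraction:", fractions[0], "entropy:", entropy, "p-value:", p_value)
```

A low entropy indicates that a few components capture most of the data, and a small permutation p-value suggests the leading pattern is unlikely to arise from the row values alone.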

October 7 and 9:

- Happy Fall Break!

October 23:

- Lab 2 Due In-Class

- Mathematics of a tensor SVD, the higher-order SVD (HOSVD):

Paper 7: A Multilinear Singular Value Decomposition, De Lathauwer et al.

- Slides 2: Examples of the HOSVD of measured data

- More examples of HOSVD of measured data:

Paper 8: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al.

Paper 9: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al.

Paper 10: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al.

Paper 11: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al.

Paper 12: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al.

October 25:

- Computation of the HOSVD:
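
One way to compute the HOSVD is to take the SVD of each mode unfolding of the tensor and then project the tensor onto the resulting left singular vectors to obtain the core. The sketch below follows that recipe in Python with NumPy, assumed here in place of the course's Mathematica; the function names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are hypothetical, and the test tensor is random.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize: bring axis `mode` to the front and flatten the other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` (J x I_mode) along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # shape (..., I_mode)
    result = moved @ matrix.T               # shape (..., J)
    return np.moveaxis(result, -1, mode)

def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = [
        np.linalg.svd(unfold(tensor, mode), full_matrices=False)[0]
        for mode in range(tensor.ndim)
    ]
    core = tensor
    for mode, factor in enumerate(factors):
        core = mode_product(core, factor.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core by each factor matrix along its mode."""
    tensor = core
    for mode, factor in enumerate(factors):
        tensor = mode_product(tensor, factor, mode)
    return tensor

# Hypothetical third-order tensor, e.g. genes x conditions x time points.
rng = np.random.default_rng(0)
tensor = rng.normal(size=(4, 5, 6))
core, factors = hosvd(tensor)
print("core shape:", core.shape)
print("reconstruction ok:", np.allclose(reconstruct(core, factors), tensor))
```

Unlike the diagonal matrix of singular values in the matrix SVD, the core tensor is generally full, but its slices along each mode are mutually orthogonal.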

November 4:

- Selection of a cutoff of the singular values:

Paper 13: Component Retention in Principal Component Analysis with Application to cDNA Microarray Data, Cangelosi and Goriely

Paper 14: The Optimal Hard Threshold for Singular Values is 4/√3, Gavish and Donoho

- Robust PCA and removal of outliers:

Paper 15: Sparsity Control for Robust Principal Component Analysis, Mateos and Giannakis

Paper 16: Robust Principal Component Analysis? Candès et al.

- GSVD or two-matrix SVD:

Paper 17: Generalized Singular Value Decomposition for Comparative Analysis of Genome-Scale Expression Datasets of Two Different Organisms, Alter et al.

Paper 18: Mathematically Universal and Biologically Consistent Astrocytoma Genotype Encodes for Transformation and Predicts Survival Phenotype, Aiello et al.

- Tensor GSVD or two-tensor SVD:

Paper 19: GSVD- and Tensor GSVD-Uncovered Patterns of DNA Copy-Number Alterations Predict Adenocarcinomas Survival in General and in Response to Platinum, Bradley et al.

Paper 20: TNF-Insulin Crosstalk at the Transcription Factor GATA6 is Revealed by a Model that Links Signaling and Transcriptomic Data Tensors, Chitforoushzadeh et al.
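
Returning to the first topic of this session, the selection of a cutoff of the singular values, the title of Paper 14 refers to a hard threshold of (4/√3)·√n·σ for an n-by-n matrix with white noise of known standard deviation σ. The sketch below applies that threshold to simulated data in Python with NumPy, assumed here in place of the course's Mathematica. The matrix size, the signal strengths, and the variable names are hypothetical, and the cases of non-square matrices or unknown noise level are not covered.

```python
import numpy as np

rng = np.random.default_rng(2)
n, true_rank, noise_sigma = 100, 2, 1.0

# Hypothetical low-rank signal with two strong components, plus white noise.
left = np.linalg.qr(rng.normal(size=(n, true_rank)))[0]    # orthonormal columns
right = np.linalg.qr(rng.normal(size=(n, true_rank)))[0]   # orthonormal columns
signal = left @ np.diag([200.0, 100.0]) @ right.T
noisy = signal + noise_sigma * rng.normal(size=(n, n))

# Keep only the singular values above the hard threshold.
s = np.linalg.svd(noisy, compute_uv=False)
threshold = (4.0 / np.sqrt(3.0)) * np.sqrt(n) * noise_sigma
estimated_rank = int(np.sum(s > threshold))

print("threshold:", threshold)
print("estimated rank:", estimated_rank)
```

In this simulation the noise singular values stay near or below 2·√n·σ, which is under the threshold, while the two signal singular values are far above it.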

November 6:

- "State of the project" presentations

November 10:

- "State of the project" presentations

November 18:

- Tensor SVD of measured data:

- Notebook 3: Tensor SVD of Measured Data

Mathematica Code: Notebook_3.nb

November 20:

- Tensor SVD of synthetic data:

- Notebook 4: Tensor SVD of Synthetic Data

Mathematica Code: Notebook_4.nb

November 21:

- Happy Thanksgiving!

November 25:

- From the SVD to PCA:

- Slides 3: The SVD vs. PCA

Paper 21: Correspondence Analysis Applied to Microarray Data, Fellenberg et al.
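
The relation between the SVD and PCA can be checked numerically: after centering each variable, the right singular vectors of the data matrix are the principal axes, and the squared singular values divided by (number of samples − 1) are the variances along them. The sketch below verifies this on simulated data in Python with NumPy, assumed here in place of the course's Mathematica; the mixing matrix and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: 500 samples (rows) of 3 correlated variables (columns).
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.2, 0.1]])
samples = rng.normal(size=(500, 3)) @ mixing

# PCA via the SVD of the column-centered data.
centered = samples - samples.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
variances_from_svd = s**2 / (len(samples) - 1)
scores = centered @ Vt.T   # coordinates of the samples along the principal axes

# PCA via the eigendecomposition of the sample covariance matrix.
covariance = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)           # ascending order
eigenvalues, eigenvectors = eigenvalues[::-1], eigenvectors[:, ::-1]

print("variances agree:", np.allclose(variances_from_svd, eigenvalues))
# The principal axes agree up to sign.
print("axes agree:", np.allclose(np.abs(Vt @ eigenvectors), np.eye(3), atol=1e-6))
```

Without the centering step the two computations generally differ, because the SVD of the raw data also captures the mean of each variable.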

November 27:

- Mathematical variations on the SVD and PCA:

- Independent component analysis (ICA):

Paper 22: Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images, Olshausen and Field

Paper 23: The "Independent Components" of Natural Scenes are Edge Filters, Bell and Sejnowski

Paper 24: Linear Modes of Gene Expression Determined by Independent Component Analysis, Liebermeister

- Nonnegative matrix factorization (NMF):

Paper 25: Learning the Parts of Objects by Non-Negative Matrix Factorization, Lee and Seung

Paper 26: Metagenes and Molecular Pattern Discovery Using Matrix Factorization, Brunet et al.
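
For NMF, one widely used scheme is the multiplicative update rule for the squared-error objective, which keeps both factors nonnegative and does not increase the error from one iteration to the next. The sketch below is a minimal version in Python with NumPy, assumed here in place of the course's Mathematica. The function name `nmf`, the iteration count, and the small constant `eps` that guards against division by zero are hypothetical choices, and the data are simulated.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (n x m) as W @ H with W, H >= 0,
    using multiplicative updates for the Frobenius-norm error."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so signs are preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors

# Hypothetical nonnegative data with an exact rank-3 nonnegative structure.
rng = np.random.default_rng(4)
V = rng.random((20, 3)) @ rng.random((3, 15))

W, H, errors = nmf(V, rank=3)
print("error after first iteration:", errors[0])
print("error after last iteration:", errors[-1])
```

The result depends on the random initialization, so in practice the factorization is often repeated from several starting points and the runs are compared.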

December 2:

- The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:

- Neural networks:

Paper 27: Predicting Human Brain Activity Associated with the Meanings of Nouns, Mitchell et al.

Paper 28: Integrating Multiple-Study Multiple-Subject fMRI Datasets Using Canonical Correlation Analysis, Rustandi et al.

- Readings on deep learning:

December 4:

- End-of-class celebration!

- Project update presentations

Happy Winter Break!

- DNA from xkcd

See you in Spring 2020 in BIOEN 6770: Genomic Signal Processing