- For postdoctoral, graduate, and advanced undergraduate students in Engineering, Sciences, and Medicine, and professionals in industry.
- Fall 2022, Mondays and Wednesdays 11:50am–1:10pm, LCB 115 and Zoom. Prerequisites: Some experience programming and instructor approval.
- Syllabus
- COVID-19
- Fall 2022 Calendar
- Safety
- Health, Wellness, and Counseling
- Student Code

100% grade = 30% labs, 30% class project, 30% presentation, 10% class participation; late assignments are not accepted; class attendance is required.

Topics:

We will cover concepts in data science and machine learning, and their applications to discovery of principles from biomedical data.

- Databases, e.g., the Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC).
- Data types, from, e.g., omics, imaging, and patient clinical information to, e.g., tissue samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications toward a better understanding of biology, e.g., the Luria-Delbrück experiment, and a better practice of medicine, e.g., personalized cancer diagnostics, prognostics, and therapeutics.

Skills:

- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.

Activities:

- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.

Readings on the SVD and deep learning:

- Book 1:

- Book 2:

August 22:

- Welcome!

August 24:

- Introduction:

How Bright Promise in Cancer Testing Fell Apart,

- The SVD in the news:

If You Liked This, You're Sure to Love That,

- PCA for face recognition:

Paper 1: Low-Dimensional Procedure for the Characterization of Human Faces, Sirovich and Kirby,

Paper 2: Eigenfaces for Recognition, Turk and Pentland,

- Lab 1:

Code the SVD or the tensor SVD of synthetic data and its visualization. Test and debug your code.
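As a starting point for Lab 1, here is a minimal Python sketch of composing synthetic data from known singular triplets, decomposing it, and testing the result. It is an illustration only, not the course's reference solution: the course notebooks are in Mathematica, and the function names (`compose`, `svd_power`) and the choice of power iteration with deflation are assumptions made for this example. Visualization is left out.

```python
import math
import random


def compose(triplets, n_rows, n_cols):
    """Build a matrix as a sum of rank-1 terms s * u * v^T."""
    a = [[0.0] * n_cols for _ in range(n_rows)]
    for s, u, v in triplets:
        for i in range(n_rows):
            for j in range(n_cols):
                a[i][j] += s * u[i] * v[j]
    return a


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def norm(x):
    return math.sqrt(sum(value * value for value in x))


def svd_power(a, rank, iters=500, seed=0):
    """Leading `rank` singular triplets (s, u, v) by power iteration on A^T A.

    Hypothetical helper for illustration; assumes distinct singular values.
    """
    rng = random.Random(seed)
    a = [row[:] for row in a]  # work on a copy, which is deflated below
    triplets = []
    for _ in range(rank):
        at = [list(col) for col in zip(*a)]
        v = [rng.gauss(0.0, 1.0) for _ in range(len(a[0]))]
        for _ in range(iters):
            w = matvec(at, matvec(a, v))  # one multiplication by A^T A
            length = norm(w)
            if length == 0.0:
                break
            v = [value / length for value in w]
        av = matvec(a, v)
        s = norm(av)
        if s < 1e-12:  # nothing left to decompose
            break
        u = [value / s for value in av]
        triplets.append((s, u, v))
        # Deflate: subtract the rank-1 term just found.
        for i in range(len(a)):
            for j in range(len(a[0])):
                a[i][j] -= s * u[i] * v[j]
    return triplets


# Synthetic data with known singular values 5 and 2.
r = 1.0 / math.sqrt(2.0)
truth = [
    (5.0, [0.5, 0.5, 0.5, 0.5], [r, r, 0.0]),
    (2.0, [0.5, -0.5, 0.5, -0.5], [r, -r, 0.0]),
]
data = compose(truth, 4, 3)
found = svd_power(data, rank=2)
rebuilt = compose(found, 4, 3)
max_error = max(
    abs(x - y) for row_a, row_b in zip(data, rebuilt) for x, y in zip(row_a, row_b)
)
```

Because the singular vectors are defined only up to a joint sign flip of `u` and `v`, the test compares singular values and the reconstructed matrix rather than the vectors themselves.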

August 29:

- Mathematics of the SVD:

- Notebook 1: Computation and Visualization of the SVD

Mathematica Code: Notebook_1.nb

February 29:

- Gene H. Golub's Birthday!

Paper 3: Calculating the Singular Values and Pseudo-Inverse of a Matrix, Golub and Kahan,

August 31:

- Composition and decomposition of synthetic data:

- Notebook 2: The SVD of Synthetic Data

Mathematica Code: Notebook_2.nb

September 5:

- Happy Labor Day!

September 7:

- Testing and debugging your SVD code

September 12:

- Slides 1: Examples of the SVD of measured data

- More examples of the SVD of measured data:

Paper 5: Singular Value Decomposition for Genome-Wide Expression Data Processing and Modeling, Alter et al.,

Patent 1: Method for Node Ranking in a Linked Database, Page,

Paper 6: A Rapid Genome-Scale Response of the Transcriptional Oscillator to Perturbation Reveals a Period-Doubling Path to Phenotypic Change, Li and Klevecz,

Paper 7: Coordinated Metabolic Transitions During

September 14:

September 19:

- Mathematics of a tensor SVD, the higher-order SVD (HOSVD):

- Paper 4: A Multilinear Singular Value Decomposition, De Lathauwer et al.,

- Computation of the HOSVD:

- Notebook 3: The Tensor SVD of Synthetic Data

Mathematica Code: Notebook_3.nb

September 21:

- Example of TCGA data:

Paper 8: Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas, TCGA Research Network,

- Example of interpretation of TCGA data:

Paper 9: Mathematically Universal and Biologically Consistent Astrocytoma Genotype Encodes for Transformation and Predicts Survival Phenotype, Aiello et al.,

September 26:

- In-Class Work on Lab 1

September 28:

- Lab 1 Due In-Class

- In-Class Work on Lab 2:

Compute and visualize the SVD or the tensor SVD of your data. Interpret your data based upon its SVD or its tensor SVD. Use at least two different approaches each for preprocessing and sorting your data and for assessing the statistical significance of your interpretation.

- From the SVD to PCA:

- Slides 2: The SVD vs. PCA

- Paper 10: Correspondence Analysis Applied to Microarray Data, Fellenberg et al.,
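For reference, one common way to write the relation between the SVD and PCA is sketched below. The convention is an assumption of this note (rows of $X$ are the $n$ samples, columns are the variables, and $\bar{x}$ is the vector of column means); Slides 2 may use a different orientation or normalization.

```latex
\begin{aligned}
\tilde{X} &= X - \mathbf{1}\,\bar{x}^{T}, \qquad \tilde{X} = U \Sigma V^{T}, \\
C &= \frac{1}{n-1}\,\tilde{X}^{T}\tilde{X} = V\,\frac{\Sigma^{2}}{n-1}\,V^{T}, \\
\lambda_k &= \frac{\sigma_k^{2}}{n-1}, \qquad \tilde{X} V = U \Sigma .
\end{aligned}
```

Under this convention the principal axes are the columns of $V$, the variance along the $k$-th axis is $\lambda_k$, and the principal component scores are $U\Sigma$. PCA therefore equals the SVD of the centered data; the SVD of uncentered data generally gives a different decomposition.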

October 3:

- Examples of assessing the statistical significance of an interpretation:

Paper 11: Systematic Determination of Genetic Network Architecture, Tavazoie et al.,

Paper 12: Discovering Motifs in Ranked Lists of DNA Sequences, Eden et al.,

Paper 13: GOrilla: A Tool for Discovery and Visualization of Enriched GO Terms in Ranked Gene Lists, Eden et al.,

- Slides 3: The hypergeometric probability distribution and

- Notebook 4: The Hypergeometric Probability Distribution and

Mathematica Code: Notebook_4.nb
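A minimal Python sketch of a hypergeometric enrichment P-value follows, as an illustration of the kind of significance assessment listed above. It is not taken from Notebook 4 (which is in Mathematica); the function names and the gene-annotation framing in the comments are assumptions made for this example.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation of interest."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_pvalue(k, N, K, n):
    """P(X >= k): probability of seeing at least k annotated genes by chance."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))


# Example: 10 genes in total, 3 annotated, 2 selected, both annotated.
p = enrichment_pvalue(2, N=10, K=3, n=2)
```

In this example the P-value is 3/45 = 1/15: of the 45 possible pairs of genes, 3 consist of two annotated genes.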

October 5:

- Lab 2 "Data Clinic"

- Selection of a cutoff of the singular values:

Paper 14: Component Retention in Principal Component Analysis with Application to cDNA Microarray Data, Cangelosi and Goriely,

Paper 15: The Optimal Hard Threshold for Singular Values is 4/√3, Gavish and Donoho,
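The threshold named in the title of Paper 15 can be sketched in a few lines of Python. This covers only the case the title refers to, an n-by-n matrix with white noise of known level sigma, where the cutoff is (4/√3)·√n·sigma. The function names are hypothetical, and the rectangular and unknown-noise cases are not shown.

```python
import math


def optimal_hard_threshold(n, sigma):
    """Cutoff (4 / sqrt(3)) * sqrt(n) * sigma for an n-by-n matrix
    observed in white noise of known level sigma."""
    return 4.0 / math.sqrt(3.0) * math.sqrt(n) * sigma


def retain(singular_values, n, sigma):
    """Keep only the singular values above the cutoff."""
    tau = optimal_hard_threshold(n, sigma)
    return [s for s in singular_values if s > tau]


# Example: a 100-by-100 matrix with unit noise level gives a cutoff near 23.09.
kept = retain([50.0, 30.0, 20.0, 5.0], n=100, sigma=1.0)
```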

- Robust PCA and removal of outliers:

Paper 16: Sparsity Control for Robust Principal Component Analysis, Mateos and Giannakis, in

Paper 17: Robust Principal Component Analysis? Candès et al.,

October 10:

- Happy Fall Break!

October 12:

- Happy Fall Break!

October 17:

- Slides 4: Examples of the HOSVD of measured data

- More examples of the HOSVD of measured data:

Paper 18: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al.,

Paper 19: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al.,

Paper 20: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al.,

Paper 21: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al.,

Paper 22: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al.,

October 19:

- Lab 2 "Data Clinic"

October 24:

- Slides 5: The SVD as a Transform

- Quantum Harmonic Oscillator from Wikipedia

- Image Compression via the SVD from MathWorld

- Image Compression via the Fourier Transform from MathWorld

October 26:

- Mathematical variations on the SVD and PCA for blind source separation (BSS):

- Independent component analysis (ICA):

Paper 23: Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images, Olshausen and Field,

Paper 24: The "Independent Components" of Natural Scenes are Edge Filters, Bell and Sejnowski,

Paper 25: Linear Modes of Gene Expression Determined by Independent Component Analysis, Liebermeister,

- Nonnegative matrix factorization (NMF):

Paper 26: Learning the Parts of Objects by Non-Negative Matrix Factorization, Lee and Seung,

Paper 27: Metagenes and Molecular Pattern Discovery Using Matrix Factorization, Brunet et al.,
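To make the idea of NMF concrete, here is a minimal Python sketch of one common variant: multiplicative updates that reduce the squared error between a nonnegative matrix V and the product W·H. This is an assumption-laden illustration, not the method of Paper 26 or Paper 27, which may use a different objective. All names (`nmf`, `squared_error`, the small constant `eps`) are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    wh = matmul(w, h)
    return sum(
        (x - y) ** 2 for v_row, wh_row in zip(v, wh) for x, y in zip(v_row, wh_row)
    )


def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Factor nonnegative v (n-by-m) into w (n-by-rank) times h (rank-by-m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Update h elementwise: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # Update w elementwise: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Synthetic nonnegative data of rank 2.
v = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 1.0, 5.0], [1.0, 1.0, 3.0]]
w1, h1 = nmf(v, rank=2, iters=1)
w, h = nmf(v, rank=2, iters=200)
```

Unlike the SVD, the factors stay nonnegative throughout and are not unique: the result depends on the random starting point.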

October 31:

- The tensor SVD of measured data:

November 2:

- Notebook 5: The Tensor SVD of Measured Data

Mathematica Code: Notebook_5.nb

November 7:

- Kaplan-Meier survival analysis
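A minimal Python sketch of the Kaplan-Meier estimator follows, as an illustration rather than course material. The function name and the input format (parallel lists of follow-up times and event flags) are assumptions made for this example. Subjects censored at an event time are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    events[i] is truthy if the event (e.g., death) was observed at times[i],
    and falsy if the subject was censored at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Five patients: events at times 1, 2, and 3; censoring at times 2 and 4.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 0, 1, 1, 0])
```

For these five patients the survival estimate steps down to 0.8, 0.6, and 0.3 (up to floating-point rounding) at times 1, 2, and 3.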

November 9:

- Lab 2 Due In-Class

- The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:
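For orientation, here is a minimal Python sketch of the classic perceptron learning rule on a toy, linearly separable problem (logical AND). It only illustrates what a single-layer network computes; the connection to the SVD drawn in class is not reproduced here. The function names, the learning rate, and the ±1 label encoding are assumptions made for this example.

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Learn weights w and bias b so that sign(w . x + b) matches labels in {-1, +1}."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly
            break
    return w, b


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Logical AND, encoded with labels -1 (false) and +1 (true).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
w, b = train_perceptron(samples, labels)
predictions = [predict(w, b, x) for x in samples]
```

The rule is guaranteed to stop making mistakes only when the two classes are linearly separable, as they are here.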

November 14:

November 16:

- Neural networks:

Paper 28: Predicting Human Brain Activity Associated with the Meanings of Nouns, Mitchell et al.,

Paper 29: Integrating Multiple-Study Multiple-Subject fMRI Datasets Using Canonical Correlation Analysis, Rustandi et al., in

November 21:

- Verification and validation

- Example of verification:

Paper 30: Global Effects of DNA Replication and DNA Replication Origin Activity on Eukaryotic Gene Expression, Omberg et al.,

- Example of validation:

Paper 31: Retrospective Clinical Trial Experimentally Validates Glioblastoma Genome-Wide Pattern of DNA Copy-Number Alterations Predictor of Survival, Ponnapalli et al.,

November 24:

- Happy Thanksgiving!

November 28:

- "State of the project" presentations

November 30:

- "State of the project" presentations

December 5:

- "State of the project" presentations

December 7:

- "State of the project" presentations

- End-of-class celebration!

Happy Winter Break!

- See you in Spring 2023 in BME 6770: Genomic Signal Processing!

May be pivotal to your career:

- Helping AWS Customers Accelerate Success via Machine Learning,