- For postdoctoral, graduate, and advanced undergraduate students in Engineering, Sciences, and Medicine, and professionals in industry.
- Fall 2019, Mondays and Wednesdays 11:50am–1:10pm, LCB 115. Prerequisites: Some experience programming and instructor approval.
- Syllabus
- Fall 2019 Calendar
- Safety
- Health, Wellness, and Counseling
- Student Code

100% grade = 30% labs, 30% presentation, 30% class project, 10% class participation; class attendance is required; late assignments are not accepted.

Topics:

We will cover concepts in data science and machine learning, and their applications to discovery of principles from biomedical data.

- Databases, from the Cancer Genome Atlas (TCGA) at the Genomic Data Commons (GDC) to the Utah Population Database (UPDB).
- Data types, from omics, imaging, and patient clinical information to biomedical samples and model organisms and systems.
- Algorithms, from the singular value decomposition (SVD) and principal component analysis (PCA) to multi-tensor decompositions, neural networks, and deep learning.
- Applications, from the Luria-Delbrück experiment to personalized cancer diagnostics, prognostics, and therapeutics.

Skills:

- Proving mathematical theorems and programming symbolic computations.
- Designing algorithms and programming numerical computations.
- Working with databases and modeling biomedical data.

Activities:

- In-class presentations of scientific journal articles and patents.
- Participation in guest lectures and seminars on campus and discussions of conference reports.
- End-of-class celebration.

August 26:

- Welcome!

- How Bright Promise in Cancer Testing Fell Apart,

- The SVD in the news:

If You Liked This, You're Sure to Love That,

- PCA for face recognition:

Paper 1: Low-Dimensional Procedure for the Characterization of Human Faces, Sirovich and Kirby,

Paper 2: Eigenfaces for Recognition, Turk and Pentland,

- Mathematics of the SVD:

- Notebook 1: Computation and Visualization of the SVD

Mathematica Code: Notebook_1.nb

February 29:

- Gene H. Golub's Birthday!

Paper 3: Calculating the Singular Values and Pseudo-Inverse of a Matrix, Golub and Kahan,

August 28:

- Mathematical properties of the SVD

- In-Class Work on Lab 1:

Code the SVD of synthetic data and its visualization. Test and debug your code.
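
As a starting point for Lab 1, the following is a minimal sketch of computing the SVD of synthetic data. It is written in Python with NumPy rather than in the Mathematica of the course notebooks, and the data set (its size, the two patterns, and the noise level) is entirely hypothetical. Plotting is left out; the rows of `Vt` and the values in `fractions` could be drawn with any plotting library.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Hypothetical synthetic data: 200 rows x 12 columns, built as the sum of
# two outer products (a steady pattern and an oscillation) plus small noise.
n_rows, n_cols = 200, 12
t = np.arange(n_cols)
steady = np.outer(rng.normal(size=n_rows), np.ones(n_cols))
oscillation = np.outer(rng.normal(size=n_rows), np.cos(2 * np.pi * t / n_cols))
data = 5.0 * steady + 2.0 * oscillation + 0.1 * rng.normal(size=(n_rows, n_cols))

# Thin SVD: data = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(data, full_matrices=False)

# Checks that help when testing and debugging: exact reconstruction and
# orthonormal columns of U and rows of Vt.
reconstruction_error = np.linalg.norm(data - U @ np.diag(s) @ Vt)
fractions = s**2 / np.sum(s**2)  # share of the overall signal per component

print("singular values:", np.round(s, 3))
print("fractions:", np.round(fractions, 4))
print("reconstruction error:", reconstruction_error)
```

Because the data were composed from two known patterns, the decomposition can be compared against what was put in, which is the point of starting with synthetic data.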

August 29, Thursday, 10:00–11:00am, WEB 3780, in lieu of any one Lab:

- Screening of the Amazon Web Services (AWS) Public Sector Webinar

NIH’s Strategic Vision for Data Science: Enabling a FAIR-Data Ecosystem

Susan K. Gregurick, Ph.D.

Director, Division of Biophysics, Biomedical Technology, and Computational Biosciences

National Institute of General Medical Sciences (NIGMS), NIH

September 2:

- Happy Labor Day!

September 4:

- SVD of Synthetic Data:

- Notebook 2: SVD of Synthetic Data

Mathematica Code: Notebook_2.nb

September 9:

- Composition and decomposition of synthetic data:

September 11:

- Slides 1: Examples of SVD of measured data

- More examples of SVD of measured data:

Paper 4: Singular Value Decomposition for Genome-Wide Expression Data Processing and Modeling, Alter et al.,

Patent 1: Method for Node Ranking in a Linked Database, Page,

Paper 5: A Rapid Genome-Scale Response of the Transcriptional Oscillator to Perturbation Reveals a Period-Doubling Path to Phenotypic Change, Li and Klevecz,

Paper 6: Coordinated Metabolic Transitions During

September 13, Friday, 8:00am–6:15pm, SMBB 2650, in lieu of any one Lab:

- The Utah Biomedical Engineering Conference (UBEC)

September 16:

September 18:

Orly Alter from the @UUtah PS-OP "Retrospective clinical trial revalidates glioblastoma DNA copy-number genotype predictor of survival phenotype" #2019PSON pic.twitter.com/YUGjEGhSFJ

— NCI PhysicalSciences (@NCIPhySci) September 20, 2019

September 23:

- Guest Lecture:

Introduction to the Utah Population Database

Heidi A. Hanson, Ph.D.

Assistant Professor of Surgery at the Utah Population Database, the Huntsman Cancer Institute, and the Scientific Computing and Imaging Institute

- Readings on the SVD and deep learning:

Gilbert Strang; no holds barred. pic.twitter.com/mUepPn5o1q

— mat kelcey (@mat_kelcey) September 20, 2019

September 25:

- Lab 1 Due In-Class

- In-Class Work on Lab 2:

Compute and visualize the SVD of your data. Test and debug your code. Interpret your data based upon its SVD. Use at least two different approaches each for preprocessing and sorting your data and for assessing the statistical significance of your interpretation.
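
One possible shape for part of Lab 2 is sketched below in Python with NumPy. It shows a single preprocessing choice (column centering) and a single significance assessment (a permutation test on the fraction captured by the top component), so it covers only one of the two approaches the lab asks for in each category. The function names, the synthetic stand-in data, and the number of permutations are all assumptions made for illustration.

```python
import numpy as np

def svd_fractions(matrix):
    """Fraction of the overall signal captured by each singular value."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return s**2 / np.sum(s**2)

def shannon_entropy(fractions):
    """Normalized entropy in [0, 1]: 0 = one dominant component, 1 = all equal."""
    p = fractions[fractions > 0]
    return float(-np.sum(p * np.log(p)) / np.log(len(fractions)))

def permutation_p_value(matrix, n_permutations=200, seed=0):
    """Compare the top fraction with a null built by shuffling each column."""
    rng = np.random.default_rng(seed)
    observed = svd_fractions(matrix)[0]
    count = 0
    for _ in range(n_permutations):
        # Shuffling within columns breaks the row structure shared across columns.
        shuffled = np.column_stack([rng.permutation(col) for col in matrix.T])
        if svd_fractions(shuffled)[0] >= observed:
            count += 1
    return (count + 1) / (n_permutations + 1)

# Hypothetical stand-in for measured data: one shared pattern plus noise.
rng = np.random.default_rng(1)
pattern = np.cos(2 * np.pi * np.arange(8) / 8)
raw = 3.0 * np.outer(rng.normal(size=100), pattern) + rng.normal(size=(100, 8))

centered = raw - raw.mean(axis=0)  # preprocessing: remove each column's mean
fractions = svd_fractions(centered)
entropy = shannon_entropy(fractions)
p_value = permutation_p_value(centered)
print("top fraction:", fractions[0], "entropy:", entropy, "p-value:", p_value)
```

The permutation null is one assumption among several reasonable ones; a different null model can lead to a different conclusion, which is one reason to compare at least two approaches.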

September 30:

- Example of TCGA data:

Paper 7: Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas, TCGA Research Network,

- Examples of enrichment analyses:

Paper 8: Systematic Determination of Genetic Network Architecture, Tavazoie et al.,

Paper 9: GOrilla: A Tool for Discovery and Visualization of Enriched GO Terms in Ranked Gene Lists, Eden et al.,

October 2:

- Example of interpretation of TCGA data:

Paper 10: Mathematically Universal and Biologically Consistent Astrocytoma Genotype Encodes for Transformation and Predicts Survival Phenotype, Aiello et al.,

October 7 and 9:

- Happy Fall Break!

October 21, Monday, 4:00–6:00pm, WEB 3780:

- Lab 2 "Data Clinic"

October 23:

- Mathematics of a tensor SVD, the higher-order SVD (HOSVD):

Paper 11: A Multilinear Singular Value Decomposition, De Lathauwer et al.,

- Slides 2: Examples of the HOSVD of measured data

- More examples of HOSVD of measured data:

Paper 12: A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies, Omberg et al.,

Paper 13: Characterizing the Evolution of Genetic Variance Using Genetic Covariance Tensors, Hines et al.,

Paper 14: Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation, Li et al.,

Paper 15: MultiFacTV: Module Detection from Higher-Order Time Series Biological Data, Li et al.,

Paper 16: Subgraph Augmented Nonnegative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text, Luo et al.,

October 24, Thursday, 8:30–11:00am, HSEB, in lieu of any one Lab:

- Annual Cancer Summit of the American Cancer Society Utah Cancer Action Network (ACS CAN)

October 28:

- Computation of the HOSVD:
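
A minimal sketch of one way to compute the HOSVD follows, in Python with NumPy rather than the Mathematica of the course notebooks. It takes the SVD of each mode unfolding of the tensor and then projects the tensor onto the resulting bases to obtain the core. The helper names (`unfold`, `mode_product`, `hosvd`) and the random test tensor are assumptions, and the column ordering of the unfolding is a convenience choice that may differ from the convention in Paper 11.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor):
    """Return the core tensor and one orthonormal basis per mode."""
    factors = []
    for mode in range(tensor.ndim):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(U)
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Invert the decomposition: multiply the core by each basis in turn."""
    tensor = core
    for mode, U in enumerate(factors):
        tensor = mode_product(tensor, U, mode)
    return tensor

rng = np.random.default_rng(0)
tensor = rng.normal(size=(4, 5, 6))  # hypothetical third-order data
core, factors = hosvd(tensor)
rebuilt = reconstruct(core, factors)
print("reconstruction error:", np.linalg.norm(tensor - rebuilt))
```

Two useful checks when debugging are that the reconstruction matches the input and that the rows of each unfolding of the core are mutually orthogonal.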

October 28, Monday, 4:00–6:00pm, WEB 3780:

- Lab 2 "Data Clinic"

October 30, Wednesday, 12:15–1:20pm, Intermountain Medical Center, 5121 S. Cottonwood St., Murray, in lieu of any one Lab:

- Keynote of the 12th Annual Huntsman-Intermountain Cancer Care Program Conference

The Origins of Cancer

Robert A. Weinberg, Ph.D.

Ludwig Professor for Cancer Research and Member of the Whitehead Institute

Massachusetts Institute of Technology (MIT)

November 4:

- Selection of a cutoff of the singular values:

Paper 17: Component Retention in Principal Component Analysis with Application to cDNA Microarray Data, Cangelosi and Goriely,

Paper 18: The Optimal Hard Threshold for Singular Values is 4/√3, Gavish and Donoho,
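
To make the cutoff in the title of Paper 18 concrete, here is a small sketch in Python with NumPy. It assumes the simplest setting: a square n-by-n matrix with independent Gaussian noise of known standard deviation `sigma`, where the hard threshold is (4/√3)·√n·σ. The matrix size, the rank, and the signal strengths are hypothetical values chosen so that the signal sits well above the noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_rank, sigma = 100, 3, 1.0

# Hypothetical low-rank signal with orthonormal factors and chosen strengths.
left, _ = np.linalg.qr(rng.normal(size=(n, true_rank)))
right, _ = np.linalg.qr(rng.normal(size=(n, true_rank)))
signal = left @ np.diag([100.0, 80.0, 60.0]) @ right.T
noisy = signal + sigma * rng.normal(size=(n, n))

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)

# Hard threshold for a square matrix with known noise level.
threshold = (4.0 / np.sqrt(3.0)) * np.sqrt(n) * sigma
kept = int(np.sum(s > threshold))

# Keep only the components above the threshold.
denoised = U[:, :kept] @ np.diag(s[:kept]) @ Vt[:kept, :]

print("threshold:", threshold)
print("components kept:", kept)
print("error before:", np.linalg.norm(noisy - signal))
print("error after:", np.linalg.norm(denoised - signal))
```

Non-square matrices and unknown noise levels call for different constants, which this sketch does not attempt.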

- Robust PCA and removal of outliers:

Paper 19: Sparsity Control for Robust Principal Component Analysis, Mateos and Giannakis,

Paper 20: Robust Principal Component Analysis? Candès et al.,

- From the SVD to PCA:

- Slides 3: The SVD vs. PCA

Paper 21: Correspondence Analysis Applied to Microarray Data, Fellenberg et al.,
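
The step from the SVD to PCA can be checked numerically. The sketch below, in Python with NumPy, assumes the usual definition of PCA as the eigendecomposition of the sample covariance matrix and shows that the SVD of the column-centered data gives the same axes and variances. The data are hypothetical random samples.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 50 observations of 4 correlated variables.
samples = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
centered = samples - samples.mean(axis=0)
n_obs = len(samples)

# PCA route: eigendecomposition of the sample covariance matrix.
covariance = centered.T @ centered / (n_obs - 1)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)
eigenvalues = eigenvalues[::-1]        # eigh returns ascending order
eigenvectors = eigenvectors[:, ::-1]

# SVD route: singular values of the centered data give the same variances.
_, s, Vt = np.linalg.svd(centered, full_matrices=False)
variances_from_svd = s**2 / (n_obs - 1)

print("PCA variances:", eigenvalues)
print("SVD variances:", variances_from_svd)
```

The two routes agree only because the data were centered first; the SVD of uncentered data is a different decomposition, and the principal axes match only up to sign.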

November 6:

- Slides 4: SVD as a Transform

- Quantum Harmonic Oscillator from Wikipedia

- Image Compression via the SVD from Mathworld

- Image Compression via the Fourier Transform from Mathworld

- A Hard Day's Night Opening Chord from Wikipedia
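
The idea behind image compression via the SVD can be sketched in a few lines of Python with NumPy: keep only the k largest singular values and their vectors. The 64-by-64 "image" below is a hypothetical array, not one of the linked examples. The Frobenius-norm error of the rank-k approximation equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

def low_rank(U, s, Vt, k):
    """Rank-k approximation built from the k largest singular values."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Hypothetical grayscale image: a smooth gradient plus a bright square.
x = np.linspace(0.0, 1.0, 64)
image = np.outer(x, x)
image[20:40, 20:40] += 1.0

U, s, Vt = np.linalg.svd(image, full_matrices=False)

k = 1
approximation = low_rank(U, s, Vt, k)
error = np.linalg.norm(image - approximation)
expected_error = np.sqrt(np.sum(s[k:] ** 2))

# Storage: k left vectors, k right vectors, and k singular values.
stored_numbers = k * (image.shape[0] + image.shape[1] + 1)

print("error:", error, "expected:", expected_error)
print("numbers stored:", stored_numbers, "of", image.size)
```

This particular image is the sum of two outer products, so two components already reproduce it exactly; measured images usually need more.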

November 11:

- "State of the project" presentations

November 13:

- "State of the project" presentations

November 18:

- Tensor SVD of measured data:

- Notebook 3: Tensor SVD of Measured Data

Mathematica Code: Notebook_3.nb

- Tensor SVD of synthetic data:

- Notebook 4: Tensor SVD of Synthetic Data

Mathematica Code: Notebook_4.nb

November 20:

- The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:

- Neural networks:

Paper 27: Predicting Human Brain Activity Associated with the Meanings of Nouns, Mitchell et al.,

Paper 28: Integrating Multiple-Study Multiple-Subject fMRI Datasets Using Canonical Correlation Analysis, Rustandi et al.,

- Readings on deep learning:

November 25:

- e-Guest Lecture:

Careers in Data Science: Making the World a Better Place

Sri Priya Ponnapalli, Ph.D.

Principal Data Scientist and Sports ML Manager, Amazon AI (Palo Alto, CA),

Faculty, Rutgers Business School (Newark, NJ), and

CEO and Co-Founder, Eigengene, Inc. (Palo Alto, CA)

November 27:

- Mathematical variations on the SVD and PCA:

- Independent component analysis (ICA):

Paper 22: Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images, Olshausen and Field,

Paper 23: The "Independent Components" of Natural Scenes are Edge Filters, Bell and Sejnowski,

Paper 24: Linear Modes of Gene Expression Determined by Independent Component Analysis, Liebermeister,

- Nonnegative matrix factorization (NMF):

Paper 25: Learning the Parts of Objects by Non-Negative Matrix Factorization, Lee and Seung,

Paper 26: Metagenes and Molecular Pattern Discovery Using Matrix Factorization, Brunet et al.,
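
For a feel of how NMF differs from the SVD, the sketch below factors a nonnegative matrix V into nonnegative W and H in Python with NumPy. It uses a commonly cited multiplicative-update scheme for the squared-error objective, which is not necessarily the exact variant used in Paper 25 or Paper 26. The function name, the rank, the iteration count, and the small constant `eps` guarding against division by zero are all assumptions.

```python
import numpy as np

def nmf(V, rank, n_iterations=200, seed=0, eps=1e-12):
    """Factor nonnegative V (rows x cols) as W @ H with W, H nonnegative."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    errors = []
    for _ in range(n_iterations):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Hypothetical data: a nonnegative matrix that is exactly rank 3.
rng = np.random.default_rng(1)
V = rng.random((30, 3)) @ rng.random((3, 20))

W, H, errors = nmf(V, rank=3)
print("first error:", errors[0], "last error:", errors[-1])
```

Unlike the singular vectors, the columns of W and rows of H are not orthogonal and depend on the random starting point, so repeated runs with different seeds are a sensible check.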

November 28:

- Happy Thanksgiving!

December 2:

- The "perceptron," i.e., single-layer neural network, as a mathematical variation on the SVD:

- Neural networks:

Paper 27: Predicting Human Brain Activity Associated with the Meanings of Nouns, Mitchell et al.,

Paper 28: Integrating Multiple-Study Multiple-Subject fMRI Datasets Using Canonical Correlation Analysis, Rustandi et al.,

- Readings on deep learning:

December 4:

- End-of-class celebration!

- Project update presentations

Happy Winter Break!

- DNA from xkcd

See you in Spring 2020 in BIOEN 6770: Genomic Signal Processing