Joint and individual analysis of breast cancer histologic images and genomic covariates. (English) Zbl 1498.62197

Summary: The two main approaches in the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genomics. While both histopathology and genomics are fundamental to cancer research, the connections between these fields have been relatively superficial. We bridge this gap by investigating the Carolina Breast Cancer Study through the development of an integrative, exploratory analysis framework. Our analysis gives insights – some known, some novel – that are engaging to both pathologists and geneticists. Our analysis framework is based on angle-based joint and individual variation explained (AJIVE) for statistical data integration and exploits convolutional neural networks (CNNs) as a powerful, automatic method for image feature extraction. CNNs raise interpretability issues that we address by developing novel methods to explore visual modes of variation captured by statistical algorithms (e.g., PCA or AJIVE) applied to CNN features.
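The data-integration step described in the summary can be illustrated with a minimal sketch. This is not the authors' code: it assumes two data blocks measured on the same subjects (stand-ins for CNN image features and gene expression), replaces the real data with synthetic matrices, and uses a fixed principal-angle cutoff (`max_angle_deg`, a hypothetical parameter) where the published AJIVE procedure selects the joint rank with data-driven bounds. The function names `signal_basis` and `joint_scores` are invented for illustration.

```python
import numpy as np

def signal_basis(X, rank):
    """Orthonormal basis (n x rank) of the leading score subspace of a block."""
    Xc = X - X.mean(axis=0, keepdims=True)           # center each feature
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :rank]

def joint_scores(X, Y, rank_x, rank_y, max_angle_deg=20.0):
    """Directions (n x joint_rank) shared by the two score subspaces.

    Stacking the two orthonormal bases and taking an SVD yields squared
    singular values 1 + cos(theta), where theta runs over the principal
    angles between the subspaces. Small angles indicate joint variation.
    """
    M = np.hstack([signal_basis(X, rank_x), signal_basis(Y, rank_y)])
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    # Assumed fixed cutoff, used here only to keep the sketch short.
    keep = s**2 >= 1.0 + np.cos(np.deg2rad(max_angle_deg))
    return U[:, keep]

# Synthetic stand-ins: one latent score z shared by both blocks,
# plus one block-specific latent score each, plus small noise.
rng = np.random.default_rng(0)
n = 200
z = rng.standard_normal(n)                            # shared (joint) score
w_img = rng.standard_normal(n)                        # image-only score
w_gen = rng.standard_normal(n)                        # gene-only score

X = (5 * np.outer(z, rng.standard_normal(50))
     + 5 * np.outer(w_img, rng.standard_normal(50))
     + 0.1 * rng.standard_normal((n, 50)))            # "image features"
Y = (5 * np.outer(z, rng.standard_normal(30))
     + 5 * np.outer(w_gen, rng.standard_normal(30))
     + 0.1 * rng.standard_normal((n, 30)))            # "gene expression"

J = joint_scores(X, Y, rank_x=2, rank_y=2)
print("estimated joint rank:", J.shape[1])
print("|corr(joint score, z)|:", abs(np.corrcoef(J[:, 0], z)[0, 1]))
```

In this toy setting the recovered joint score should track the shared latent variable `z`, while the block-specific scores are left to the individual components. In the setting the summary describes, `X` would hold features extracted from histologic images by a CNN and `Y` the genomic covariates.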


62P10 Applications of statistics to biology and medical sciences; meta analysis
62H25 Factor analysis and principal components; correspondence analysis
68T07 Artificial neural networks and deep learning


