Tensor regression networks. (English) Zbl 07255154
Summary: Convolutional neural networks typically consist of many convolutional layers followed by one or more fully connected layers. While convolutional layers map between high-order activation tensors, the fully connected layers operate on flattened activation vectors. Despite empirical success, this approach has notable drawbacks. Flattening followed by fully connected layers discards multilinear structure in the activations and requires many parameters. We address these problems by incorporating tensor algebraic operations that preserve multilinear structure at every layer. First, we introduce Tensor Contraction Layers (TCLs) that reduce the dimensionality of their input while preserving their multilinear structure using tensor contraction. Next, we introduce Tensor Regression Layers (TRLs), which express outputs through a low-rank multilinear mapping from a high-order activation tensor to an output tensor of arbitrary order. We learn the contraction and regression factors end-to-end, and produce accurate nets with fewer parameters. Additionally, our layers regularize networks by imposing low-rank constraints on the activations (TCL) and regression weights (TRL). Experiments on ImageNet show that, applied to VGG and ResNet architectures, TCLs and TRLs reduce the number of parameters compared to fully connected layers by more than 65% while maintaining or increasing accuracy. In addition to the space savings, our approach’s ability to leverage topological structure can be crucial for structured data such as MRI. In particular, we demonstrate significant performance improvements over comparable architectures on three tasks associated with the UK Biobank dataset.
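
To make the two layers concrete, here is a minimal NumPy sketch of the operations the summary describes. The shapes, ranks, and the `mode_dot` helper are illustrative assumptions, not the authors' implementation: the TCL contracts each non-batch mode of the activation tensor with a small factor matrix, and the TRL maps the contracted tensor to the output through Tucker-form low-rank regression weights that are never materialized in full.

```python
import numpy as np

def mode_dot(tensor, matrix, mode):
    """Mode-`mode` product: contract `matrix` (new_dim x old_dim) with
    the given mode of `tensor` (move the mode to the front, multiply,
    move it back)."""
    t = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, t, axes=([1], [0]))
    return np.moveaxis(out, 0, mode)

def tcl(x, factors):
    """Tensor Contraction Layer (sketch): shrink each non-batch mode of
    the activation tensor x (batch x d1 x ... x dN) while preserving its
    multilinear structure."""
    for mode, f in enumerate(factors, start=1):
        x = mode_dot(x, f, mode)
    return x

def trl(x, core, factors, bias):
    """Tensor Regression Layer (sketch): a low-rank, Tucker-form
    multilinear map from the activation tensor to the output; only the
    core and factor matrices are stored, never the full weight tensor."""
    for mode, f in enumerate(factors, start=1):
        # factors[k] has shape (rank_k, d_k): project mode k onto its rank.
        x = mode_dot(x, f, mode)
    n = x.ndim - 1
    # Contract all non-batch modes of the projected activations with the core.
    out = np.tensordot(x, core, axes=(list(range(1, n + 1)), list(range(n))))
    return out + bias

# Illustrative shapes: a batch of 8 order-3 activation tensors.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 5, 6))
tcl_factors = [rng.standard_normal((r, d)) for r, d in [(3, 4), (3, 5), (3, 6)]]
h = tcl(x, tcl_factors)                         # -> (8, 3, 3, 3)

ranks, n_out = (2, 2, 2), 10
trl_factors = [rng.standard_normal((r, d)) for r, d in zip(ranks, h.shape[1:])]
core = rng.standard_normal(ranks + (n_out,))
y = trl(h, core, trl_factors, np.zeros(n_out))  # -> (8, 10)
```

Because only the core and factor matrices are learned, the parameter count is governed by the chosen ranks rather than by the size of the flattened activation vector, which is the source of the savings the summary reports.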
MSC:
68T05 Learning and adaptive systems in artificial intelligence