
Nonlinear power method for computing eigenvectors of proximal operators and neural networks. (English) Zbl 1476.68244

Summary: Neural networks have revolutionized the field of data science, yielding remarkable solutions in a data-driven manner. For instance, in the field of mathematical imaging, they have surpassed traditional methods based on convex regularization. However, a fundamental theory supporting the practical applications is still in the early stages of development. We take a fresh look at neural networks and examine them via nonlinear eigenvalue analysis. The field of nonlinear spectral theory is still emerging, providing insights about nonlinear operators and systems. In this paper we view a neural network as a complex nonlinear operator and attempt to find its nonlinear eigenvectors. We first discuss the existence of such eigenvectors and analyze the kernel of ReLU networks. Then we study a nonlinear power method for generic nonlinear operators. For proximal operators associated with absolutely one-homogeneous convex regularization functionals, we can prove convergence of the method to an eigenvector of the proximal operator. This motivates us to apply a nonlinear power method to networks which are trained to act similarly to a proximal operator. In order to take the nonhomogeneity of neural networks into account, we define a modified version of the power method. We perform extensive experiments for different proximal operators and on various shallow and deep neural networks designed for image denoising. Proximal eigenvectors are used for geometric analysis of graphs, such as clustering or the computation of distance functions. For simple neural nets, we observe the influence of training data on the eigenvectors. For state-of-the-art denoising networks, we show that eigenvectors can be interpreted as (un)stable modes of the network, when contaminated with noise or other degradations.
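The provable case admits a compact numerical illustration. The sketch below is our own, not code from the paper: we take the proximal operator of \(\gamma\|\cdot\|_1\) (soft-thresholding), which belongs to an absolutely one-homogeneous convex regularizer, and run the normalized iteration \(u \leftarrow \operatorname{prox}(u)/\|\operatorname{prox}(u)\|\) until it settles at a nonlinear eigenvector, i.e., a vector satisfying \(\operatorname{prox}(u^*) = \lambda u^*\). The helper names prox_l1 and nonlinear_power_method, and the stopping rule, are hypothetical choices for this example.

```python
import numpy as np

def prox_l1(u, gamma):
    # Soft-thresholding: the proximal operator of gamma * ||u||_1,
    # an absolutely one-homogeneous convex regularization functional.
    return np.sign(u) * np.maximum(np.abs(u) - gamma, 0.0)

def nonlinear_power_method(prox, u0, n_iter=500, tol=1e-9):
    # Iterate u <- prox(u) / ||prox(u)|| until successive iterates agree;
    # the limit u* satisfies prox(u*) = lambda * u* for some lambda in (0, 1].
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iter):
        v = prox(u)
        nv = np.linalg.norm(v)
        if nv == 0.0:
            # The iterate was mapped to zero, i.e., it landed in the
            # kernel of the operator (cf. the kernel analysis in the paper).
            return u, 0.0
        u_next = v / nv
        if np.linalg.norm(u_next - u) < tol:
            u = u_next
            break
        u = u_next
    lam = float(u @ prox(u))  # Rayleigh-type quotient; ||u|| = 1
    return u, lam

rng = np.random.default_rng(0)
u0 = rng.standard_normal(50)
u, lam = nonlinear_power_method(lambda x: prox_l1(x, 0.05), u0)
print(lam, np.linalg.norm(prox_l1(u, 0.05) - lam * u))  # residual near 0
```

For this particular prox, the eigenvectors are signed indicator-like vectors whose nonzero entries share a common magnitude (shrinking all entries by \(\gamma\) then rescales them uniformly), so the iteration concentrates mass on the dominant entries and the printed residual \(\|\operatorname{prox}(u) - \lambda u\|\) should be near machine precision. The modified power method for nonhomogeneous operators such as trained denoising networks requires the additional adjustments described in the paper and is not reproduced here.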

MSC:

68T07 Artificial neural networks and deep learning
65H17 Numerical solution of nonlinear eigenvalue and eigenvector problems
65K10 Numerical optimization and variational techniques
68Q25 Analysis of algorithms and problem complexity
94A08 Image processing (compression, reconstruction, etc.) in information and communication theory
