Bilevel optimization with nonsmooth lower level problems. (English) Zbl 1444.94018

Aujol, Jean-François (ed.) et al., Scale space and variational methods in computer vision. 5th international conference, SSVM 2015, Lège-Cap Ferret, France, May 31 – June 4, 2015. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 9087, 654-665 (2015).
Summary: We consider a bilevel optimization approach for parameter learning in nonsmooth variational models. Existing approaches solve this problem by applying implicit differentiation to a sufficiently smooth approximation of the nondifferentiable lower level problem. We propose an alternative method based on differentiating the iterations of a nonlinear primal-dual algorithm. Our method computes exact (sub)gradients and can also be applied in the nonsmooth setting. We show preliminary results for the case of multi-label image segmentation.
For the entire collection see [Zbl 1362.68008].
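
The idea in the summary of differentiating the iterations of a solver can be illustrated on a toy problem. The sketch below is not the authors' method: it replaces the nonlinear primal-dual algorithm with plain proximal gradient steps, and the lower level problem (minimizing 0.5*(x - b)^2 + theta*|x| per component) is an assumption chosen only because its solution is known in closed form. All function names are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def unrolled_lower_level(b, theta, tau=0.5, n_iter=50):
    """Proximal gradient on 0.5*(x - b)^2 + theta*|x| per component.

    Alongside each iterate x, the derivative dx = dx/dtheta is propagated
    through the same update (forward-mode differentiation of the iterations).
    """
    x = [0.0] * len(b)
    dx = [0.0] * len(b)
    t = tau * theta  # threshold used by the proximal step
    for _ in range(n_iter):
        for i, b_i in enumerate(b):
            v = x[i] - tau * (x[i] - b_i)  # gradient step on the smooth term
            dv = (1.0 - tau) * dx[i]       # derivative of that step w.r.t. theta
            x[i] = soft_threshold(v, t)
            if abs(v) > t:
                sign = 1.0 if v > 0 else -1.0
                dx[i] = dv - tau * sign    # chain rule through the shrinkage
            else:
                dx[i] = 0.0                # one valid subgradient choice at the kink
    return x, dx


def upper_level_loss_and_grad(b, target, theta, **solver_args):
    """Upper level loss 0.5*||x(theta) - target||^2 and its derivative in theta."""
    x, dx = unrolled_lower_level(b, theta, **solver_args)
    loss = 0.5 * sum((x_i - t_i) ** 2 for x_i, t_i in zip(x, target))
    grad = sum((x_i - t_i) * dx_i for x_i, t_i, dx_i in zip(x, target, dx))
    return loss, grad


loss, grad = upper_level_loss_and_grad([2.0, 0.2, -2.0], [1.0, 0.0, -1.0], theta=0.5)
```

For this toy problem the lower level solution is the soft-thresholding of b by theta, so the upper level loss equals (1 - theta)^2 near theta = 0.5 and the propagated derivative can be checked against -2*(1 - theta). No smoothing of the absolute value is needed to obtain it.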

MSC:

94A08 Image processing (compression, reconstruction, etc.) in information and communication theory
49M25 Discrete approximations in optimal control
68T05 Learning and adaptive systems in artificial intelligence

Software:

Spearmint; SMAC; iPiano

References:

[1] Kunisch, K., Pock, T.: A bilevel optimization approach for parameter learning in variational models. SIAM Journal on Imaging Sciences 6(2), 938-983 (2013) · Zbl 1280.49053 · doi:10.1137/120882706
[2] De los Reyes, J.C., Schönlieb, C.B.: Image denoising: Learning noise distribution via PDE-constrained optimisation. Inverse Problems and Imaging 7, 1183-1214 (2013) · Zbl 1283.49005 · doi:10.3934/ipi.2013.7.1183
[3] Samuel, K., Tappen, M.: Learning optimized MAP estimates in continuously-valued MRF models. In: International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 477-484 (2009)
[4] Tappen, M., Samuel, K., Dean, C., Lyle, D.: The logistic random field - a convenient graphical model for learning parameters for MRF-based labeling. In: International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-8 (2008)
[5] Wainwright, M., Jaakkola, T., Willsky, A.: MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches. IEEE Transactions on Information Theory 51, 3697-3717 (2005) · Zbl 1318.94025 · doi:10.1109/TIT.2005.856938
[6] Hinton, G.: Training products of experts by minimizing contrastive divergence. Neural Computation 14(8), 1771-1800 (2002) · Zbl 1010.68111 · doi:10.1162/089976602760128018
[7] Taskar, B., Chatalbashev, V., Koller, D., Guestrin, C.: Learning structured prediction models: a large margin approach. In: International Conference on Machine Learning (ICML), pp. 896-903 (2005)
[8] LeCun, Y., Huang, F.: Loss functions for discriminative training of energy-based models. In: International Workshop on Artificial Intelligence and Statistics (2005)
[9] Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: Advances in Neural Information Processing Systems (NIPS), pp. 2951-2959 (2012)
[10] Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (ed.) LION 2011. LNCS, vol. 6683, pp. 507-523. Springer, Heidelberg (2011) · doi:10.1007/978-3-642-25566-3_40
[11] Eggensperger, K., Feurer, M., Hutter, F., Bergstra, J., Snoek, J., Hoos, H., Leyton-Brown, K.: Towards an empirical foundation for assessing Bayesian optimization of hyperparameters. In: NIPS Workshop (2013)
[12] Ranftl, R., Pock, T.: A deep variational model for image segmentation. In: Jiang, X., Hornegger, J., Koch, R. (eds.) GCPR 2014. LNCS, vol. 8753, pp. 104-115. Springer, Heidelberg (2014) · doi:10.1007/978-3-319-11752-2_9
[13] Peyré, G., Fadili, J.: Learning analysis sparsity priors. In: Proceedings of SampTA (2011)
[14] Chen, Y., Pock, T., Ranftl, R., Bischof, H.: Revisiting loss-specific training of filter-based MRFs for image restoration. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 271-281. Springer, Heidelberg (2013) · doi:10.1007/978-3-642-40602-7_30
[15] Chen, Y., Ranftl, R., Pock, T.: Insights into analysis operator learning: From patch-based sparse models to higher order MRFs. IEEE Transactions on Image Processing 23(3), 1060-1072 (2014) · Zbl 1374.94065 · doi:10.1109/TIP.2014.2299065
[16] Tappen, M.: Utilizing variational optimization to learn MRFs. In: International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-8 (2007)
[17] Domke, J.: Generic methods for optimization-based modeling. In: International Workshop on Artificial Intelligence and Statistics, pp. 318-326 (2012)
[18] Geman, D., Reynolds, G.: Constrained restoration and the recovery of discontinuities. IEEE Transactions on Pattern Analysis and Machine Intelligence 14, 367-383 (1992) · doi:10.1109/34.120331
[19] Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision 40(1), 120-145 (2011) · Zbl 1255.68217 · doi:10.1007/s10851-010-0251-1
[20] Chambolle, A., Pock, T.: On the ergodic convergence rates of a first-order primal-dual algorithm. Technical report (2014) (to appear) · Zbl 1350.49035
[21] Deledalle, C.A., Vaiter, S., Fadili, J., Peyré, G.: Stein Unbiased GrAdient estimator of the Risk (SUGAR) for multiple parameter selection. SIAM Journal on Imaging Sciences 7(4), 2448-2487 (2014) · Zbl 1361.94012 · doi:10.1137/140968045
[22] Foo, C.S., Do, C., Ng, A.: Efficient multiple hyperparameter learning for log-linear models. In: Advances in Neural Information Processing Systems (NIPS), pp. 377-384. Curran Associates, Inc. (2008)
[23] Borenstein, E., Sharon, E., Ullman, S.: Combining top-down and bottom-up segmentation. In: International Conference on Computer Vision and Pattern Recognition Workshop (CVPR) (2004)
[24] Ochs, P., Chen, Y., Brox, T., Pock, T.: iPiano: Inertial proximal algorithm for non-convex optimization. SIAM Journal on Imaging Sciences 7(2), 1388-1419 (2014) · Zbl 1296.90094 · doi:10.1137/130942954
[25] Liu, D. · Zbl 0696.90048 · doi:10.1007/BF01589116
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.