zbMATH — the first resource for mathematics

Characterization of the equivalence of robustification and regularization in linear and matrix regression. (English) Zbl 1403.62040
Summary: The notion of developing statistical methods in machine learning which are robust to adversarial perturbations in the underlying data has been the subject of increasing interest in recent years. A common feature of this work is that the adversarial robustification often corresponds exactly to regularization methods which appear as a loss function plus a penalty. In this paper we deepen and extend the understanding of the connection between robustification and regularization (as achieved by penalization) in regression problems. Specifically, in the context of linear regression, we characterize precisely under which conditions on the model of uncertainty used and on the loss function penalties robustification and regularization are equivalent. We extend the characterization of robustification and regularization to matrix regression problems (matrix completion and principal component analysis).
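The equivalence described in the summary can be illustrated on one well-known special case, which is not necessarily the general statement of the paper: for a fixed coefficient vector \(\beta\) and an uncertainty set \(\{\Delta : \|\Delta\|_F \le \lambda\}\), the worst-case loss \(\max_\Delta \|y - (X+\Delta)\beta\|_2\) equals \(\|y - X\beta\|_2 + \lambda\|\beta\|_2\). The following is a minimal numerical sketch of that identity. The helper names, the random data, and the value of `lam` are illustrative assumptions and do not come from the record.

```python
import math
import random

def norm2(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(x * x for x in v))

def frobenius(D):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in D for x in row))

def residual(X, y, beta):
    """Residual r = y - X beta."""
    return [yi - sum(xij * bj for xij, bj in zip(row, beta))
            for row, yi in zip(X, y)]

def worst_case_perturbation(X, y, beta, lam):
    """Rank-one perturbation of Frobenius norm lam that attains the maximum.

    Assumes the residual and beta are both nonzero.
    """
    r = residual(X, y, beta)
    scale = lam / (norm2(r) * norm2(beta))
    return [[-scale * ri * bj for bj in beta] for ri in r]

# Illustrative random data (assumption, not from the source).
random.seed(0)
n, p, lam = 6, 3, 0.5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
beta = [random.gauss(0, 1) for _ in range(p)]

delta = worst_case_perturbation(X, y, beta, lam)
X_pert = [[x + d for x, d in zip(rx, rd)] for rx, rd in zip(X, delta)]

robust_loss = norm2(residual(X_pert, y, beta))
regularized_loss = norm2(residual(X, y, beta)) + lam * norm2(beta)

print(robust_loss, regularized_loss)  # the two values agree up to rounding
```

The perturbation is chosen so that \(\Delta\beta\) points opposite to the residual, which makes the triangle inequality tight. No other \(\Delta\) with \(\|\Delta\|_F \le \lambda\) can exceed the regularized value.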

62F35 Robustness and adaptive procedures (parametric inference)
62H12 Estimation in multivariate analysis
62H25 Factor analysis and principal components; correspondence analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
68T05 Learning and adaptive systems in artificial intelligence
90C25 Convex programming
[1] Bauschke, H. H.; Combettes, P. L., Convex analysis and monotone operator theory in Hilbert spaces, (2011), Springer · Zbl 1218.47001
[2] Ben-Tal, A.; El Ghaoui, L.; Nemirovski, A., Robust optimization, (2009), Princeton University Press
[3] Ben-Tal, A.; Hazan, E.; Koren, T.; Mannor, S., Oracle-based robust optimization via online learning, Operations Research, 63, 3, 628-638, (2015) · Zbl 1327.90379
[4] Bertsimas, D.; Brown, D. B.; Caramanis, C., Theory and applications of robust optimization, SIAM Review, 53, 3, 464-501, (2011) · Zbl 1233.90259
[5] Bertsimas, D.; Gupta, V.; Kallus, N., Data-driven robust optimization, Mathematical Programming, (2017) · Zbl 1397.90298
[6] Bousquet, O.; Boucheron, S.; Lugosi, G., Advanced lectures on machine learning, (2004), Springer
[7] Boyd, S.; Vandenberghe, L., Convex optimization, (2004), Cambridge University Press · Zbl 1058.90049
[8] Bradic, J.; Fan, J.; Wang, W., Penalized composite quasi-likelihood for ultrahigh dimensional variable selection, Journal of the Royal Statistical Society, Series B, 73, 325-349, (2011) · Zbl 1411.62181
[9] Candès, E. J.; Li, X.; Ma, Y.; Wright, J., Robust principal component analysis?, Journal of the ACM, 58, 3, 11:1-37, (2011) · Zbl 1327.62369
[10] Candès, E.; Recht, B., Exact matrix completion via convex optimization, Communications of the ACM, 55, 6, 111-119, (2012)
[11] Caramanis, C.; Mannor, S.; Xu, H., Optimization for machine learning, (2011), MIT Press
[12] Carroll, R. J.; Ruppert, D.; Stefanski, L. A.; Crainiceanu, C. M., Measurement error in nonlinear models: A modern perspective, (2006), CRC Press · Zbl 1119.62063
[13] Croux, C.; Ruiz-Gazen, A., High breakdown estimators for principal components: the projection-pursuit approach revisited, Journal of Multivariate Analysis, 95, 206-226, (2005) · Zbl 1065.62040
[14] De Mol, C.; De Vito, E.; Rosasco, L., Elastic-net regularization in learning theory, Journal of Complexity, 25, 2, 201-230, (2009) · Zbl 1319.62087
[15] Eckart, C.; Young, G., The approximation of one matrix by another of lower rank, Psychometrika, 1, (1936) · JFM 62.1075.02
[16] Fan, J.; Fan, Y.; Barut, E., Adaptive robust variable selection, The Annals of Statistics, 42, 1, 324-351, (2014) · Zbl 1296.62144
[17] Fazel, M., Matrix rank minimization with applications, (2002), (Ph.D. thesis). Stanford University
[18] El Ghaoui, L.; Lebret, H., Robust solutions to least-squares problems with uncertain data, SIAM Journal on Matrix Analysis and Applications, 18, 4, 1035-1064, (1997) · Zbl 0891.65039
[19] Golub, G. H.; Van Loan, C. F., An analysis of the total least squares problem, SIAM Journal on Numerical Analysis, 17, 6, 883-893, (1980) · Zbl 0468.65011
[20] Goodfellow, I. J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S., Generative adversarial nets, Advances in neural information processing systems 27, 2672-2680, (2014)
[21] Goodfellow, I. J.; Shlens, J.; Szegedy, C., Explaining and harnessing adversarial examples, arXiv preprint arXiv:1412.6572, (2014)
[22] Hampel, F. R., The influence curve and its role in robust estimation, Journal of the American Statistical Association, 69, 383-393, (1974) · Zbl 0305.62031
[23] Hastie, T.; Tibshirani, R.; Friedman, J., The elements of statistical learning: data mining, inference, and prediction, (2009), Springer · Zbl 1273.62005
[24] Hill, R. W., Robust regression when there are outliers in the carriers, (1977), (Ph.D. thesis). Harvard University
[25] Horn, R. A.; Johnson, C. R., Matrix analysis, (2013), Cambridge University Press · Zbl 1267.15001
[26] Huber, P. J., Robust regression: asymptotics, conjectures and Monte Carlo, The Annals of Statistics, 1, 799-821, (1973) · Zbl 0289.62033
[27] Huber, P.; Ronchetti, E., Robust statistics, (2009), Wiley · Zbl 1276.62022
[28] Hubert, M.; Rousseeuw, P. J.; Van Aelst, S., High-breakdown robust multivariate methods, Statistical Science, 23, 1, 92-119, (2008) · Zbl 1327.62328
[29] Hubert, M.; Rousseeuw, P.; Vanden Branden, K., ROBPCA: A new approach to robust principal components analysis, Technometrics, 47, 64-79, (2005)
[30] Kukush, A.; Markovsky, I.; Van Huffel, S., Consistency of the structured total least squares estimator in a multivariate errors-in-variables model, Journal of Statistical Planning and Inference, 133, 315-358, (2005) · Zbl 1213.62097
[31] Lewis, A. S., Robust regularization, Technical Report, (2002), School of ORIE, Cornell University
[32] Lewis, A.; Pang, C., Lipschitz behavior of the robust regularization, SIAM Journal on Control and Optimization, 48, 5, 3080-3104, (2009) · Zbl 1202.49047
[33] Mallows, C. L., On some topics in robustness, Technical Report, (1975), Bell Laboratories
[34] Markovsky, I.; Van Huffel, S., Overview of total least-squares methods, Signal Processing, 87, 2283-2302, (2007) · Zbl 1186.94229
[35] Morgenthaler, S., A survey of robust statistics, Statistical Methods and Applications, 15, 271-293, (2007) · Zbl 1181.62029
[36] Mosci, S.; Rosasco, L.; Santoro, M.; Verri, A.; Villa, S., Solving structured sparsity regularization with proximal methods, Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 418-433, (2010), Springer
[37] Recht, B.; Fazel, M.; Parrilo, P. A., Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Review, 52, 3, 471-501, (2010) · Zbl 1198.90321
[38] Rousseeuw, P. J., Least median of squares regression, Journal of the American Statistical Association, 79, 871-880, (1984) · Zbl 0547.62046
[39] Rousseeuw, P.; Leroy, A., Robust regression and outlier detection, (1987), Wiley · Zbl 0711.62030
[40] Salibian-Barrera, M.; Van Aelst, S.; Willems, G., PCA based on multivariate MM-estimators with fast and robust bootstrap, Journal of the American Statistical Association, 101, 475, 1198-1211, (2005) · Zbl 1120.62319
[41] Shaham, U.; Yamada, Y.; Negahban, S., Understanding adversarial training: Increasing local stability of neural nets through robust optimization, arXiv preprint arXiv:1511.05432, (2015)
[42] SIGKDD; Netflix, Soft modelling by latent variables: the nonlinear iterative partial least squares (NIPALS) approach, Proceedings of the KDD Cup and Workshop, (2007)
[43] Tibshirani, R., Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B, 58, 267-288, (1996) · Zbl 0850.62538
[44] Tulabandhula, T.; Rudin, C., Robust optimization using machine learning for uncertainty sets, arXiv preprint arXiv:1407.1097, (2014) · Zbl 1319.68198
[45] Xu, H.; Caramanis, C.; Mannor, S., Robust regression and lasso, IEEE Transactions on Information Theory, 56, 7, 3561-3574, (2010) · Zbl 1366.62147
[46] Zou, H.; Hastie, T., Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B, 67, 2, 301-320, (2005) · Zbl 1069.62054
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.