Fast cross-validation for multi-penalty high-dimensional ridge regression. (English) Zbl 07499921

Summary: High-dimensional prediction with multiple data types needs to account for potentially strong differences in predictive signal. Ridge regression is a simple model for high-dimensional data that has challenged the predictive performance of many more complex models and learners, and that allows inclusion of data type-specific penalties. The main challenge for multi-penalty ridge is to optimize these penalties efficiently in a cross-validation (CV) setting, in particular for GLM and Cox ridge regression, which require an additional estimation loop by iterative weighted least squares (IWLS). Our main contribution is a computationally very efficient formula for the multi-penalty, sample-weighted hat-matrix, as used in the IWLS algorithm. As a result, nearly all computations are in low-dimensional space, yielding a speed-up of several orders of magnitude. We developed a flexible framework that supports multiple response types, unpenalized covariates, several performance criteria and repeated CV. Extensions to paired and preferential data types are included and illustrated on several cancer genomics survival prediction problems. Moreover, we present similar computational shortcuts for maximum marginal likelihood and Bayesian probit regression. The corresponding R package, multiridge, serves as a versatile standalone tool, but also as a fast benchmark for other more complex models and multi-view learners. Supplementary materials for this article are available online.
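
The low-dimensional computation mentioned in the summary can be illustrated with a minimal NumPy sketch. It is not the paper's code and does not use the multiridge API. It assumes the standard push-through (Woodbury-type) identity for weighted ridge regression with one penalty per data block, H = Γ (I + WΓ)^{-1} W, where Γ is the sum over blocks of X_b X_b^T / λ_b and W is the diagonal matrix of IWLS sample weights. The function names (`gamma_matrix`, `hat_dual`, `hat_primal`) and the toy dimensions are hypothetical, chosen only for illustration.

```python
import numpy as np

def gamma_matrix(blocks, penalties):
    """Sum over data types of X_b X_b^T / lambda_b (an n x n matrix)."""
    n = blocks[0].shape[0]
    gamma = np.zeros((n, n))
    for X_b, lam in zip(blocks, penalties):
        gamma += (X_b @ X_b.T) / lam
    return gamma

def hat_dual(blocks, penalties, weights):
    """Weighted multi-penalty ridge hat matrix using only n x n algebra.

    Assumed identity: H = Gamma (I + W Gamma)^{-1} W, with W = diag(weights).
    """
    gamma = gamma_matrix(blocks, penalties)
    n = gamma.shape[0]
    A = np.eye(n) + weights[:, None] * gamma      # I + W Gamma
    gamma_A_inv = np.linalg.solve(A.T, gamma.T).T  # Gamma A^{-1}
    return gamma_A_inv * weights[None, :]          # right-multiply by W

def hat_primal(blocks, penalties, weights):
    """Reference version in p-dimensional space: X (X'WX + Lambda)^{-1} X'W."""
    X = np.hstack(blocks)
    lam = np.concatenate(
        [np.full(X_b.shape[1], l) for X_b, l in zip(blocks, penalties)]
    )
    XtW = X.T * weights[None, :]                   # X' W
    return X @ np.linalg.solve(XtW @ X + np.diag(lam), XtW)

# Toy data: n = 8 samples, two data types with 30 and 50 covariates.
rng = np.random.default_rng(0)
n = 8
blocks = [rng.standard_normal((n, 30)), rng.standard_normal((n, 50))]
penalties = [2.0, 15.0]                  # one penalty per data type
weights = rng.uniform(0.1, 0.25, size=n)  # e.g. IWLS weights for a logistic fit

H_dual = hat_dual(blocks, penalties, weights)
H_primal = hat_primal(blocks, penalties, weights)
max_abs_diff = float(np.max(np.abs(H_dual - H_primal)))
print(max_abs_diff)
```

In this sketch the block products X_b X_b^T do not depend on the penalties or the weights, so they could be computed once and reused; changing a penalty or updating the IWLS weights then only involves n x n operations.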


62-XX Statistics

