**Stability.**
*(English)*
Zbl 1440.62402

Summary: Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in stability of statistical results relative to “reasonable” perturbations to data and to the model used. Jackknife, bootstrap, and cross-validation are based on perturbations to data, while robust statistics methods deal with perturbations to models.

In this article, a case is made for the importance of stability in statistics. Firstly, we motivate the necessity of stability for interpretable and reliable encoding models from brain fMRI signals. Secondly, we find strong evidence in the literature to demonstrate the central role of stability in statistical inference, such as sensitivity analysis and effect detection. Thirdly, a smoothing parameter selector based on estimation stability (ES), ES-CV, is proposed for Lasso, in order to bring stability to bear on cross-validation (CV). ES-CV is then utilized in the encoding models to reduce the number of predictors by 60% with almost no loss (1.3%) of prediction performance across over 2,000 voxels. Lastly, a novel “stability” argument is seen to drive new results that shed light on the intriguing interactions between sample-to-sample variability and heavier-tailed error distributions (e.g., double-exponential) in high-dimensional regression models with \(p\) predictors and \(n\) independent samples. In particular, when \(p/n\rightarrow\kappa\in(0.3,1)\) and the error distribution is double-exponential, the ordinary least squares (OLS) estimator is a better estimator than the least absolute deviation (LAD) estimator.
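The closing claim about OLS and LAD can be explored numerically. The following is a minimal simulation sketch, not the authors' code: it assumes an i.i.d. standard Gaussian design, a true coefficient vector of zeros, and mean squared estimation error as the yardstick, none of which are specified in the summary. The function names `ols_fit`, `lad_fit` and `simulate` are hypothetical, and LAD is computed through its standard linear-programming formulation using `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def ols_fit(X, y):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def lad_fit(X, y):
    """Least absolute deviation coefficients via a linear program.

    Minimise sum(u) over (b, u) subject to -u <= y - X b <= u, u >= 0.
    """
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), np.ones(n)])
    # X b - u <= y   and   -X b - u <= -y
    A_ub = np.block([[X, -np.eye(n)], [-X, -np.eye(n)]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * p + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


def simulate(n=100, kappa=0.5, reps=20, seed=0):
    """Average squared estimation error of OLS and LAD with p = kappa * n.

    Assumptions (illustrative only): Gaussian design, beta = 0,
    double-exponential (Laplace) errors with unit scale.
    """
    rng = np.random.default_rng(seed)
    p = int(kappa * n)
    beta_true = np.zeros(p)
    err_ols, err_lad = [], []
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        y = X @ beta_true + rng.laplace(size=n)
        err_ols.append(np.sum((ols_fit(X, y) - beta_true) ** 2))
        err_lad.append(np.sum((lad_fit(X, y) - beta_true) ** 2))
    return float(np.mean(err_ols)), float(np.mean(err_lad))


if __name__ == "__main__":
    mse_ols, mse_lad = simulate()
    print(f"OLS squared error: {mse_ols:.3f}")
    print(f"LAD squared error: {mse_lad:.3f}")
```

Varying `kappa` between small values and values close to 1 gives a rough picture of how the comparison depends on the ratio \(p/n\); a finite-sample run of this kind is only suggestive and does not replace the asymptotic result described in the summary.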

### MSC:

62R07 | Statistical aspects of big data and data science |

62F35 | Robustness and adaptive procedures (parametric inference) |

62G35 | Nonparametric robustness |

62J07 | Ridge regression; shrinkage estimators (Lasso) |

### Keywords:

cross-validation; double exponential error; estimation stability; fMRI; high-dim regression; Lasso; movie reconstruction; robust statistics; stability