Selective inference with a randomized response. (English) Zbl 1392.62144

The authors investigate general selective inference with a randomized response in model selection. They introduce the selective likelihood ratio and a framework for the asymptotic analysis of selective models. The problems of consistent estimation and weak convergence are studied. It is shown that randomized selection schemes lead to an increase in the power of the resulting tests. Linear regression models are used as examples. It is shown that a central limit theorem holds under mild conditions on the selective distributions, which makes it possible to develop asymptotic selective inference in nonparametric settings. Finally, the paper discusses two extensions to multiple randomized selections: selective inference after cross-validation for the LASSO, and collaborative selective inference. Some additional sampling schemes and all proofs are given in the supplementary materials.
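The idea of selecting on a randomized response can be illustrated with a minimal univariate sketch. It is not taken from the paper. The setup is an assumption made for illustration only: a single observation Y ~ N(mu, 1), independent Gaussian noise omega ~ N(0, gamma^2), and a "report the result only if Y + omega exceeds a threshold" selection rule. The function name `selective_pvalue` and its parameters are hypothetical. Under H0: mu = 0, the selective p-value is P(Y >= y_obs | Y + omega > threshold), computed here by numerical integration.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def selective_pvalue(y_obs, threshold, gamma, upper=12.0, steps=20000):
    """One-sided selective p-value for H0: mu = 0 (illustrative assumption).

    Model: Y ~ N(0, 1) under H0, omega ~ N(0, gamma^2) independent of Y,
    and the test is only carried out when Y + omega > threshold.
    Returns P(Y >= y_obs | Y + omega > threshold).
    """
    # Under H0, Y + omega ~ N(0, 1 + gamma^2), so the selection probability is
    p_select = 1.0 - norm_cdf(threshold / math.sqrt(1.0 + gamma ** 2))

    if y_obs >= upper:
        return 0.0

    # Joint probability: integrate phi(y) * P(omega > threshold - y) over y >= y_obs.
    def integrand(y):
        return norm_pdf(y) * norm_cdf((y - threshold) / gamma)

    # Trapezoidal rule on [y_obs, upper]; the tail beyond `upper` is negligible.
    h = (upper - y_obs) / steps
    total = 0.5 * (integrand(y_obs) + integrand(upper))
    for i in range(1, steps):
        total += integrand(y_obs + i * h)

    return min(1.0, total * h / p_select)


if __name__ == "__main__":
    y = 3.0
    print("naive p-value:    ", 1.0 - norm_cdf(y))
    print("selective p-value:", selective_pvalue(y, threshold=2.0, gamma=1.0))
```

In this toy setting the selective p-value is larger than the naive one, because it accounts for the fact that only observations passing the randomized threshold are tested. The randomization scale gamma controls how much of the information in Y is spent on selection.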


62G99 Nonparametric inference
62J07 Ridge regression; shrinkage estimators (Lasso)
62J05 Linear regression; mixed models


tmg; covTest

