Gao, Chao; van der Vaart, Aad W.; Zhou, Harrison H.
A general framework for Bayes structured linear models. (English) Zbl 1471.62241
Ann. Stat. 48, No. 5, 2848-2878 (2020).

The paper under review provides a unified methodology and theory for both Bayes high-dimensional statistics and Bayes nonparametric statistics in a general framework of structured linear models. The authors first introduce a unified view of various high-dimensional and nonparametric models, and then propose a single prior distribution for all models in the considered framework. Optimal rates of convergence of the posterior distributions are established under appropriate conditions. The results directly lead to exact minimax posterior contraction rates in the stochastic block model, biclustering, sparse linear regression, regression with group sparsity, multitask learning and dictionary learning. Moreover, a general posterior oracle inequality, which allows arbitrary model misspecification, is also derived. The main results are illustrated by examples ranging from nonparametric estimation to high-dimensional statistics.

Reviewer: Joseph Melamed (Los Angeles)

Cited in 17 Documents

MSC:
62C10 Bayesian problems; characterization of Bayes procedures
62F15 Bayesian inference
62G05 Nonparametric estimation
62J05 Linear regression; mixed models

Keywords: oracle inequality; stochastic block model; graphon; sparse linear regression; aggregation; dictionary learning; posterior contraction