×

Detection thresholds for the \(\beta\)-model on sparse graphs. (English) Zbl 1392.62131

Summary: In this paper, we study sharp thresholds for detecting sparse signals in \(\beta\)-models for potentially sparse random graphs. The results demonstrate interesting interplay between graph sparsity, signal sparsity and signal strength. In regimes of moderately dense signals, irrespective of graph sparsity, the detection thresholds mirror corresponding results in independent Gaussian sequence problems. For sparser signals, extreme graph sparsity implies that all tests are asymptotically powerless, irrespective of the signal strength. On the other hand, sharp detection thresholds are obtained, up to matching constants, on denser graphs. The phase transitions mentioned above are sharp. As a crucial ingredient, we study a version of the higher criticism test which is provably sharp up to optimal constants in the regime of sparse signals. The theoretical results are further verified by numerical simulations.

MSC:

62G10 Nonparametric hypothesis testing
05C80 Random graphs (graph-theoretic aspects)
62G20 Asymptotic properties of nonparametric inference
62C20 Minimax procedures in statistical decision theory
PDF BibTeX XML Cite
Full Text: DOI arXiv Euclid

References:

[1] Addario-Berry, L., Broutin, N., Devroye, L. and Lugosi, G. (2010). On combinatorial testing problems. Ann. Statist.38 3063-3092. · Zbl 1200.62059
[2] Arias-Castro, E., Candès, E. J. and Plan, Y. (2011). Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism. Ann. Statist.39 2533-2556. · Zbl 1231.62136
[3] Arias-Castro, E., Donoho, D. L. and Huo, X. (2005). Near-optimal detection of geometric objects by fast multiscale methods. IEEE Trans. Inform. Theory 51 2402-2425. · Zbl 1282.94014
[4] Arias-Castro, E. and Verzelen, N. (2013). Community detection in random networks. Available at arXiv:1302.7099.
[5] Arias-Castro, E. and Wang, M. (2015). The sparse Poisson means model. Electron. J. Stat.9 2170-2201. · Zbl 1337.62088
[6] Arias-Castro, E., Candès, E. J., Helgason, H. and Zeitouni, O. (2008). Searching for a trail of evidence in a maze. Ann. Statist.36 1726-1757. · Zbl 1143.62006
[7] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286 509-512. · Zbl 1226.05223
[8] Barnett, I., Mukherjee, R. and Lin, X. (2017). The generalized higher criticism for testing SNP-set effects in genetic association studies. J. Amer. Statist. Assoc.112 64-76.
[9] Barvinok, A. and Hartigan, J. A. (2013). The number of graphs and a random graph with a given degree sequence. Random Structures Algorithms 42 301-348. · Zbl 1264.05125
[10] Bickel, P. J., Chen, A. and Levina, E. (2011). The method of moments and degree distributions for network models. Ann. Statist.39 2280-2301. · Zbl 1232.91577
[11] Blitzstein, J. and Diaconis, P. (2010). A sequential importance sampling algorithm for generating random graphs with prescribed degrees. Internet Math.6 489-522. · Zbl 1238.60084
[12] Bollobás, B. (2001). Random Graphs, 2nd ed. Cambridge Studies in Advanced Mathematics 73. Cambridge Univ. Press, Cambridge.
[13] Cai, T. T. and Yuan, M. (2014). Rate-optimal detection of very short signal segments. Available at arXiv:1407.2812.
[14] Chatterjee, S., Diaconis, P. and Sly, A. (2011). Random graphs with a given degree sequence. Ann. Appl. Probab.21 1400-1435. · Zbl 1234.05206
[15] Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Statist.32 962-994. · Zbl 1092.62051
[16] Fienberg, S. E. and Wasserman, S. (1981). Categorical data analysis of single sociometric relations. Sociol. Method.12 156-192.
[17] Goodreau, S. M. (2007). Advances in exponential random graph (p∗) models applied to a large social network. Soc. Netw.29 231-248.
[18] Hall, P. and Jin, J. (2010). Innovated higher criticism for detecting sparse signals in correlated noise. Ann. Statist.38 1686-1732. · Zbl 1189.62080
[19] Hara, H. and Takemura, A. (2010). Connecting tables with zero-one entries by a subset of a Markov basis. In Algebraic Methods in Statistics and Probability II. Contemp. Math.516 199-213. Amer. Math. Soc., Providence, RI. · Zbl 1196.62073
[20] Hillar, C. and Wibisono, A. (2013). Maximum entropy distributions on graphs. Available at arXiv:1301.3321.
[21] Holland, P. W. and Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. J. Amer. Statist. Assoc.76 33-65. · Zbl 0457.62090
[22] Ingster, Y. I. and Suslina, I. A. (2003). Nonparametric Goodness-of-Fit Testing Under Gaussian Models. Lecture Notes in Statistics 169. Springer, New York. · Zbl 1013.62049
[23] Ingster, Y. I., Tsybakov, A. B. and Verzelen, N. (2010). Detection boundary in sparse regression. Electron. J. Stat.4 1476-1526. · Zbl 1329.62314
[24] Karrer, B. and Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107.
[25] Karwa, V. and Slavković, A. (2016). Inference using noisy degrees: Differentially private \(β\)-model and synthetic graphs. Ann. Statist.44 87-112. · Zbl 1331.62114
[26] Lauritzen, S. L. (2002). Rasch models with exchangeable rows and columns. Research Report Series, No. R-02-2005, Dept. Mathematical Sciences, Aalborg Univ.
[27] Lauritzen, S. L. (2008). Exchangeable Rasch matrices. Rend. Mat. Appl. (7) 28 83-95. · Zbl 1223.60008
[28] Mukherjee, R., Mukherjee, S. and Sen, S. (2018). Supplement to “Detection thresholds for the \(β\)-model on sparse graphs.” DOI:10.1214/17-AOS1585SUPP.
[29] Mukherjee, R., Pillai, N. S. and Lin, X. (2015). Hypothesis testing for high-dimensional sparse binary regression. Ann. Statist.43 352-381. · Zbl 1308.62094
[30] Ogawa, M., Hara, H. and Takemura, A. (2013). Graver basis for an undirected graph and its application to testing the beta model of random graphs. Ann. Inst. Statist. Math.65 191-212. · Zbl 1440.13114
[31] Perry, P. O. and Wolfe, P. J. (2012). Null models for network data. Available at arXiv:1201.5871.
[32] Petrović, S., Rinaldo, A. and Fienberg, S. E. (2010). Algebraic statistics for a directed random graph model with reciprocation. In Algebraic Methods in Statistics and Probability II. Contemp. Math.516 261-283. Amer. Math. Soc., Providence, RI. · Zbl 1197.13025
[33] Rinaldo, A., Petrović, S. and Fienberg, S. E. (2013). Maximum likelihood estimation in the \(β\)-model. Ann. Statist.41 1085-1110. · Zbl 1292.62052
[34] Robins, G., Pattison, P., Kalish, Y. and Lusher, D. (2007). An introduction to exponential random graph (p∗) models for social networks. Soc. Netw.29 173-191.
[35] Verzelen, N. and Arias-Castro, E. (2015). Community detection in sparse random networks. Ann. Appl. Probab.25 3465-3510. · Zbl 1326.05145
[36] Watts, D. J. and Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature 393 440-442. · Zbl 1368.05139
[37] Yan, T., Qin, H. and Wang, H. (2016). Asymptotics in undirected random graph models parameterized by the strengths of vertices. Statist. Sinica 26 273-293. · Zbl 1419.62046
[38] Yan, T. and Xu, J. (2013). A central limit theorem in the \(β\)-model for undirected random graphs with a diverging number of vertices. Biometrika 100 519-524. · Zbl 1452.62214
[39] Yan, T., Zhao, Y. and Qin, H. (2015). Asymptotic normality in the maximum entropy models on graphs with an increasing number of parameters. J. Multivariate Anal.133 61-76. · Zbl 1304.62038
[40] Yan, X.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.