×

zbMATH — the first resource for mathematics

On combinatorial testing problems. (English) Zbl 1200.62059
Summary: We study a class of hypothesis testing problems in which, upon observing the realization of an \(n\)-dimensional Gaussian vector, one has to decide whether the vector was drawn from a standard normal distribution or, alternatively, whether there is a subset of the components belonging to a certain given class of sets whose elements have been “contaminated,” that is, have a mean different from zero. We establish some general conditions under which testing is possible and others under which testing is hopeless with a small risk. The combinatorial and geometric structure of the class of sets is shown to play a crucial role. The bounds are illustrated on various examples.

MSC:
62H15 Hypothesis testing in multivariate analysis
05C90 Applications of graph theory
62M99 Inference from stochastic processes
62F03 Parametric hypothesis testing
62F05 Asymptotic properties of parametric tests
Software:
LAS
PDF BibTeX XML Cite
Full Text: DOI arXiv
References:
[1] Aldous, D. J. (1990). The random walk construction of uniform spanning trees and uniform labelled trees. SIAM J. Discrete Math. 3 450-465. · Zbl 0717.05028 · doi:10.1137/0403039
[2] Alon, N., Krivelevich, M. and Sudakov, B. (1999). Finding a large hidden clique in a random graph. Randoms Structures Algorithms 13 457-466. · Zbl 0959.05082 · doi:10.1002/(SICI)1098-2418(199810/12)13:3/4<457::AID-RSA14>3.0.CO;2-W
[3] Arias-Castro, E., Candès, E. J., Helgason, H. and Zeitouni, O. (2008). Searching for a trail of evidence in a maze. Ann. Statist. 36 1726-1757. · Zbl 1143.62006 · doi:10.1214/07-AOS526
[4] Arias-Castro, E., Candès, E. and Durand, A. (2009). Detection of abnormal clusters in a network. Technical report, Univ. California, San Diego.
[5] Arlot, S., Blanchard, G. and Roquain, E. (2010a). Some nonasymptotic results on resampling in high dimension, I: Confidence regions. Ann. Statist. 38 51-82. · Zbl 1180.62066 · doi:10.1214/08-AOS667
[6] Arlot, S., Blanchard, G. and Roquain, E. (2010b). Some nonasymptotic results on resampling in high dimension. II. Multiple tests. Ann. Statist. 38 83-99. · Zbl 1181.62055 · doi:10.1214/08-AOS668
[7] Baraud, Y. (2002). Non-asymptotic minimax rates of testing in signal detection. Bernoulli 8 577-606. · Zbl 1007.62042
[8] Benjamini, I., Lyons, R., Peres, Y. and Schramm, O. (2001). Uniform spanning forests. Ann. Probab. 29 1-65. · Zbl 1016.60009
[9] Bhattacharyya, A. (1946). On a measure of divergence between two multinomial populations. Sankhyā 7 401-406. · Zbl 0063.00366
[10] Boucheron, S., Lugosi, G. and Massart, P. (2000). A sharp concentration inequality with applications. Random Structures Algorithms 16 277-292. · Zbl 0954.60008 · doi:10.1002/(SICI)1098-2418(200005)16:3<277::AID-RSA4>3.0.CO;2-1
[11] Broder, A. (1989). Generating random spanning trees. In 30th Annual Symposium on Foundations of Computer Science 442-447. IEEE Press, Research Triangle Park, NC.
[12] Devroye, L. and Györfi, L. (1985). Nonparametric Density Estimation: The L 1 View . Wiley, New York. · Zbl 0546.62015
[13] Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition . Springer, New York. · Zbl 0853.68150
[14] Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Statist. 32 962-994. · Zbl 1092.62051 · doi:10.1214/009053604000000265
[15] Dubdashi, D. and Ranjan, D. (1998). Balls and bins: A study in negative dependence. Random Structures Algorithms 13 99-124. · Zbl 0964.60503 · doi:10.1002/(SICI)1098-2418(199809)13:2<99::AID-RSA1>3.0.CO;2-M
[16] Dudley, R. M. (1978). Central limit theorems for empirical measures. Ann. Probab. 6 899-929. · Zbl 0404.60016 · doi:10.1214/aop/1176995384
[17] Durot, C. and Rozenholc, Y. (2006). An adaptive test for zero mean. Math. Methods Statist. 15 26-60.
[18] Feder, T. and Mihail, M. (1992). Balanced matroids. In STOC’92: Proceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing 26-38. ACM, New York.
[19] Feige, U. and Krauthgamer, R. (2000). Finding and certifying a large hidden clique in a semirandom graph. Random Structures Algorithms 16 195-208. · Zbl 0940.05050 · doi:10.1002/(SICI)1098-2418(200003)16:2<195::AID-RSA5>3.0.CO;2-A
[20] Glaz, J., Naus, J. and Wallenstein, S. (2001). Scan Statistics . Springer, New York. · Zbl 0983.62075
[21] Grimmett, G. R. and Winkler, S. N. (2004). Negative association in uniform forests and connected graphs. Random Structures Algorithms 24 444-460. · Zbl 1048.60007 · doi:10.1002/rsa.20012
[22] Haussler, D. (1995). Sphere packing numbers for subsets of the boolean n -cube with bounded Vapnik-Chervonenkis dimension. J. Combin. Theory Ser. A 69 217-232. · Zbl 0818.60005 · doi:10.1016/0097-3165(95)90052-7
[23] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13-30. JSTOR: · Zbl 0127.10602 · doi:10.2307/2282952 · links.jstor.org
[24] Ingster, Y. I. (1999). Minimax detection of a signal for l p n -balls. Math. Methods Statist. 7 401-428. · Zbl 1103.62312
[25] Jerrum, M., Sinclair, A. and Vigoda, E. (2004). A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. J. ACM 51 671-697. · Zbl 1204.65044 · doi:10.1145/1008731.1008738
[26] Le Cam, L. (1970). On the assumptions used to prove asymptotic normality of maximum likelihood estimates. Ann. Math. Statist. 41 802-828. · Zbl 0246.62039 · doi:10.1214/aoms/1177696960
[27] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces . Springer, New York. · Zbl 0748.60004
[28] Moon, J. W. (1970). Counting Labelled Trees. Canadian Mathematical Monographs 1 . Canadian Mathematical Congress, Montreal. · Zbl 0214.23204
[29] Propp, J. G. and Wilson, D. B. (1998). How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph. J. Algorithms 27 170-217. · Zbl 0919.68092 · doi:10.1006/jagm.1997.0917
[30] Romano, J. P. and Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypothesis testing. J. Amer. Statist. Assoc. 100 94-108. · Zbl 1117.62416 · doi:10.1198/016214504000000539 · masetto.asa.catchword.org
[31] Shabalin, A. A., Weigman, V. J., Perou, C. M. and Nobel, A. B. (2009). Finding large average submatrices in high dimensional data. Ann. Appl. Statist. 3 985-1012. · Zbl 1196.62087 · doi:10.1214/09-AOAS239
[32] Slepian, D. (1962). The one-sided barrier problem for Gaussian noise. Bell System Tech. J. 41 463-501.
[33] Talagrand, M. (2005). The Generic Chaining . Springer, New York. · Zbl 1075.60001
[34] Tsirelson, B. S., Ibragimov, I. A. and Sudakov, V. N. (1976). Norm of Gaussian sample function. In Proceedings of the 3rd Japan-U.S.S.R. Symposium on Probability Theory. Lecture Notes in Math. 550 20-41. Springer, Berlin. · Zbl 0359.60019
[35] Vapnik, V. N. and Chervonenkis, A. Y. (1971). On the uniform convergence of relative frequences of events to their probabilities. Theory Probab. Appl. 16 264-280. · Zbl 0247.60005 · doi:10.1137/1116025
[36] Vonnegut, K. (1973). Breakfast of Champions . Delacorte Press, New York.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.