**Higher criticism for detecting sparse heterogeneous mixtures.**
*(English)*
Zbl 1092.62051

Summary: Higher criticism, or second-level significance testing, is a multiple-comparisons concept mentioned in passing by J. W. Tukey [The higher criticism. Course Notes Stat. 411, Princeton Univ. (1976)]. It concerns a situation where there are many independent tests of significance and one is interested in rejecting the joint null hypothesis. Tukey suggested comparing the fraction of observed significances at a given \(\alpha\)-level to the expected fraction under the joint null. In fact, he suggested standardizing the difference of the two quantities and forming a z-score; the resulting z-score tests the significance of the body of significance tests. We consider a generalization, where we maximize this z-score over a range of significance levels \(0<\alpha\leq\alpha_0\). We are able to show that the resulting higher criticism statistic is effective at resolving a very subtle testing problem: testing whether \(n\) normal means are all zero versus the alternative that a small fraction is nonzero.
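The maximized z-score described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the maximum is taken over the sorted p-values, so that the observed fraction of significances at level \(p_{(i)}\) is \(i/n\). The function name `higher_criticism`, the default `alpha0=0.5`, and the clipping bounds are assumptions made for the example.

```python
import numpy as np

def higher_criticism(p_values, alpha0=0.5):
    """Largest standardized excess of small p-values over levels up to alpha0.

    Sketch only: evaluates the z-score at each sorted p-value p_(i),
    where the observed fraction of significances is i/n and the
    expected fraction under the joint null is p_(i).
    """
    p = np.sort(np.asarray(p_values, dtype=float))
    n = p.size
    # Assumption: clip away exact 0 and 1 so the standard error is never zero.
    p = np.clip(p, 1e-300, 1.0 - 1e-12)
    i = np.arange(1, n + 1)
    # z-score: (observed fraction - expected fraction) / binomial standard error
    z = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    keep = i <= alpha0 * n  # restrict to the smallest alpha0 * n p-values
    if not keep.any():
        raise ValueError("alpha0 * n must be at least 1")
    return float(z[keep].max())

# Usage: one unusually small p-value among four inflates the statistic.
hc = higher_criticism([0.01, 0.5, 0.6, 0.9])
```

Large values of the statistic speak against the joint null; the threshold at which to reject is not specified in this sketch.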

The subtlety of this “sparse normal means” testing problem can be seen from work of Y. I. Ingster [Math. Methods. Stat. 6, 47–69 (1997; Zbl 0878.62005)] and J. Jin [Detection boundary for sparse mixtures. Unpubl. manuscript. (2002)], who studied such problems in great detail. In their studies, they identified an interesting range of cases where the small fraction of nonzero means is so small that the alternative hypothesis exhibits little noticeable effect on the distribution of the p-values either for the bulk of the tests or for the few most highly significant tests. In this range, when the amplitude of nonzero means is calibrated with the fraction of nonzero means, the likelihood ratio test for a precisely specified alternative would still succeed in separating the two hypotheses.
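One standard way to make this calibration concrete, in notation that is assumed here rather than taken from the summary, is to let the fraction of nonzero means be \(\varepsilon_n=n^{-\beta}\) with \(1/2<\beta<1\) and the common amplitude be \(\mu_n=\sqrt{2r\log n}\) with \(0<r<1\). The testing problem then reads

\[
H_0:\ X_i\sim N(0,1)\qquad\text{versus}\qquad H_1^{(n)}:\ X_i\sim(1-\varepsilon_n)\,N(0,1)+\varepsilon_n\,N(\mu_n,1),\qquad i=1,\dots,n,
\]

and the likelihood ratio test separates the two hypotheses asymptotically when \(r>\rho^*(\beta)\), where

\[
\rho^*(\beta)=\begin{cases}\beta-\tfrac12, & \tfrac12<\beta\leq\tfrac34,\\[2pt] \bigl(1-\sqrt{1-\beta}\bigr)^2, & \tfrac34<\beta<1.\end{cases}
\]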

We show that the higher criticism is successful throughout the same region of amplitude and sparsity where the likelihood ratio test would succeed. Since it does not require a specification of the alternative, this shows that higher criticism is in a sense optimally adaptive to unknown sparsity and size of the nonnull effects. While our theoretical work is largely asymptotic, we provide simulations in finite samples and suggest some possible applications. We also show that higher criticism works well over a range of non-Gaussian cases.

### MSC:

62G10 | Nonparametric hypothesis testing |

62J15 | Paired and multiple comparisons; multiple testing |

62G30 | Order statistics; empirical distribution functions |

62G20 | Asymptotic properties of nonparametric inference |

62G32 | Statistics of extreme values; tail inference |

### Keywords:

multiple comparisons; combining many p-values; sparse normal means; thresholding; normalized empirical process

### Citations:

Zbl 0878.62005

*D. Donoho* and *J. Jin*, Ann. Stat. 32, No. 3, 962–994 (2004; Zbl 1092.62051)

### References:

[1] | Abramovich, F., Benjamini, Y., Donoho, D. and Johnstone, I. (2000). Adapting to unknown sparsity by controlling the false discovery rate. Technical report 2000-19, Dept. Statistics, Stanford Univ. · Zbl 1092.62005 |

[2] | Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. Ann. Math. Statist. 23 193–212. · Zbl 0048.11301 |

[3] | Becker, B. J. (1994). Combining significance levels. In The Handbook of Research Synthesis (H. Cooper and L. Hedges, eds.) Chap. 15. Russell Sage Foundation, New York. |

[4] | Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289–300. · Zbl 0809.62014 |

[5] | Berk, R. H. and Jones, D. H. (1979). Goodness-of-fit test statistics that dominate the Kolmogorov statistic. Z. Wahrsch. Verw. Gebiete 47 47–59. · Zbl 0379.62026 |

[6] | Bickel, P. J. and Chernoff, H. (1993). Asymptotic distribution of the likelihood ratio statistic in a prototypical nonregular problem. In Statistics and Probability: A Raghu Raj Bahadur Festschrift (J. K. Ghosh, S. K. Mitra, K. R. Parthasarathy and B. L. S. Prakasa Rao, eds.) 83–96. Wiley Eastern, New Delhi. |

[7] | Borovkov, A. A. and Sycheva, N. M. (1968). On some asymptotically optimal nonparametric tests. Teor. Verojatnost. i Primenen. 13 385–418. · Zbl 0165.21204 |

[8] | Borovkov, A. A. and Sycheva, N. M. (1970). On asymptotically optimal nonparametric criteria. In Nonparametric Techniques in Statistical Inference (M. L. Puri, ed.) 259–266. Cambridge Univ. Press. |

[9] | Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis. Addison–Wesley, Reading, MA. · Zbl 0271.62044 |

[10] | Brožek, J. and Tiede, K. (1952). Reliable and questionable significance in a series of statistical tests. Psychological Bull. 49 339–341. |

[11] | Darling, D. A. and Erdős, P. (1956). A limit theorem for the maximum of normalized sums of independent random variables. Duke Math. J. 23 143–155. · Zbl 0070.13806 |

[12] | Eicker, F. (1979). The asymptotic distribution of the suprema of the standardized empirical processes. Ann. Statist. 7 116–138. · Zbl 0398.62014 |

[13] | Fisher, R. A. (1932). Statistical Methods for Research Workers, 4th ed. Oliver and Boyd, Edinburgh. · JFM 58.1161.04 |

[14] | Hartigan, J. A. (1985). A failure of likelihood asymptotics for normal mixtures. In Proc. Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer (L. M. Le Cam and R. A. Olshen, eds.) 2 807–810. Wadsworth, Monterey, CA. · Zbl 1373.62070 |

[15] | Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures. Wiley, New York. · Zbl 0731.62125 |

[16] | Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distribution. Math. Methods Statist. 6 47–69. · Zbl 0878.62005 |

[17] | Ingster, Y. I. (1999). Minimax detection of a signal for \(l^p_n\)-balls. Math. Methods Statist. 7 401–428. · Zbl 1103.62312 |

[18] | Ingster, Y. I. (2002). Adaptive detection of a signal of growing dimension, I, II. Math. Methods Statist. 10 395–421; 11 37–68. · Zbl 1005.62051 |

[19] | Ingster, Y. I. and Lepski, O. (2002). On multichannel signal detection. |

[20] | Ingster, Y. I. and Suslina, I. A. (2000). Minimax nonparametric hypothesis testing for ellipsoids and Besov bodies. ESAIM Probab. Statist. (electronic) 4 53–135. · Zbl 1110.62321 |

[21] | Ingster, Y. I. and Suslina, I. A. (2004). On multichannel detection of a signal of known shape. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov (POMI). · Zbl 1259.94029 |

[22] | Jaeschke, D. (1979). The asymptotic distribution of the supremum of the standardized empirical distribution function on subintervals. Ann. Statist. 7 108–115. · Zbl 0398.62013 |

[23] | Jin, J. (2002). Detection boundary for sparse mixtures. Unpublished manuscript. |

[24] | Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995). Continuous Univariate Distributions, 2nd ed. 2. Wiley, New York. |

[25] | Kendall, D. G. and Kendall, W. S. (1980). Alignments in two-dimensional random sets of points. Adv. in Appl. Probab. 12 380–424. · Zbl 0425.60009 |

[26] | Miller, R. G., Jr. (1966). Simultaneous Statistical Inference. McGraw–Hill, New York. · Zbl 0192.25702 |

[27] | Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics. Wiley, New York. · Zbl 1170.62365 |

[28] | Simoncelli, E. P. (1999). Modeling the joint statistics of images in the wavelet domain. In Proc. SPIE 3813 188–195. SPIE—The International Society for Optical Engineering, Bellingham, WA. |

[29] | Subbotin, M. T. (1923). On the law of frequency of errors. Mat. Sb. 31 296–301. · JFM 49.0370.01 |

[30] | Tukey, J. W. (1965). Which part of the sample contains the information? Proc. Natl. Acad. Sci. U.S.A. 53 127–134. · Zbl 0168.40205 |

[31] | Tukey, J. W. (1976). T13 N: The higher criticism. Course Notes, Statistics 411, Princeton Univ. |

[32] | Tukey, J. W. (1989). Higher criticism for individual significances in several tables or parts of tables. Working Paper, Princeton Univ. |

[33] | Tukey, J. W. (1953). The problem of multiple comparisons. In The Collected Works of John W. Tukey VIII. Multiple Comparisons: 1948–1983 (H. I. Braun, ed.) 1–300. Chapman and Hall, New York. |

[34] | Wellner, J. A. (1978). Limit theorems for the ratio of the empirical distribution function to the true distribution function. Z. Wahrsch. Verw. Gebiete 45 73–88. · Zbl 0382.60031 |

[35] | Wellner, J. A. and Koltchinskii, V. (2004). A note on the asymptotic distribution of Berk–Jones type statistics under the null hypothesis. In High Dimensional Probability III (T. Hoffmann-Jørgensen, M. B. Marcus and J. A. Wellner, eds.) 321–332. Birkhäuser, Basel. · Zbl 1042.62009 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.