Detection of spatial clustering with average likelihood ratio test statistics. (English) Zbl 1184.62067
Summary: Generalized likelihood ratio (GLR) test statistics are often used to detect spatial clustering in case-control and case-population data sets, by checking for a significantly large proportion of cases within some scanning window. The traditional spatial scan test statistic takes the supremum of the GLR values over all windows, whereas the average likelihood ratio (ALR) test statistic that we consider here takes an average of the GLR values. Numerical experiments in the literature and in this paper show that the ALR test statistic has more power than the spatial scan statistic. We develop accurate tail probability approximations of the ALR test statistic that allow us to bypass computer-intensive Monte Carlo procedures for estimating \(p\)-values. In models that adjust for covariates, these Monte Carlo evaluations require an initial fitting of parameters that can result in very biased \(p\)-value estimates.
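
The contrast between the two statistics, and the Monte Carlo \(p\)-value procedure that the tail approximations are meant to replace, can be illustrated with a minimal sketch. It is not the authors' method. It assumes a one-dimensional strip of cells scanned by contiguous windows, a Kulldorff-type Poisson likelihood ratio for case-population data, and equal weights in the average; the summary does not specify the weighting, and all function names and data here are hypothetical.

```python
import math
import random


def window_llr(c, e, total):
    """Log GLR for one window (assumed Kulldorff-type Poisson form).

    c: cases inside the window, e: expected cases inside the window
    under the null, total: total number of cases. Only an excess of
    cases (c > e) counts as evidence of clustering.
    """
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def all_window_llrs(cases, pops, max_width):
    """Log GLR values for every contiguous window of up to max_width cells."""
    total, pop_total = sum(cases), sum(pops)
    llrs = []
    for start in range(len(cases)):
        c = n = 0
        for end in range(start, min(start + max_width, len(cases))):
            c += cases[end]
            n += pops[end]
            llrs.append(window_llr(c, total * n / pop_total, total))
    return llrs


def scan_statistic(llrs):
    """Spatial scan statistic: supremum of the (log) GLR over all windows."""
    return max(llrs)


def alr_statistic(llrs):
    """Log of the equally weighted average of the GLR values (log-sum-exp)."""
    m = max(llrs)
    return m + math.log(sum(math.exp(x - m) for x in llrs) / len(llrs))


def monte_carlo_pvalue(cases, pops, max_width, statistic, n_sim=199, seed=0):
    """Monte Carlo p-value: redistribute the cases over the cells in
    proportion to population (the null), then rank the observed statistic."""
    rng = random.Random(seed)
    observed = statistic(all_window_llrs(cases, pops, max_width))
    cells = list(range(len(cases)))
    exceed = 0
    for _ in range(n_sim):
        sim = [0] * len(cases)
        for i in rng.choices(cells, weights=pops, k=sum(cases)):
            sim[i] += 1
        if statistic(all_window_llrs(sim, pops, max_width)) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_sim)


# Hypothetical data: ten equally populated cells with an excess in cells 4-5.
pops = [100] * 10
cases = [2, 1, 3, 2, 9, 11, 2, 1, 2, 3]
llrs = all_window_llrs(cases, pops, max_width=3)
print("scan statistic (log scale):", scan_statistic(llrs))
print("ALR statistic (log scale): ", alr_statistic(llrs))
print("Monte Carlo p, scan:", monte_carlo_pvalue(cases, pops, 3, scan_statistic))
print("Monte Carlo p, ALR: ", monte_carlo_pvalue(cases, pops, 3, alr_statistic))
```

Each Monte Carlo \(p\)-value requires recomputing every window's GLR for every simulated data set, which is the computational cost that analytic tail approximations avoid. The sketch has no covariates, so it does not exhibit the bias from initial parameter fitting mentioned in the summary.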

62G10 Nonparametric hypothesis testing
65C05 Monte Carlo methods
60F10 Large deviations
65C60 Computational problems in statistics (MSC2010)