zbMATH — the first resource for mathematics

Detection of an anomalous cluster in a network. (English) Zbl 1209.62097
Summary: We consider the problem of detecting whether or not, in a given sensor network, there is a cluster of sensors which exhibit an “unusual behavior”. Formally, suppose we are given a set of nodes and attach a random variable to each node. We observe a realization of this process and want to decide between the following two hypotheses: under the null, the variables are i.i.d. standard normal; under the alternative, there is a cluster of variables that are i.i.d. normal with positive mean and unit variance, while the rest are i.i.d. standard normal. We also address surveillance settings where each sensor in the network collects information over time. The resulting model is similar, now with a time series attached to each node. We again observe the process over time and want to decide between the null, where all the variables are i.i.d. standard normal, and the alternative, where there is an emerging cluster of i.i.d. normal variables with positive mean and unit variance. The growth models used to represent the emerging cluster are quite general and, in particular, include cellular automata used in modeling epidemics. In both settings, we consider classes of clusters that are quite general, for which we obtain a lower bound on their respective minimax detection rate and show that some form of scan statistic, by far the most popular method in practice, achieves that same rate to within a logarithmic factor. Our results are not limited to the normal location model, but generalize to any one-parameter exponential family when the anomalous clusters are large enough.

62G10 Nonparametric hypothesis testing
62C20 Minimax procedures in statistical decision theory
05C90 Applications of graph theory
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
82B20 Lattice systems (Ising, dimer, Potts, etc.) and systems on graphs arising in equilibrium statistical mechanics
Full Text: DOI arXiv
[1] Addario-Berry, L., Broutin, N., Devroye, L. and Lugosi, G. (2010). On combinatorial testing problems. Ann. Statist. 38 3063-3092. · Zbl 1200.62059
[2] Agur, S., Diekmann, O., Heesterbeek, H., Cushing, J., Gyllenberg, M., Kimmel, M., Milner, F., Jagers, P. and Kostova, T., eds. (1999). Epidemiology, Cellular Automata and Evolution . Elsevier, Oxford. · Zbl 0932.00079
[3] Akyildiz, I., Su, W., Sankarasubramaniam, Y. and Cayirci, E. (2002). A survey on sensor networks. IEEE Communications Magazine 40 102-114.
[4] Aldosari, S. and Moura, J. (2004). Detection in decentralized sensor networks. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 2004 (ICASSP’04) 2 277-280.
[5] Arias-Castro, E., Candès, E. J. and Durand, A. (2009). Detection of an abnormal cluster in a network. In Proc. 57th Session of the International Statistical Institute, Durban, South Africa . · Zbl 1209.62097
[6] Arias-Castro, E., Candès, E. J. and Durand, A. (2010). Supplement to “Detection of an anomalous cluster in a network.” DOI: . · Zbl 1209.62097
[7] Arias-Castro, E., Candès, E. J., Helgason, H. and Zeitouni, O. (2008). Searching for a trail of evidence in a maze. Ann. Statist. 36 1726-1757. · Zbl 1143.62006
[8] Arias-Castro, E., Donoho, D. and Huo, X. (2005). Near-optimal detection of geometric objects by fast multiscale methods. IEEE Trans. Inform. Theory 51 2402-2425. · Zbl 1282.94014
[9] Arias-Castro, E., Donoho, D. and Huo, X. (2006). Adaptive multiscale detection of filamentary structures in a background of uniform random points. Ann. Statist. 34 326-349. · Zbl 1091.62095
[10] Arias-Castro, E., Efros, B. and Levi, O. (2010). Networks of polynomial pieces with application to the analysis of point clouds and images. J. Approx. Theory 162 94-130. · Zbl 1190.42015
[11] Arora, A., Dutta, P., Bapat, S., Kulathumani, V., Zhang, H., Naik, V., Mittal, V., Cao, H., Demirbas, M., Gouda, M., Choi, Y., Herman, T., Kulkarni, S., Arumugam, U., Nesterenko, M., Vora, A. and Miyashita, M. (2004). A line in the sand: A wireless sensor network for target detection, classification and tracking. Comput. Networks 46 605-634.
[12] Bohman, T. and Gravner, J. (1999). Random threshold growth dynamics. Random Structures Algorithms 15 93-111. · Zbl 0932.60086
[13] Boutsikas, M. V. and Koutras, M. V. (2006). On the asymptotic distribution of the discrete scan statistic. J. Appl. Probab. 43 1137-1154. · Zbl 1130.62017
[14] Braams, J., Pruim, J., Freling, N., Nikkels, P., Roodenburg, J., Boering, G., Vaalburg, W. and Vermey, A. (1995). Detection of lymph node metastases of squamous-cell cancer of the head and neck with FDG-PET and MRI. Journal of Nuclear Medicine 36 211.
[15] Brennan, S. M., Mielke, A. M., Torney, D. C. and Maccabe, A. B. (2004). Radiation detection with distributed sensor networks. IEEE Computer 37 57-59.
[16] Brodsky, B. and Darkhovsky, B. (1993). Nonparametric Methods in Change-Point Problems. Mathematics and Its Applications 243 . Kluwer Academic, Dordrecht. · Zbl 1274.62512
[17] Candès, E. J., Charlton, P. R. and Helgason, H. (2008). Detecting highly oscillatory signals by chirplet path pursuit. Appl. Comput. Harmon. Anal. 24 14-40. · Zbl 1144.94003
[18] Caron, Y., Makris, P. and Vincent, N. (2002). A method for detecting artificial objects in natural environments. In Proceedings 16th International Conference on Pattern Recognition 1 10600. IEEE Comput. Soc., New York.
[19] Cui, Y., Wei, Q., Park, H. and Lieber, C. (2001). Nanowire nanosensors for highly sensitive and selective detection of biological and chemical species. Science 293 1289-1292.
[20] Culler, D., Estrin, D. and Srivastava, M. (2004). Overview of sensor networks. Computer 37 41-49.
[21] DasGupta, B., Hespanha, J. P., Riehl, J. and Sontag, E. (2006). Honey-pot constrained searching with local sensory information. Nonlinear Anal. 65 1773-1793. · Zbl 1138.90402
[22] Dembo, A., Gandolfi, A. and Kesten, H. (2001). Greedy lattice animals: Negative values and unconstrained maxima. Ann. Probab. 29 205-241. · Zbl 1016.60048
[23] Demirbaş, K. (1987). Maneuvering target tracking with hypothesis testing. IEEE Trans. Aerospace Electron. Systems 23 757-766.
[24] Desolneux, A., Moisan, L. and Morel, J.-M. (2003). Maximal meaningful events and applications to image analysis. Ann. Statist. 31 1822-1851. · Zbl 1046.62104
[25] Duczmal, L., Kulldorff, M. and Huang, L. (2006). Evaluation of spatial scan statistics for irregularly shaped clusters. J. Comput. Graph. Statist. 15 428-442.
[26] Dudley, R. M. (1967). The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. J. Funct. Anal. 1 290-330. · Zbl 0188.20502
[27] Fitch, J., Raber, E. and Imbro, D. (2003). Technology challenges in responding to biological or chemical attacks in the civilian sector. Science 302 1350-1354.
[28] Geelhood, B., Ely, J., Hansen, R., Kouzes, R., Schweppe, J. and Warner, R. (2003). Overview of portal monitoring at border crossings. 2003 IEEE Nuclear Science Symposium Conference Record 1 513-517.
[29] Geman, D. and Jedynak, B. (1996). An active testing model for tracking roads in satellite images. IEEE Trans. Pattern Anal. Mach. Intell. 18 1-14.
[30] Glaz, J., Naus, J. and Wallenstein, S. (2001). Scan Statistics . Springer, New York. · Zbl 0983.62075
[31] Gravner, J. and Griffeath, D. (2006). Random growth models with polygonal shapes. Ann. Probab. 34 181-218. · Zbl 1090.60077
[32] Hall, P. and Jin, J. (2008). Properties of higher criticism under strong dependence. Ann. Statist. 36 381-402. · Zbl 1139.62049
[33] Hall, P. and Jin, J. (2010). Innovated higher criticism for detecting sparse signals in correlated noise. Ann. Statist. 38 1686-1732. · Zbl 1189.62080
[34] Heffernan, R., Mostashari, F., Das, D., Karpati, A., Kulldorff, M. and Weiss, D. (2004). Syndromic surveillance in public health practice, New York City. Emerging Infectious Diseases 10 858-864.
[35] Hills, R. (2001). Sensing for danger. Sci. Technol. Rev. Available at .
[36] Husby, O. and Rue, H. (2004). Estimating blood vessel areas in ultrasound images using a deformable template model. Stat. Model. 4 211-226. · Zbl 1117.62493
[37] Ilachinski, A. (2001). Cellular Automata: A Discrete Universe . World Scientific, River Edge, NJ. · Zbl 1031.68081
[38] Jain, A., Zhong, Y. and Dubuisson-Jolly, M. (1998). Deformable template models: A review. Signal Processing 71 109-129. · Zbl 1053.68771
[39] James, D., Clymer, B. D. and Schmalbrock, P. (2001). Texture detection of simulated microcalcification susceptibility effects in magnetic resonance imaging of breasts. Journal of Magnetic Resonance Imaging 13 876-881.
[40] Jiang, T. (2002). Maxima of partial sums indexed by geometrical structures. Ann. Probab. 30 1854-1892. · Zbl 1014.60024
[41] Klarner, D. (1967). Cell growth problems. Canad. J. Math. 19 851-863. · Zbl 0178.00904
[42] Kolmogorov, A. N. and Tihomirov, V. M. (1961). \varepsilon -entropy and \varepsilon -capacity of sets in functional space. Amer. Math. Soc. Transl. (2) 17 277-364. · Zbl 0133.06703
[43] Kulldorff, M. (1997). A spatial scan statistic. Comm. Statist. Theory Methods 26 1481-1496. · Zbl 0920.62116
[44] Kulldorff, M. (2001). Prospective time periodic geographical disease surveillance using a scan statistic. J. Roy. Statist. Soc. Ser. A 164 61-72. JSTOR: · Zbl 1002.62517
[45] Kulldorff, M., Fang, Z. and Walsh, S. J. (2003). A tree-based scan statistic for database disease surveillance. Biometrics 59 323-331. JSTOR: · Zbl 1210.62177
[46] Kulldorff, M., Heffernan, R., Hartman, J., Assuncao, R. and Mostashari, F. (2005). A space-time permutation scan statistic for disease outbreak detection. PLOS Medicine 2 216.
[47] Kulldorff, M., Huang, L., Pickle, L. and Duczmal, L. (2006). An elliptic spatial scan statistic. Stat. Med. 25 3929-43.
[48] Lawler, G., Bramson, M. and Griffeath, D. (1992). Internal diffusion limited aggregation. Ann. Probab. 20 2117-2140. · Zbl 0762.60096
[49] Lexa, M., Rozell, C., Sinanovic, S. and Johnson, D. (2004). To cooperate or not to cooperate: Detection strategies in sensor networks. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 2004 (ICASSP’04) 3 841-844.
[50] Li, D., Wong, K., Hu, Y. H. and Sayeed, A. (2002). Detection, classification and tracking of targets. IEEE Signal Processing Magazine 19 17-29.
[51] McInerney, T. and Terzopoulos, D. (1996). Deformable models in medical image analysis: A survey. Medical Image Analysis 1 91-108.
[52] Mei, Y. (2008). Asymptotic optimality theory for decentralized sequential hypothesis testing in sensor networks. IEEE Trans. Inform. Theory 54 2072-2089. · Zbl 1332.94026
[53] Moon, N., Bullitt, E., van Leemput, K. and Gerig, G. (2002). Automatic brain and tumor segmentation. In MICCAI’02: Proceedings of the 5th International Conference on Medical Image Computing and Computer-Assisted Intervention-Part I 372-379. Springer, London. · Zbl 1028.68847
[54] Patil, G. P., Balbus, J., Biging, G., Jaja, J., Myers, W. L. and Taillie, C. (2004). Multiscale advanced raster map analysis system: Definition, design and development. Environ. Ecol. Stat. 11 113-138.
[55] Patwari, N. and Hero, A. (2003). Hierarchical censoring for distributed detection in wireless sensor networks. In Proceedings of 2003 IEEE International Conference on Acoustics, Speech and Signal Processing 2003 (ICASSP’03) 4 848-851.
[56] Penrose, M. (2003). Random Geometric Graphs. Oxford Studies in Probability 5 . Oxford Univ. Press, Oxford. · Zbl 1029.60007
[57] Perone Pacifico, M., Genovese, C., Verdinelli, I. and Wasserman, L. (2004). False discovery control for random fields. J. Amer. Statist. Assoc. 99 1002-1014. · Zbl 1055.62105
[58] Pozo, D., Olmo, F. and Alados-Arboledas, L. (1997). Fire detection and growth monitoring using a multitemporal technique on AVHRR mid-infrared and thermal channels. Remote Sensing of Environment 60 111-120.
[59] Richardson, D. (1973). Random growth in a tessellation. Proc. Cambridge Philos. Soc. 74 515-528. · Zbl 0295.62094
[60] Rotz, L. and Hughes, J. (2004). Advances in detecting and responding to threats from bioterrorism and emerging infectious disease. Nature Medicine S130-S136.
[61] Schiff, J. L. (2008). Cellular Automata . Wiley, Hoboken, NJ. · Zbl 1142.68052
[62] Şendur, L., Maxim, V., Whitcher, B. and Bullmore, E. (2005). Multiple hypothesis mapping of functional MRI data in orthogonal and complex wavelet domains. IEEE Trans. Signal Process. 53 3413-3426. · Zbl 1370.94058
[63] Shen, X., Huang, H.-C. and Cressie, N. (2002). Nonparametric hypothesis testing for a spatial signal. J. Amer. Statist. Assoc. 97 1122-1140. JSTOR: · Zbl 1041.62038
[64] Siegmund, D. (1985). Sequential Analysis: Tests and Confidence Intervals . Springer, New York. · Zbl 0573.62071
[65] Strickland, R. and Hahn, H. Wavelet transform methods for object detection and recovery. IEEE Trans. Image Process. 6 724-735.
[66] Szor, P. (2005). The Art of Computer Virus Research and Defense . Addison-Wesley Professional.
[67] Talagrand, M. (2005). The Generic Chaining . Springer, Berlin. · Zbl 1075.60001
[68] Tan, H. and Zhang, Y. (2006). An energy minimization process for extracting eye feature based on deformable template. Lecture Notes in Computer Science 3852 663. Springer, Berlin.
[69] Thomopoulos, S., Viswanathan, R. and Bougoulias, D. (1989). Optimal distributed decision fusion. IEEE Transactions on Aerospace and Electronic Systems 25 761-765.
[70] Wagner, M., Tsui, F., Espino, J., Dato, V., Sittig, D., Caruana, R., Mcginnis, L., Deerfield, D., Druzdzel, M. and Fridsma, D. (2001). The emerging science of very early detection of disease outbreaks. Journal of Public Health Management and Practice 7 51-59.
[71] Walther, G. (2010). Optimal and fast detection of spatial clusters with scan statistics. Ann. Statist. 38 1010-1033. · Zbl 1183.62076
[72] Xu, C. and Prince, J. (1998). Snakes, shapes and gradient vector flow. IEEE Trans. Image Process. 7 359-369. · Zbl 0973.94003
[73] Yu, L., Yuan, L., Qu, G. and Ephremides, A. (2006). Energy-driven detection scheme with guaranteed accuracy. Processing of the Fifth International Conference on Information in Sensor Networks 2006 (IPSN’2006) 284-291.
[74] Zhao, F. and Guibas, L. (2004). Wireless Sensor Networks: An Information Processing Approach . Morgan Kaufmann, San Francisco.
[75] Zhong, Y., Jain, A. and Dubuisson-Jolly, M.-P. (2000). Object tracking using deformable templates. IEEE Trans. Pattern Anal. Mach. Intell. 2 544-549.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.