Tree models for difference and change detection in a complex environment. (English) Zbl 1254.62068

Summary: A new family of tree models is proposed, which we call “differential trees.” A differential tree model is constructed from multiple data sets and aims to detect distributional differences between them. The new methodology differs from the existing difference and change detection techniques in its nonparametric nature, model construction from multiple data sets, and applicability to high-dimensional data. Through a detailed study of an arson case in New Zealand, where an individual is known to have been laying vegetation fires within a certain time period, we illustrate how these models can help detect changes in the frequencies of event occurrences and uncover unusual clusters of events in a complex environment.
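The summary does not say how a differential tree is grown. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below pools two data sets, labels each point by its source, and recursively chooses the axis-aligned split that best separates the two sources. Leaves whose source counts are strongly unbalanced mark regions where the two distributions appear to differ. The Gini criterion, the function names `gini`, `best_split` and `grow`, and the `min_split` parameter are all assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def gini(n_a: int, n_b: int) -> float:
    """Gini impurity of a node holding n_a points from A and n_b points from B."""
    n = n_a + n_b
    if n == 0:
        return 0.0
    p = n_a / n
    return 2.0 * p * (1.0 - p)


def best_split(a: List[Point], b: List[Point]) -> Optional[Tuple[int, float, float]]:
    """Return (feature, threshold, gain) for the split that best separates A from B.

    Returns None when the node is empty or no feature has two distinct values.
    """
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    if n == 0:
        return None
    parent = gini(n_a, n_b)
    dims = len((a or b)[0])
    best = None
    for j in range(dims):
        values = sorted({p[j] for p in a} | {p[j] for p in b})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            la = sum(1 for p in a if p[j] <= t)
            lb = sum(1 for p in b if p[j] <= t)
            ra, rb = n_a - la, n_b - lb
            child = ((la + lb) * gini(la, lb) + (ra + rb) * gini(ra, rb)) / n
            gain = parent - child
            if best is None or gain > best[2]:
                best = (j, t, gain)
    return best


def grow(a: List[Point], b: List[Point], depth: int = 2, min_split: int = 4):
    """Partition the pooled data; return leaves as (conditions, count_a, count_b)."""

    def rec(a, b, depth, conds):
        split = None
        if depth > 0 and len(a) + len(b) >= min_split:
            split = best_split(a, b)
        if split is None or split[2] <= 0.0:
            return [(conds, len(a), len(b))]
        j, t, _ = split
        left = rec([p for p in a if p[j] <= t], [p for p in b if p[j] <= t],
                   depth - 1, conds + [f"x{j} <= {t:g}"])
        right = rec([p for p in a if p[j] > t], [p for p in b if p[j] > t],
                    depth - 1, conds + [f"x{j} > {t:g}"])
        return left + right

    return rec(list(a), list(b), depth, [])


if __name__ == "__main__":
    # Hypothetical data: B repeats A's pattern plus an extra cluster near x0 = 7.
    baseline = [(float(i), float(i % 2)) for i in range(10)]
    cluster = [(7.0 + 0.1 * k, 0.0) for k in range(1, 7)]
    for conds, n_a, n_b in grow(baseline, baseline + cluster, depth=2):
        print(" and ".join(conds) or "(root)", "| A:", n_a, "| B:", n_b)
```

In practice one would also need a stopping rule and a significance assessment for each leaf; neither is attempted here.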


62G99 Nonparametric inference
62P99 Applications of statistics
65C60 Computational problems in statistics (MSC2010)
62L99 Sequential statistical methods


Software: R; rpart; AdaBoost.MH

