
Spatial CART classification trees. (English) Zbl 1505.62049

Summary: We propose an extension of CART to bivariate marked point processes that segments space into areas that are homogeneous with respect to the interaction between marks. Whereas the usual CART tree considers the marginal distribution of the response variable at each node, the proposed algorithm, SpatCART, takes the spatial location of the observations into account in the splitting criterion. We introduce a dissimilarity index based on Ripley’s intertype \(K\)-function, which quantifies the interaction between two populations. This index, used in the growing step of the CART strategy, leads to a heterogeneity function consistent with the original CART algorithm. The new variant is therefore a way to explore spatial data as a bivariate marked point process using binary classification trees. The proposed procedure is implemented in an R package and illustrated on simulated examples. Finally, SpatCART is applied to a tropical forest example.
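As a rough illustration of the quantity the summary builds on, the sketch below computes a naive empirical intertype \(K\)-function for two mark populations in a window of known area. It is not the SpatCART dissimilarity index and not the API of the R package mentioned above: the function names `cross_k` and `independence_baseline`, the absence of any edge correction, and the plain-Python data layout are all assumptions made for this example.

```python
import math


def cross_k(points_a, points_b, r, area):
    """Naive intertype K estimate at radius r (hypothetical helper).

    Counts pairs (a, b) with a in points_a and b in points_b lying within
    distance r of each other, then scales by area / (n_a * n_b).
    No edge correction is applied, so values are biased low near the
    window boundary.
    """
    if not points_a or not points_b:
        raise ValueError("both mark populations must be non-empty")
    close_pairs = sum(
        1
        for (xa, ya) in points_a
        for (xb, yb) in points_b
        if math.hypot(xa - xb, ya - yb) <= r
    )
    return area * close_pairs / (len(points_a) * len(points_b))


def independence_baseline(r):
    """Value of the intertype K-function when the two populations are independent."""
    return math.pi * r ** 2


# Toy data: two points of each mark in a 4 x 4 window (area 16).
marks_a = [(0.0, 0.0), (1.0, 0.0)]
marks_b = [(0.0, 0.5), (3.0, 3.0)]
k_hat = cross_k(marks_a, marks_b, r=1.0, area=16.0)
```

Comparing `k_hat` with `independence_baseline(r)` gives a crude reading of the interaction at scale `r`: values above the baseline suggest attraction between the two marks, values below it suggest repulsion. A splitting criterion in the spirit of the summary would compare such interaction measures between the two candidate child regions, but the exact form of that index is defined in the paper, not here.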

MSC:

62-08 Computational methods for problems pertaining to statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
60G55 Point processes (e.g., Poisson, Cox, Hawkes processes)
62M30 Inference from spatial processes
