Revisiting Guerry’s data: introducing spatial constraints in multivariate analysis. (English) Zbl 1234.62092

Summary: Standard multivariate analysis methods aim to identify and summarize the main structures in large data sets containing the description of a number of observations by several variables. In many cases, spatial information is also available for each observation, so that a map can be associated with the multivariate data set. Two main objectives are relevant in the analysis of spatial multivariate data: summarizing covariation structures and identifying spatial patterns. In practice, achieving both goals simultaneously is a statistical challenge, and a range of methods have been developed that offer trade-offs between these two objectives. In an applied context, this methodological question has been and remains a major issue in community ecology, where species assemblages (i.e., covariation between species abundances) are often driven by spatial processes (and thus exhibit spatial patterns).
We review a variety of methods developed in community ecology to investigate multivariate spatial patterns. We present different ways of incorporating spatial constraints in multivariate analysis and illustrate these different approaches using the famous data set on moral statistics in France published by André-Michel Guerry in 1833. We discuss and compare the properties of these different approaches both from a practical and theoretical viewpoint.

MSC:

62H25 Factor analysis and principal components; correspondence analysis
62P12 Applications of statistics to environmental and related topics
62H11 Directional data; spatial statistics
62A09 Graphical methods in statistics

Software:

sedaR; ade4; Guerry

References:

[1] Anselin, L. (1995). Local indicators of spatial association. Geographical Analysis 27 93-115.
[2] Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. In Spatial Analytical Perspectives on GIS (M. M. Fischer, H. J. Scholten and D. Unwin, eds.) 111-125. Taylor and Francis, London.
[3] Anselin, L., Syabri, I. and Smirnov, O. (2002). Visualizing multivariate spatial correlation with dynamically linked windows. In New Tools for Spatial Data Analysis: Proceedings of a Workshop (L. Anselin and S. Rey, eds.). CSISS, Santa Barbara, CA.
[4] Benali, H. and Escofier, B. (1990). Analyse factorielle lissée et analyse factorielle des différences locales. Rev. Statist. Appl. 38 55-76.
[5] Bivand, R. (2008). Implementing representations of space in economic geography. Journal of Regional Science 48 1-27.
[6] Blanchet, F. G., Legendre, P. and Borcard, D. (2008). Forward selection of explanatory variables. Ecology 89 2623-2632.
[7] Borcard, D., Legendre, P. and Drapeau, P. (1992). Partialling out the spatial component of ecological variation. Ecology 73 1045-1055. · Zbl 1286.92004
[8] Chatterjee, S. and Hadi, A. S. (1986). Influential observations, high leverage points, and outliers in linear regression. Statist. Sci. 1 379-393. · Zbl 0633.62059
[9] Cliff, A. D. and Ord, J. K. (1973). Spatial Autocorrelation. Pion, London.
[10] de Jong, P., Sprenger, C. and van Veen, F. (1984). On extreme values of Moran’s I and Geary’s c. Geographical Analysis 16 17-24.
[11] Dolédec, S. and Chessel, D. (1987). Rythmes saisonniers et composantes stationnelles en milieu aquatique I-Description d’un plan d’observations complet par projection de variables. Acta Oecologica-Oecologia Generalis 8 403-426.
[12] Dolédec, S. and Chessel, D. (1994). Co-inertia analysis: An alternative method for studying species-environment relationships. Freshwater Biology 31 277-294.
[13] Dray, S. and Jombart, T. (2010). Supplement to “Revisiting Guerry’s data: Introducing spatial constraints in multivariate analysis.”
[14] Dray, S., Chessel, D. and Thioulouse, J. (2003a). Co-inertia analysis and the linking of ecological data tables. Ecology 84 3078-3089.
[15] Dray, S., Chessel, D. and Thioulouse, J. (2003b). Procrustean co-inertia analysis for the linking of multivariate data sets. Ecoscience 10 110-119.
[16] Dray, S. and Dufour, A. B. (2007). The ade4 package: Implementing the duality diagram for ecologists. J. Statist. Soft. 22 1-20.
[17] Dray, S., Legendre, P. and Peres-Neto, P. R. (2006). Spatial modeling: A comprehensive framework for principal coordinate analysis of neighbor matrices (PCNM). Ecological Modelling 196 483-493.
[18] Dray, S., Pettorelli, N. and Chessel, D. (2003). Multivariate analysis of incomplete mapped data. Transactions in GIS 7 411-422.
[19] Dray, S., Saïd, S. and Débias, F. (2008). Spatial ordination of vegetation data using a generalization of Wartenberg’s multivariate spatial correlation. Journal of Vegetation Science 19 45-56.
[20] Dykes, J. and Brunsdon, C. (2007). Geographically weighted visualization: Interactive graphics for scale-varying exploratory analysis. IEEE Transactions on Visualization and Computer Graphics 13 1161-1168.
[21] Escoufier, Y. (1987). The duality diagram: A means of better practical applications. In Developments in Numerical Ecology (P. Legendre and L. Legendre, eds.) 14 139-156. Springer, Berlin.
[22] Fall, A., Fortin, M. J., Manseau, M. and O’Brien, D. (2007). Spatial graphs: Principles and applications for habitat connectivity. Ecosystems 10 448-461.
[23] Friendly, M. (2007). A.-M. Guerry’s moral statistics of France: Challenges for multivariable spatial analysis. Statist. Sci. 22 368-399. · Zbl 1246.91004
[24] Gabriel, K. R. (1971). The biplot graphic display of matrices with application to principal component analysis. Biometrika 58 453-467. · Zbl 0228.62034
[25] Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician 5 115-145.
[26] Getis, A. and Aldstadt, J. (2004). Constructing the spatial weights matrix using a local statistic. Geographical Analysis 36 90-104.
[27] Getis, A. and Griffith, D. A. (2002). Comparative spatial filtering in regression analysis. Geographical Analysis 34 130-140.
[28] Goodall, D. W. (1954). Objective methods for the classification of vegetation III. An essay on the use of factor analysis. Australian Journal of Botany 2 304-324.
[29] Greenacre, M. J. (1984). Theory and Applications of Correspondence Analysis. Academic Press, London. · Zbl 0555.62005
[30] Griffith, D. A. (1996). Spatial autocorrelation and eigenfunctions of the geographic weights matrix accompanying geo-referenced data. Canadian Geographer 40 351-367.
[31] Griffith, D. A. (2000). A linear regression solution to the spatial autocorrelation problem. Journal of Geographical Systems 2 141-156.
[32] Griffith, D. A. (2002). A spatial filtering specification for the auto-Poisson model. Statist. Probab. Lett. 58 245-251. · Zbl 1045.62050
[33] Griffith, D. A. (2003). Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization. Springer, Berlin.
[34] Griffith, D. A. (2004). A spatial filtering specification for the autologistic model. Environment and Planning A 36 1791-1811.
[35] Guerry, A. M. (1833). Essai sur la Statistique Morale de la France. Crochard, Paris.
[36] Haining, R. (1990). Spatial Data Analysis in the Social and Environmental Sciences. Cambridge Univ. Press.
[37] Holmes, S. (2006). Multivariate analysis: The French way. In Festschrift for David Freedman (D. Nolan and T. Speed, eds.). IMS, Beachwood, OH. · Zbl 1166.62310
[38] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24 417-441. · JFM 59.1182.04
[39] Jaromczyk, J. W. and Toussaint, G. T. (1992). Relative neighborhood graphs and their relatives. Proceedings of the IEEE 80 1502-1517.
[40] Jombart, T., Dray, S. and Dufour, A. B. (2009). Finding essential scales of spatial variation in ecological data: A multivariate approach. Ecography 32 161-168.
[41] Le Foll, Y. (1982). Pondération des distances en analyse factorielle. Statistique et Analyse des données 7 13-31. · Zbl 0511.62060
[42] Lebart, L. (1969). Analyse statistique de la contiguïté. Publication de l’Institut de Statistiques de l’Université de Paris 28 81-112. · Zbl 0223.62129
[43] Legendre, P. (1993). Spatial autocorrelation: Trouble or new paradigm? Ecology 74 1659-1673.
[44] Legendre, P. and Legendre, L. (1998). Numerical Ecology, 2nd ed. Elsevier, Amsterdam. · Zbl 1033.92036
[45] Moran, P. A. P. (1948). The interpretation of statistical maps. J. Roy. Statist. Soc. Ser. B 10 243-251. · Zbl 0035.08304
[46] Méot, A., Chessel, D. and Sabatier, R. (1993). Opérateurs de voisinage et analyse des données spatio-temporelles. In Biométrie et environnement (J. D. Lebreton and B. Asselain, eds.) 45-72. Masson, Paris.
[47] Norcliffe, G. B. (1969). On the use and limitations of trend surface models. Canadian Geographer 13 338-348.
[48] Peres-Neto, P. R. and Jackson, D. A. (2001). How well do multivariate data sets match? The advantages of a Procrustean superimposition approach over the Mantel test. Oecologia 129 169-178.
[49] Rao, C. R. (1964). The use and interpretation of principal component analysis in applied research. Sankhyā Ser. A 26 329-359. · Zbl 0137.37207
[50] Student (1914). The elimination of spurious correlation due to position in time or space. Biometrika 10 179-180.
[51] ter Braak, C. J. F. (1986). Canonical correspondence analysis: A new eigenvector technique for multivariate direct gradient analysis. Ecology 67 1167-1179.
[52] Tiefelsdorf, M., Griffith, D. A. and Boots, B. (1999). A variance-stabilizing coding scheme for spatial link matrices. Environment and Planning A 31 165-180.
[53] Tiefelsdorf, M. and Griffith, D. A. (2007). Semi-parametric filtering of spatial autocorrelation: The eigenvector approach. Environment and Planning A 39 1193-1221.
[54] Torre, F. and Chessel, D. (1995). Co-structure de deux tableaux totalement appariés. Revue de Statistique Appliquée 43 109-121.
[55] Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley, Reading, MA. · Zbl 0409.62003
[56] van den Wollenberg, A. L. (1977). Redundancy analysis, an alternative for canonical analysis. Psychometrika 42 207-219. · Zbl 0354.92050
[57] Wartenberg, D. (1985). Multivariate spatial correlation: A method for exploratory geographical analysis. Geographical Analysis 17 263-283.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.