
Elliptical insights: understanding statistical methods through elliptical geometry. (English) Zbl 1332.62015

Summary: Visual insights into a wide variety of statistical methods, for both didactic and data analytic purposes, can often be achieved through geometric diagrams and geometrically based statistical graphs. This paper extols and illustrates the virtues of the ellipse and her higher-dimensional cousins for both these purposes in a variety of contexts, including linear models, multivariate linear models and mixed-effect models. We emphasize the strong relationships among statistical methods, matrix-algebraic solutions and geometry that can often be easily understood in terms of ellipses.

MSC:

62A09 Graphical methods in statistics
62F15 Bayesian inference

Software:

mvmeta; SAS; Guerry; carData; car

References:

[1] Alker, H. R. (1969). A typology of ecological fallacies. In Social Ecology (M. Dogan and S. Rokkan, eds.) 69-86. MIT Press, Cambridge, MA.
[2] Anderson, E. (1935). The irises of the Gaspé peninsula. Bulletin of the American Iris Society 35 2-5.
[3] Antczak-Bouckoms, A., Joshipura, K., Burdick, E. and Tulloch, J. F. (1993). Meta-analysis of surgical versus non-surgical methods of treatment for periodontal disease. J. Clin. Periodontol. 20 259-268.
[4] Beaton, A. E. Jr (1964). The use of special matrix operators in statistical calculus. Ph.D. thesis, Harvard Univ., ProQuest LLC, Ann Arbor, MI.
[5] Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley, New York. · Zbl 0479.62056
[6] Berkey, C. S., Hoaglin, D. C., Antczak-Bouckoms, A., Mosteller, F. and Colditz, G. A. (1998). Meta-analysis of multiple outcomes by regression with random effects. Stat. Med. 17 2537-2550.
[7] Boyer, C. B. (1991). Apollonius of Perga. In A History of Mathematics, 2nd ed. 156-157. Wiley, New York.
[8] Bravais, A. (1846). Analyse mathématique sur les probabilités des erreurs de situation d’un point. Mémoires Présentés Par Divers Savants à L’Académie Royale des Sciences de l’Institut de France 9 255-332.
[9] Bryant, P. (1984). Geometry, statistics, probability: Variations on a common theme. Amer. Statist. 38 38-48.
[10] Bryk, A. S. and Raudenbush, S. W. (1992). Hierarchical Linear Models: Applications and Data Analysis Methods. Sage, Thousand Oaks, CA. · Zbl 1001.62004
[11] Campbell, N. A. and Atchley, W. R. (1981). The geometry of canonical variate analysis. Systematic Zoology 30 268-280.
[12] Cramér, H. (1946). Mathematical Methods of Statistics. Princeton Mathematical Series 9. Princeton Univ. Press, Princeton, NJ. · Zbl 0063.01014
[13] Dempster, A. P. (1969). Elements of Continuous Multivariate Analysis. Addison-Wesley, Reading, MA. · Zbl 0197.44904
[14] Denis, D. (2001). The origins of correlation and regression: Francis Galton or Auguste Bravais and the error theorists. History and Philosophy of Psychology Bulletin 13 36-44.
[15] Diez-Roux, A. V. (1998). Bringing context back into epidemiology: Variables and fallacies in multilevel analysis. Am. J. Public Health 88 216-222.
[16] Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics 8 379-388.
[17] Fox, J. (2008). Applied Regression Analysis and Generalized Linear Models, 2nd ed. Sage, Thousand Oaks, CA.
[18] Fox, J. and Suschnigg, C. (1989). A note on gender and the prestige of occupations. Canadian Journal of Sociology 14 353-360.
[19] Fox, J. and Weisberg, S. (2011). An R Companion to Applied Regression, 2nd ed. Sage, Thousand Oaks, CA.
[20] Friendly, M. (1991). SAS System for Statistical Graphics, 1st ed. SAS Institute, Cary, NC.
[21] Friendly, M. (2007a). A.-M. Guerry’s Moral statistics of France: Challenges for multivariable spatial analysis. Statist. Sci. 22 368-399. · Zbl 1246.91004 · doi:10.1214/07-STS241
[22] Friendly, M. (2007b). HE plots for multivariate linear models. J. Comput. Graph. Statist. 16 421-444. · doi:10.1198/106186007X208407
[23] Friendly, M. (2013). The generalized ridge trace plot: Visualizing bias and precision. J. Comput. Graph. Statist. 22.
[24] Friendly, M., Monette, G. and Fox, J. (2012). Supplement to “Elliptical Insights: Understanding Statistical Methods Through Elliptical Geometry.” · Zbl 1332.62015 · doi:10.1214/12-STS402
[25] Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute 15 246-263.
[26] Galton, F. (1889). Natural Inheritance. Macmillan, London.
[27] Gasparrini, A. (2012). MVMETA: multivariate meta-analysis and meta-regression. R package version 0.2.4.
[28] Guerry, A. M. (1833). Essai sur la statistique morale de la France. Crochard, Paris. [English translation: Hugh P. Whitt and Victor W. Reinking, Edwin Mellen Press, Lewiston, NY (2002).]
[29] Henderson, C. R. (1975). Best linear unbiased estimation and prediction under a selection model. Biometrics 31 423-448. · Zbl 0335.62048 · doi:10.2307/2529430
[30] Hoerl, A. E. and Kennard, R. W. (1970a). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 55-67. · Zbl 0202.17205 · doi:10.2307/1267351
[31] Hoerl, A. E. and Kennard, R. W. (1970b). Ridge regression: Applications to nonorthogonal problems. Technometrics 12 69-82. [Correction: 12 723.] · Zbl 0202.17206 · doi:10.2307/1267352
[32] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24 417-441.
[33] Jackson, D., Riley, R. and White, I. R. (2011). Multivariate meta-analysis: Potential and promise. Stat. Med. 30 2481-2498. · doi:10.1002/sim.4247
[34] Kramer, G. H. (1983). The ecological fallacy revisited: Aggregate-versus individual-level findings on economics and elections, and sociotropic voting. The American Political Science Review 77 92-111.
[35] Laird, N. M. and Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics 38 963-974. · Zbl 0512.62107 · doi:10.2307/2529876
[36] Lichtman, A. J. (1974). Correlation, regression, and the ecological fallacy: A critique. The Journal of Interdisciplinary History 4 417-433.
[37] Longley, J. W. (1967). An appraisal of least squares programs for the electronic computer from the point of view of the user. J. Amer. Statist. Assoc. 62 819-841. · doi:10.1080/01621459.1967.10500896
[38] Marquardt, D. W. (1970). Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation. Technometrics 12 591-612. · Zbl 0205.46102 · doi:10.2307/1267205
[39] Monette, G. (1990). Geometry of multiple regression and interactive 3-D graphics. In Modern Methods of Data Analysis, Chapter 5 (J. Fox and S. Long, eds.) 209-256. Sage, Beverly Hills, CA.
[40] Nam, I.-S., Mengersen, K. and Garthwaite, P. (2003). Multivariate meta-analysis. Stat. Med. 22 2309-2333.
[41] Pearson, K. (1896). Contributions to the mathematical theory of evolution-III, regression, heredity and panmixia. Philosophical Transactions of the Royal Society of London 187 253-318.
[42] Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine 6 559-572.
[43] Pearson, K. (1920). Notes on the history of correlation. Biometrika 13 25-45.
[44] Raudenbush, S. W. and Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods, 2nd ed. Sage, Newbury Park, CA. · Zbl 1137.62037 · doi:10.1111/j.1541-0420.2007.00818.x
[45] Riley, M. W. (1963). Special problems of sociological analysis. In Sociological Research I: A Case Approach (M. W. Riley, ed.) 700-725. Harcourt, Brace, and World, New York.
[46] Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review 15 351-357.
[47] Robinson, G. K. (1991). That BLUP is a good thing: The estimation of random effects. Statist. Sci. 6 15-32. · Zbl 0955.62500 · doi:10.1214/ss/1177011926
[48] Saville, D. and Wood, G. (1991). Statistical Methods: The Geometric Approach. Springer Texts in Statistics. Springer, New York. · Zbl 0747.62005 · doi:10.1007/978-1-4612-0971-3
[49] Simpson, E. H. (1951). The interpretation of interaction in contingency tables. J. Roy. Statist. Soc. Ser. B. 13 238-241. · Zbl 0045.08802
[50] Speed, T. (1991). That BLUP is a good thing: The estimation of random effects: Comment. Statist. Sci. 6 42-44. · Zbl 0955.62500 · doi:10.1214/ss/1177011926
[51] Stigler, S. M. (1986). The History of Statistics: The Measurement of Uncertainty Before 1900. The Belknap Press of Harvard Univ. Press, Cambridge, MA. · Zbl 0656.62005
[52] Timm, N. H. (1975). Multivariate Analysis with Applications in Education and Psychology. Brooks/Cole Publishing Co., Monterey, CA.
[53] Velleman, P. F. and Welsch, R. E. (1981). Efficient computing of regression diagnostics. Amer. Statist. 35 234-242. · Zbl 0475.65099 · doi:10.2307/2683296
[54] von Humboldt, A. (1811). Essai Politique sur le Royaume de la Nouvelle-Espagne. (Political Essay on the Kingdom of New Spain: Founded on Astronomical Observations, and Trigonometrical and Barometrical Measurements), Vol. 1. Riley, New York.
[55] Wickens, T. D. (1995). The Geometry of Multivariate Statistics. Lawrence Erlbaum Associates, Hillsdale, NJ. · Zbl 0916.62040
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.