Hallin, Marc; Paindaveine, Davy; Verdebout, Thomas
Optimal rank-based tests for common principal components. (English) Zbl 1457.62182
Bernoulli 19, No. 5B, 2524-2556 (2013).
Summary: This paper provides optimal testing procedures for the \(m\)-sample null hypothesis of Common Principal Components (CPC) under possibly non-Gaussian and heterogeneous elliptical densities. We first establish, under very mild assumptions that do not require finite moments of order four, the local asymptotic normality (LAN) of the model. Based on that result, we show that the pseudo-Gaussian test proposed in [M. Hallin et al., J. Nonparametric Stat. 22, No. 7, 879–895 (2010; Zbl 1332.62193)] is locally and asymptotically optimal under Gaussian densities, and show how to compute its local powers. A numerical evaluation of those powers, however, reveals that, while remaining valid, this test is poorly efficient away from the Gaussian. Moreover, it still requires finite moments of order four. We therefore propose rank-based procedures that remain valid under any possibly heterogeneous \(m\)-tuple of elliptical densities, irrespective of the existence of any moments. In elliptical families, indeed, principal components naturally can be based on the scatter matrices characterizing the density contours, hence do not require finite variances. Those rank-based tests, as usual, involve score functions, which may or may not be associated with a reference density at which they achieve optimality. A major advantage of our rank tests is that they are not only validity-robust, in the sense of surviving arbitrary elliptical population densities: unlike their pseudo-Gaussian counterparts, they also are efficiency-robust, in the sense that their local powers do not deteriorate away from the reference density at which they are optimal.
We show, in particular, that in the homokurtic case, their normal-score version uniformly dominates, in the Pitman sense, the aforementioned pseudo-Gaussian generalization of Flury’s test. Theoretical results are obtained via a nonstandard application of Le Cam’s methodology in the context of curved LAN experiments. The finite-sample properties of the proposed tests are investigated via simulations.
Cited in 10 Documents
MSC:
62H15 Hypothesis testing in multivariate analysis
62H25 Factor analysis and principal components; correspondence analysis
62G10 Nonparametric hypothesis testing
Keywords: common principal components; local asymptotic normality; rank-based methods; robustness
Citations: Zbl 1332.62193
References:
[1] Airoldi, J.P. and Hoffmann, R.S. (1984). Age variation in voles (Microtus californicus and Microtus ochrogaster) and its significance for systematic studies. Occasional Papers of the Museum of the Natural History, University of Kansas, Lawrence 111 1-45.
[2] Anderson, T.W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley-Interscience [John Wiley & Sons]. · Zbl 1039.62044
[3] Bentler, P.M. and Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions. Annu. Rev. Psych 47 563-592.
[4] Boente, G. and Orellana, L. (2001). A robust approach to common principal components. In Statistics in Genetics and in the Environmental Sciences (Ascona, 1999). Trends Math. (L.T. Fernholz, S. Morgenthaler and W. Stahel, eds.) 117-145. Basel: Birkhäuser. · doi:10.1007/978-3-0348-8326-9_9
[5] Boente, G. and Orellana, L. (2004). Robust plug-in estimators in proportional scatter models. J. Statist. Plann. Inference 122 95-110. · Zbl 1040.62022 · doi:10.1016/j.jspi.2003.06.006
[6] Boente, G., Pires, A.M. and Rodrigues, I.M. (2002).
Influence functions and outlier detection under the common principal components model: A robust approach. Biometrika 89 861-875. · Zbl 1036.62050 · doi:10.1093/biomet/89.4.861
[7] Boente, G., Pires, A.M. and Rodrigues, I.M. (2009). Robust tests for the common principal components model. J. Statist. Plann. Inference 139 1332-1347. · Zbl 1153.62049 · doi:10.1016/j.jspi.2008.05.052
[8] Boik, R.J. (2002). Spectral models for covariance matrices. Biometrika 89 159-182. · Zbl 0997.62046 · doi:10.1093/biomet/89.1.159
[9] Browne, M.W. (1984). The decomposition of multitrait-multimethod matrices. British J. Math. Statist. Psych. 37 1-21. · Zbl 0551.62073 · doi:10.1111/j.2044-8317.1984.tb00785.x
[10] Chernoff, H. and Savage, I.R. (1958). Asymptotic normality and efficiency of certain nonparametric test statistics. Ann. Math. Statist. 29 972-994. · Zbl 0092.36501 · doi:10.1214/aoms/1177706436
[11] Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A Practical Approach. New York: Chapman & Hall. · Zbl 0495.62057
[12] Flury, B.K. (1987). Two generalizations of the common principal component model. Biometrika 74 59-69. · Zbl 0613.62076 · doi:10.1093/biomet/74.1.59
[13] Flury, B.N. (1984). Common principal components in \(k\) groups. J. Amer. Statist. Assoc. 79 892-898.
[14] Flury, B.N. (1986). Asymptotic theory for common principal component analysis. Ann. Statist. 14 418-430. · Zbl 0613.62075 · doi:10.1214/aos/1176349930
[15] Flury, B.N. and Gautschi, W. (1986). An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM J. Sci. Statist. Comput. 7 169-184. · Zbl 0614.65043 · doi:10.1137/0907013
[16] Hallin, M. and Paindaveine, D. (2006). Semiparametrically efficient rank-based inference for shape. I. Optimal rank-based tests for sphericity. Ann. Statist. 34 2707-2756. · Zbl 1114.62066 · doi:10.1214/009053606000000731
[17] Hallin, M., Oja, H. and Paindaveine, D. (2006).
Semiparametrically efficient rank-based inference for shape. II. Optimal \(R\)-estimation of shape. Ann. Statist. 34 2757-2789. · Zbl 1115.62059 · doi:10.1214/009053606000000948
[18] Hallin, M. and Paindaveine, D. (2006). Parametric and semiparametric inference for shape: The role of the scale functional. Statist. Decisions 24 327-350. · Zbl 1111.62002 · doi:10.1524/stnd.2006.24.3.327
[19] Hallin, M. and Paindaveine, D. (2008). A general method for constructing pseudo-Gaussian tests. J. Japan Statist. Soc. 38 27-39. · Zbl 1360.62288 · doi:10.14490/jjss.38.27
[20] Hallin, M., Paindaveine, D. and Verdebout, T. (2010). Optimal rank-based testing for principal components. Ann. Statist. 38 3245-3299. · Zbl 1373.62295 · doi:10.1214/10-AOS810
[21] Hallin, M., Paindaveine, D. and Verdebout, T. (2010). Testing for common principal components under heterokurticity. J. Nonparametr. Stat. 22 879-895. · Zbl 1332.62193 · doi:10.1080/10485250903548737
[22] Hallin, M. and Werker, B.J.M. (2003). Semi-parametric efficiency, distribution-freeness and invariance. Bernoulli 9 137-165. · Zbl 1020.62042 · doi:10.3150/bj/1068129013
[23] Hettmansperger, T.P. and Randles, R.H. (2002). A practical affine equivariant multivariate median. Biometrika 89 851-860. · Zbl 1036.62045 · doi:10.1093/biomet/89.4.851
[24] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. J. Educ. Psych. 24 417-441. · JFM 59.1182.04
[25] Kreiss, J.P. (1987). On adaptive estimation in stationary ARMA processes. Ann. Statist. 15 112-133. · Zbl 0616.62042 · doi:10.1214/aos/1176350256
[26] Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer Series in Statistics. New York: Springer. · Zbl 0605.62002
[27] Le Cam, L. and Yang, G.L. (2000). Asymptotics in Statistics: Some Basic Concepts, 2nd ed. Springer Series in Statistics. New York: Springer. · Zbl 0952.62002
[28] Muirhead, R.J. and Waternaux, C.M. (1980).
Asymptotic distributions in canonical correlation analysis and other multivariate procedures for nonnormal populations. Biometrika 67 31-43. · Zbl 0448.62037 · doi:10.1093/biomet/67.1.31
[29] Paindaveine, D. (2006). A Chernoff-Savage result for shape: On the non-admissibility of pseudo-Gaussian methods. J. Multivariate Anal. 97 2206-2220. · Zbl 1101.62045 · doi:10.1016/j.jmva.2005.08.005
[30] Paindaveine, D. (2008). A canonical definition of shape. Statist. Probab. Lett. 78 2240-2247. · Zbl 1283.62124
[31] Pearson, K. (1901). On lines and planes of closest fit to system of points in space. Philos. Mag. 2 559-572. · JFM 32.0710.04
[32] Rao, C.R. and Mitra, S.K. (1971). Generalized Inverse of Matrices and Its Applications. New York: Wiley. · Zbl 0236.15004
[33] Satorra, A. and Bentler, P.M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. In Proceedings of the Business and Economic Statistics Section of the American Statistical Association 308-313. Alexandria, VA: American Statistical Association.
[34] Shapiro, A. and Browne, M.W. (1987). Analysis of covariance structures under elliptical distributions. J. Amer. Statist. Assoc. 82 1092-1097. · Zbl 0645.62056 · doi:10.2307/2289385
[35] Van der Vaart, A.W. (2000). Asymptotic Statistics. Cambridge: Cambridge Univ. Press. · Zbl 0910.62001
[36] Wilks, S.S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Statist. 9 60-62. · Zbl 0018.32003 · doi:10.1214/aoms/1177732360
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.