Testing for principal component directions under weak identifiability. (English) Zbl 1439.62075
Given data from a multivariate normal distribution whose covariance matrix has eigenvalues $$\lambda_1\geq\lambda_2\geq\cdots\geq\lambda_p$$, consider testing the null hypothesis $$\mbox{H}_0:\boldsymbol{\theta}_1=\boldsymbol{\theta}_1^0$$ against the alternative hypothesis $$\mbox{H}_1:\boldsymbol{\theta}_1\not=\boldsymbol{\theta}_1^0$$, where $$\boldsymbol{\theta}_1$$ is the eigenvector associated with $$\lambda_1$$ and $$\boldsymbol{\theta}_1^0$$ is a fixed vector. The authors compare two tests in this setting: a classical likelihood ratio test and the Le Cam optimal test due to M. Hallin et al. [Ann. Stat. 38, No. 6, 3245–3299 (2010; Zbl 1373.62295)]. When the eigenvalues $$\lambda_i$$ are fixed, these two tests are known to be asymptotically equivalent under the null hypothesis and sequences of contiguous alternatives. In this paper, the authors show that this asymptotic equivalence breaks down in the setting where the eigenvalues may depend on the sample size $$n$$ and $$\lambda_1/\lambda_2=1+O(r_n)$$, with $$r_n=O(1/\sqrt{n})$$. In this setting, the likelihood ratio test is shown to over-reject the null hypothesis, so that the Le Cam optimal test is preferable here. Further properties of this latter test are investigated to show that this gain over the likelihood ratio test does not come at the expense of power. The more general setting of elliptical data is also considered, and numerical examples (based on both simulations and real data) are presented to illustrate the findings of the paper.
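To make the weakly identifiable regime concrete, the following is a minimal simulation sketch and is not taken from the paper under review. It assumes the classical likelihood ratio statistic has the form commonly attributed to Anderson, $$Q=n\,(\hat\lambda_1\,\boldsymbol{\theta}_1^{0\prime}\mathbf{S}^{-1}\boldsymbol{\theta}_1^0+\hat\lambda_1^{-1}\,\boldsymbol{\theta}_1^{0\prime}\mathbf{S}\,\boldsymbol{\theta}_1^0-2)$$, compared with a chi-square critical value on $$p-1$$ degrees of freedom. The function names, the choice $$p=3$$, the diagonal covariance matrix and the parameter `gap` (standing for $$\lambda_1/\lambda_2-1$$) are all illustrative assumptions.

```python
import math
import numpy as np


def lrt_statistic(X, theta0):
    """Assumed Anderson-type LRT statistic for H0: theta_1 = theta0."""
    n = X.shape[0]
    S = np.cov(X, rowvar=False, bias=True)  # sample covariance, divides by n
    t = np.asarray(theta0, dtype=float)
    t = t / np.linalg.norm(t)               # work with a unit vector
    lam1 = np.linalg.eigvalsh(S)[-1]        # eigvalsh sorts ascending
    return n * (lam1 * (t @ np.linalg.solve(S, t)) + (t @ S @ t) / lam1 - 2.0)


def null_rejection_rate(n, gap, reps=2000, seed=0):
    """Monte Carlo rejection frequency under H0 at nominal level 5%.

    Data are N(0, diag(1 + gap, 1, 1)), so theta_1 = e_1 whenever gap > 0.
    """
    p = 3
    # For p - 1 = 2 degrees of freedom the chi-square 95% quantile
    # has the closed form -2 * log(0.05).
    crit = -2.0 * math.log(0.05)
    rng = np.random.default_rng(seed)
    sd = np.sqrt(np.array([1.0 + gap, 1.0, 1.0]))
    theta0 = np.array([1.0, 0.0, 0.0])
    rejections = 0
    for _ in range(reps):
        X = rng.standard_normal((n, p)) * sd
        if lrt_statistic(X, theta0) > crit:
            rejections += 1
    return rejections / reps
```

Comparing `null_rejection_rate(n, 1.0)` (well-separated eigenvalues) with `null_rejection_rate(n, 1.0 / math.sqrt(n))` for increasing `n` is one way to probe the over-rejection described in the review; the sketch itself makes no claim about the resulting numbers, and it does not implement the Le Cam optimal test.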

##### MSC:
- 62F05 Asymptotic properties of parametric tests
- 62F03 Parametric hypothesis testing
- 62H25 Factor analysis and principal components; correspondence analysis
- 62E20 Asymptotic distribution theory in statistics
##### Software:
ROBPCA; TCLUST; uskewFactors
