Bayesian inference for spectral projectors of the covariance matrix. (English) Zbl 1406.62064
Summary: Let \(X_{1},\ldots,X_{n}\) be an i.i.d. sample in \(\mathbb{R}^{p}\) with zero mean and covariance matrix \(\boldsymbol{\Sigma}^\ast\). The classical PCA approach recovers the projector \(\boldsymbol{P}^\ast_{\mathcal{J}}\) onto the principal eigenspace of \(\boldsymbol{\Sigma}^\ast\) by its empirical counterpart \(\widehat{\boldsymbol{P}}_{\mathcal{J}}\). A recent paper [V. Koltchinskii and K. Lounici, Ann. Stat. 45, No. 1, 121–157 (2017; Zbl 1367.62175)] investigated the asymptotic distribution of the Frobenius distance between the projectors \(\|\widehat{\boldsymbol{P}}_{\mathcal{J}}-\boldsymbol{P}^\ast_{\mathcal{J}}\|_{2}\), while [A. Naumov et al., “Bootstrap confidence sets for spectral projectors of sample covariance”, Preprint, arXiv:1703.00871] offered a bootstrap procedure for measuring the uncertainty in recovering this subspace \(\boldsymbol{P}^\ast_{\mathcal{J}}\), even in a finite-sample setup. The present paper considers this problem from a Bayesian perspective and suggests using the credible sets of the pseudo-posterior distribution on the space of covariance matrices induced by the conjugate inverse Wishart prior as sharp confidence sets. This yields a numerically efficient procedure. Moreover, we theoretically justify this method and derive finite-sample bounds on the corresponding coverage probability. Contrary to [Koltchinskii and Lounici, loc. cit.; Naumov et al., loc. cit.], the obtained results are valid for non-Gaussian data: the main assumption that we impose is the concentration of the sample covariance \(\widehat{\boldsymbol{\Sigma}}\) in a vicinity of \(\boldsymbol{\Sigma}^\ast\). Numerical simulations illustrate good performance of the proposed procedure even on non-Gaussian data in a rather challenging regime.
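
The procedure described in the summary can be illustrated by a minimal Monte Carlo sketch, which is not the authors' implementation. It assumes the standard conjugate update for a zero-mean Gaussian likelihood (prior \(IW(\boldsymbol{G},\nu)\), posterior \(IW(\boldsymbol{G}+\sum_i X_iX_i^{\top},\nu+n)\)), applied whatever the true data distribution, which is why the posterior is only a pseudo-posterior. The function names, the identity prior scale, the prior degrees of freedom \(p+2\), the choice of \(\mathcal{J}\) as the \(m\) leading eigenvalues, and the use of the unscaled squared Frobenius distance centred at \(\widehat{\boldsymbol{P}}_{\mathcal{J}}\) are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def top_projector(sigma, m):
    """Orthogonal projector onto the span of the m leading eigenvectors."""
    _, vecs = np.linalg.eigh(sigma)  # eigenvalues come back in ascending order
    u = vecs[:, -m:]
    return u @ u.T


def sample_inverse_wishart(rng, scale, df, size):
    """Draw `size` matrices from IW(scale, df), for integer df >= p.

    Uses the fact that if W ~ Wishart(scale^{-1}, df), then W^{-1} ~ IW(scale, df).
    """
    p = scale.shape[0]
    chol = np.linalg.cholesky(np.linalg.inv(scale))
    # Each row of z[s] is an independent N(0, scale^{-1}) vector.
    z = rng.standard_normal((size, df, p)) @ chol.T
    w = np.transpose(z, (0, 2, 1)) @ z  # Wishart draws, shape (size, p, p)
    return np.linalg.inv(w)


def credible_radius(x, m, alpha=0.05, n_draws=2000,
                    prior_scale=None, prior_df=None, seed=0):
    """Return the empirical projector and the (1 - alpha) pseudo-posterior
    quantile of the squared Frobenius distance to it."""
    n, p = x.shape
    rng = np.random.default_rng(seed)
    # Illustrative hyperparameters (assumptions, not values from the paper).
    prior_scale = np.eye(p) if prior_scale is None else prior_scale
    prior_df = p + 2 if prior_df is None else prior_df

    s = x.T @ x  # n times the sample covariance (data assumed zero-mean)
    p_hat = top_projector(s / n, m)

    draws = sample_inverse_wishart(rng, prior_scale + s, prior_df + n, n_draws)
    dist2 = np.array([
        np.linalg.norm(top_projector(d, m) - p_hat, "fro") ** 2 for d in draws
    ])
    return p_hat, np.quantile(dist2, 1.0 - alpha)
```

Under these assumptions, the set of rank-\(m\) projectors \(\boldsymbol{P}\) with \(\|\boldsymbol{P}-\widehat{\boldsymbol{P}}_{\mathcal{J}}\|_{2}^{2}\) at most the returned radius plays the role of the credible set. No bootstrap resampling of the data is involved, only draws from a closed-form posterior.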
