Statistical analysis of sparse approximate factor models.

*(English)* Zbl 07246819

The authors consider a sequence of \(n\) i.i.d. observations \((X_i)\) of a \(p\)-dimensional random vector with the factor structure \(X_i=\Lambda\,F_i+\epsilon_i\), where \(\Lambda\) is the \(p\times m\) loading matrix, \(F_i\) is the vector of centred factor variables and \(\epsilon_i\) is the vector of errors (the idiosyncratic variables). The number of factors \(m>0\) is assumed known. The covariance matrix is \(\operatorname{var}(X_i)=\Lambda\,\Lambda'+\Psi\), where \(\Psi\) is the covariance matrix of the errors. Factors and idiosyncratic variables are assumed to be uniformly sub-Gaussian. The authors provide \(\ell_1\)-, \(\ell_2\)- and \(\ell_{\infty}\)-error bounds. Their approach is based on a two-step estimation. In the first step, the matrices \(\Lambda\) and \(\Psi\) are estimated by Gaussian quasi-maximum likelihood (QML), with \(\Psi\) assumed to be diagonal. In the second step, conditionally on the first-step estimate, the diagonality assumption on \(\Psi\) is relaxed, and both the Gaussian QML loss and a least squares loss are combined with various regularisers to obtain a sparse error covariance matrix. The support recovery property is also established. The results are supported by simulations.
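The second step can be illustrated with a minimal sketch, which is not the authors' procedure. It assumes that a first-step loading estimate is already available, that the least squares loss takes the form \(\tfrac12\|R-\Psi\|_F^2\) with \(R\) the residual covariance (sample covariance minus the estimated \(\Lambda\Lambda'\)), and that the regulariser is a plain \(\ell_1\) penalty on the off-diagonal entries of \(\Psi\). Under these assumptions the minimiser is obtained entry by entry through soft-thresholding. The matrix `R`, the tuning value `lam` and all function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only (not the paper's algorithm): closed-form minimiser of
#   0.5 * ||R - Psi||_F^2 + lam * sum_{j != k} |Psi_jk|
# where R is a hypothetical residual covariance matrix.


def soft_threshold(x, lam):
    """Scalar soft-thresholding: sign(x) * max(|x| - lam, 0)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def sparse_error_covariance(residual_cov, lam):
    """Soft-threshold the off-diagonal entries; leave the diagonal unpenalised."""
    p = len(residual_cov)
    return [
        [
            residual_cov[j][k] if j == k else soft_threshold(residual_cov[j][k], lam)
            for k in range(p)
        ]
        for j in range(p)
    ]


def off_diagonal_support(matrix):
    """Index pairs (j, k), j != k, whose estimated entry is non-zero."""
    p = len(matrix)
    return {
        (j, k)
        for j in range(p)
        for k in range(p)
        if j != k and matrix[j][k] != 0.0
    }


# Hypothetical 3x3 residual covariance with one sizeable off-diagonal entry.
R = [
    [1.00, 0.40, 0.03],
    [0.40, 1.20, -0.05],
    [0.03, -0.05, 0.90],
]

psi_hat = sparse_error_covariance(R, lam=0.1)
print(sorted(off_diagonal_support(psi_hat)))  # prints [(0, 1), (1, 0)]
```

In this toy example the small entries are set exactly to zero, which is the sense in which the support of a sparse \(\Psi\) can be recovered. Soft-thresholding alone does not guarantee a positive definite estimate, and non-convex regularisers of the kind listed in the keywords would replace `soft_threshold` by a different thresholding rule.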

Reviewer: Carlo Sempi (Lecce)

##### MSC:

62H25 | Factor analysis and principal components; correspondence analysis |

62J07 | Ridge regression; shrinkage estimators (Lasso) |

62F12 | Asymptotic properties of parametric estimators |

##### Keywords:

approximate factor analysis; non-convex regulariser; statistical consistency; support recovery

\textit{B. Poignard} and \textit{Y. Terada}, Electron. J. Stat. 14, No. 2, 3315--3365 (2020; Zbl 07246819)
