
On factor models with random missing: EM estimation, inference, and cross validation. (English) Zbl 1471.62532

Summary: We consider estimation and inference in approximate factor models with randomly missing values. We show that, owing to the low-rank structure of the common component, the factors and factor loadings can be estimated consistently when the missing values are replaced by zeros. We establish the asymptotic distributions of the resulting estimators and of those based on the EM algorithm. We also propose a cross-validation-based method for determining the number of factors in factor models with or without missing values, and we justify its consistency. Simulations demonstrate that our cross-validation method is robust to fat tails in the error distribution and significantly outperforms some popular existing methods in the percentage of cases in which the number of factors is correctly determined. An application to factor-augmented regression models shows that proper treatment of the missing values can improve the out-of-sample forecasts of some macroeconomic variables.
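The two estimation steps named in the summary (a principal-components fit on zero-filled data, followed by EM refinement) can be illustrated with a minimal sketch. This is not the paper's implementation. The function names `pca_common_component` and `em_factor_impute`, the rescaling of the zero-filled data by the observed fraction `q`, the iteration cap, and the stopping rule are all assumptions made for illustration. The sketch also assumes entries are missing uniformly at random and that the number of factors `r` is known.

```python
import numpy as np


def pca_common_component(X, r):
    """Rank-r principal-components fit of X, computed by truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def em_factor_impute(X, observed, r, n_iter=50, tol=1e-8):
    """Estimate the T x N common component of a factor model with missing data.

    X        : T x N data array; entries where `observed` is False are ignored
               (they may be NaN).
    observed : T x N boolean mask, True where the entry is observed.
    r        : assumed number of factors.
    """
    # Fraction of observed entries (assumed estimate of the observation probability).
    q = observed.mean()

    # Initial step: replace missing values by zeros. Dividing by q is an
    # assumption of this sketch; it makes the zero-filled matrix unbiased
    # for the full matrix under uniform random missingness.
    X_zero = np.where(observed, X, 0.0)
    C = pca_common_component(X_zero / q, r)

    for _ in range(n_iter):
        # E-step: fill missing entries with the current common-component estimate.
        X_filled = np.where(observed, X, C)
        # M-step: refit the rank-r principal-components estimate.
        C_new = pca_common_component(X_filled, r)
        converged = np.linalg.norm(C_new - C) <= tol * max(1.0, np.linalg.norm(C))
        C = C_new
        if converged:
            break
    return C
```

A typical call would be `C_hat = em_factor_impute(X, ~np.isnan(X), r=2)`. The missing entries of `X` can then be read off `C_hat`.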

MSC:

62P20 Applications of statistics to economics
62H25 Factor analysis and principal components; correspondence analysis
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62H12 Estimation in multivariate analysis

References:

[1] Ahn, S.; Horenstein, A., Eigenvalue ratio test for the number of factors, Econometrica, 81, 1203-1227 (2013) · Zbl 1274.62403
[2] Athey, S.; Bayati, M.; Doudchenko, N.; Imbens, G.; Khosravi, K., Matrix Completion Methods for Causal Panel Data Models, Working Paper (2018), Graduate School of Business, Stanford University
[3] Bai, J., Inferential theory for factor models of large dimensions, Econometrica, 71, 135-173 (2003) · Zbl 1136.62354
[4] Bai, J.; Li, K., Maximum likelihood estimation and inference for approximate factor models of high dimension, Rev. Econ. Stat., 98, 298-309 (2016)
[5] Bai, J.; Liao, Y.; Yang, J., Unbalanced panel data models with interactive effects, (Baltagi, B. H., The Oxford Handbook of Panel Data (2015)), 149-170
[6] Bai, J.; Ng, S., Determining the number of factors in approximate factor models, Econometrica, 70, 191-221 (2002) · Zbl 1103.91399
[7] Bai, J.; Ng, S., Rank regularized estimation of approximate factor models, J. Econometrics, 212, 78-96 (2019) · Zbl 1452.62405
[8] Bai, J.; Ng, S., Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data, Working paper (2019), Department of Economics, Columbia University
[9] Bańbura, M.; Modugno, M., Maximum likelihood estimation of factor models on datasets with arbitrary pattern of missing data, J. Appl. Econometrics, 29, 133-160 (2014)
[10] Cai, J.-F.; Candès, E. J.; Shen, Z., A singular value thresholding algorithm for matrix completion, SIAM J. Optim., 20, 1956-1982 (2010) · Zbl 1201.90155
[11] Candès, E. J.; Li, X., Robust principal component analysis, J. ACM, 58, 3 (2011), 11:1-37 · Zbl 1327.62369
[12] Candès, E. J.; Plan, Y., Matrix completion with noise, Proc. IEEE, 98, 6, 925-936 (2010)
[13] Chamberlain, G.; Rothschild, M., Arbitrage, factor structure, and mean-variance analysis on large asset markets, Econometrica, 51, 5, 1281-1304 (1983) · Zbl 0523.90017
[14] Doz, C.; Giannone, D.; Reichlin, L., A two-step estimator for large approximate dynamic factor models based on Kalman filtering, J. Econometrics, 164, 188-205 (2011) · Zbl 1441.62671
[15] Fan, J.; Liao, Y.; Mincheva, M., Large covariance estimation by thresholding principal orthogonal complements, J. R. Stat. Soc. Ser. B Stat. Methodol., 75, 603-680 (2013) · Zbl 1411.62138
[16] Forni, M.; Hallin, M.; Lippi, M.; Reichlin, L., The generalized dynamic factor model: identification and estimation, Rev. Econ. Stat., 82, 540-554 (2000)
[17] Foroni, C.; Marcellino, M. G., A survey of econometric methods for mixed-frequency data, SSRN Electron. J. (2013)
[18] Geweke, J. F., The dynamic factor analysis of economic time series models, (Aigner, D.; Goldberger, A., Latent Variables in Socioeconomic Models (1977), North-Holland: North-Holland Amsterdam), 365-383 · Zbl 0389.62075
[19] Giannone, D.; Reichlin, L.; Small, D., Nowcasting: The real-time informational content of macroeconomic data, J. Monetary Econ., 55, 665-676 (2008)
[20] Hallin, M.; Liśka, R., Determining the number of factors in the general dynamic factor model, J. Amer. Statist. Assoc., 102, 603-617 (2007) · Zbl 1172.62339
[21] Häusler, E.; Luschgy, H., Stable Convergence and Stable Limit Theorems (2015), Springer: Springer New York · Zbl 1356.60004
[22] Jungbacker, B.; Koopman, S.; Wel, M. V.D., Maximum likelihood estimation for dynamic factor models with missing data, J. Econom. Dynam. Control, 35, 1358-1368 (2011) · Zbl 1217.91153
[23] Lu, X.; Su, L., Shrinkage estimation of dynamic panel data models with interactive fixed effects, J. Econometrics, 190, 148-175 (2016) · Zbl 1419.62516
[24] Ludvigson, S.; Ng, S., The empirical risk-return relation: a factor analysis approach, J. Financ. Econ., 83, 171-222 (2007)
[25] Marcellino, M.; Sivec, V., Monetary, fiscal, and oil shocks: evidence based on mixed frequency structural FAVARs, J. Econometrics, 193, 335-348 (2016) · Zbl 1431.91298
[26] Mariano, R. S.; Murasawa, Y., A coincident index, common factors, and monthly real GDP, Oxford Bull. Econ. Stat., 72, 27-46 (2010)
[27] McCracken, M. W.; Ng, S., FRED-MD: A monthly database for macroeconomic research, J. Bus. Econom. Statist., 34, 4, 574-589 (2016)
[28] Meinshausen, N.; Bühlmann, P., Stability selection, J. R. Stat. Soc. Ser. B Stat. Methodol., 72, 417-473 (2010) · Zbl 1411.62142
[29] Moon, H. R.; Weidner, M., Dynamic linear panel regression models with interactive fixed effects, Econometric Theory, 33, 158-195 (2017) · Zbl 1441.62816
[30] Negahban, S.; Wainwright, M. J., Restricted strong convexity and (weighted) matrix completion: optimal bounds with noise, J. Mach. Learn. Res., 13, 1665-1697 (2012) · Zbl 1436.62204
[31] Onatski, A., Testing hypotheses about the number of factors in large factor models, Econometrica, 77, 1447-1479 (2009) · Zbl 1182.62180
[32] Onatski, A., Determining the number of factors from empirical distribution of eigenvalues, Rev. Econ. Stat., 92, 1004-1016 (2010)
[33] Onatski, A., Asymptotics of the principal components estimator of large factor models with weakly influential factors, J. Econometrics, 168, 244-258 (2012) · Zbl 1443.62497
[34] Pinheiro, M.; Rua, A.; Dias, F., Dynamic factor models with jagged edge panel data: Taking on board the dynamics of the idiosyncratic components, Oxford Bull. Econ. Stat., 75, 80-102 (2013)
[35] Sargent, T. J.; Sims, C., Business cycle modelling without pretending to have too much a-priori economic theory, (Sims, C., New Methods in Business Cycle Research (1977), Federal Reserve Bank of Minneapolis: Federal Reserve Bank of Minneapolis Minneapolis), 45-109
[36] Schumacher, C.; Breitung, J., Real-time forecasting of German GDP based on a large factor model with monthly and quarterly data, Int. J. Forecast., 24, 386-398 (2008)
[37] Stock, J. H.; Watson, M. W., Diffusion Indexes, Working paper 6702 (1998), National Bureau of Economic Research
[38] Stock, J. H.; Watson, M. W., Macroeconomic forecasting using diffusion indexes, J. Bus. Econom. Statist., 20, 147-162 (2002)
[39] Stock, J.; Watson, M., Dynamic factor models, factor-augmented vector autoregressions, and structural vector autoregressions in macroeconomics, (Handbook of Macroeconomics (2016)), 415-525
[40] Su, L.; Chen, Q., Testing homogeneity in panel data models with interactive fixed effects, Econometric Theory, 29, 1079-1135 (2013) · Zbl 1290.62088
[41] Su, L.; Jin, S.; Zhang, Y., Specification test for panel data models with interactive fixed effects, J. Econometrics, 186, 222-244 (2015) · Zbl 1331.62485
[42] Su, L.; Wang, X., On time-varying factor models: estimation and testing, J. Econometrics, 198, 84-101 (2017) · Zbl 1456.62220
[43] Zeng, X.; Xia, Y.; Zhang, L., Double cross validation for the number of factors in approximate factor models (2019), arXiv preprint arXiv:1907.01670