# zbMATH — the first resource for mathematics

Fixed support positive-definite modification of covariance matrix estimators via linear shrinkage. (English) Zbl 1417.62141
Summary: This paper is concerned with the positive definiteness (PDness) problem in covariance matrix estimation. For high-dimensional data, many regularized estimators have been proposed under structural assumptions on the true covariance matrix, including sparsity. They were shown to be asymptotically consistent and rate-optimal in estimating the true covariance matrix and its structure. However, many of them do not take into account the PDness of the estimator and produce a non-PD estimate. To achieve PDness, researchers considered additional regularizations (or constraints) on eigenvalues, which make both the asymptotic analysis and computation much harder. In this paper, we propose a simple modification of the regularized covariance matrix estimator to make it PD while preserving the support. We revisit the idea of linear shrinkage and propose to take a convex combination between the first-stage estimator (the regularized covariance matrix without PDness) and a given form of diagonal matrix. The proposed modification, which we call the FSPD (Fixed Support and Positive Definiteness) estimator, is shown to preserve the asymptotic properties of the first-stage estimator if the shrinkage parameters are carefully selected. It has a closed form expression and its computation is optimization-free, unlike existing PD sparse estimators. In addition, the FSPD is generic in the sense that it can be applied to any non-PD matrix, including the precision matrix. The FSPD estimator is numerically compared with other sparse PD estimators to understand its finite-sample properties as well as its computational gain. It is also applied to two multivariate procedures relying on the covariance matrix estimator – the linear minimax classification problem and the Markowitz portfolio optimization problem – and is shown to improve substantially the performance of both procedures.
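The linear-shrinkage idea in the summary can be illustrated with a minimal sketch. It is not the authors' exact procedure: the function name `fspd_sketch`, the target smallest eigenvalue `eps`, the default diagonal level `mu` (the average diagonal entry), and the restriction of the diagonal target to `mu * I` are all assumptions made for illustration. The summary only states that the estimator is a convex combination of the first-stage estimator and a diagonal matrix with carefully selected shrinkage parameters.

```python
import numpy as np


def fspd_sketch(S_hat, eps=1e-2, mu=None):
    """Return (alpha * S_hat + (1 - alpha) * mu * I, alpha).

    Hypothetical helper: eps and the default mu are illustrative choices,
    not the tuning rule from the paper.
    """
    S_hat = np.asarray(S_hat, dtype=float)
    p = S_hat.shape[0]
    lam_min = np.linalg.eigvalsh(S_hat)[0]  # eigenvalues are returned in ascending order
    if lam_min >= eps:
        return S_hat.copy(), 1.0  # smallest eigenvalue already at least eps
    if mu is None:
        # Assumed default: average diagonal entry, kept strictly above eps.
        mu = max(np.trace(S_hat) / p, 2.0 * eps)
    if mu <= eps:
        raise ValueError("mu must exceed eps")
    # Choose alpha so the smallest eigenvalue of the combination equals eps:
    # alpha * lam_min + (1 - alpha) * mu = eps.
    alpha = (mu - eps) / (mu - lam_min)
    return alpha * S_hat + (1.0 - alpha) * mu * np.eye(p), alpha


# A sparse, symmetric, non-PD matrix standing in for a first-stage estimate.
S = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.9],
              [0.0, 0.9, 1.0]])
S_pd, alpha = fspd_sketch(S)
print("alpha:", alpha)
print("smallest eigenvalue:", np.linalg.eigvalsh(S_pd)[0])
print("support preserved:", np.array_equal(S_pd != 0, S != 0))
```

Because `alpha` lies strictly between 0 and 1 whenever a correction is needed, every off-diagonal entry is only rescaled, so zeros stay zero and nonzeros stay nonzero. The single eigenvalue computation is the only non-trivial cost, which is consistent with the summary's claim that the computation is optimization-free.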

##### MSC:
62H12 Estimation in multivariate analysis

62F12 Asymptotic properties of parametric estimators

62H30 Classification and discrimination; cluster analysis (statistical aspects)

62P20 Applications of statistics to economics
##### Software:
EIGIFP; eigs; glasso; IRAM; spcov
##### References:
[1] Bickel, P. J.; Levina, E., Covariance regularization by thresholding, Ann. Statist., 36, 2577-2604, (2008), URL http://projecteuclid.org/euclid.aos/1231165180. http://dx.doi.org/10.1214/08-AOS600 · Zbl 1196.62062

[2] Bickel, P. J.; Levina, E., Regularized estimation of large covariance matrices, Ann. Statist., 36, 199-227, (2008), URL http://projecteuclid.org/euclid.aos/1201877299. http://dx.doi.org/10.1214/009053607000000758 · Zbl 1132.62040

[3] Bien, J.; Tibshirani, R. J., Sparse estimation of a covariance matrix, Biometrika, 98, 807-820, (2011) · Zbl 1228.62063

[4] Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J., Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn., 3, 1-122, (2010), URL https://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=8186925. http://dx.doi.org/10.1561/2200000016 · Zbl 1229.90122

[5] Cai, T.; Liu, W., Adaptive thresholding for sparse covariance matrix estimation, J. Amer. Stat. Assoc., 106, 672-684, (2011) · Zbl 1232.62086

[6] Cai, T.; Liu, W.; Luo, X., A constrained ℓ1 minimization approach to sparse precision matrix estimation, J. Amer. Stat. Assoc., 106, 594-607, (2011), http://dx.doi.org/10.1198/jasa.2011.tm10155. arXiv:1102.2233 · Zbl 1232.62087

[7] Cai, T. T.; Low, M., A framework for estimation of convex functions, Statist. Sinica, 25, 423-456, (2015), URL http://www3.stat.sinica.edu.tw/statistica/J25N2/J25N21/J25N21.html. http://dx.doi.org/10.5705/ss.2013.279 · Zbl 06503803

[8] Cai, T. T.; Ren, Z.; Zhou, H. H., Estimating structured high-dimensional covariance and precision matrices: Optimal rates and adaptive estimation, Electron. J. Stat., 10, 1-59, (2016), URL http://projecteuclid.org/euclid.ejs/1455715952. http://dx.doi.org/10.1214/15-EJS1081 · Zbl 1331.62272

[9] Cai, T. T.; Yuan, M., Adaptive covariance matrix estimation through block thresholding, Ann. Statist., 40, 2014-2042, (2012), URL http://projecteuclid.org/euclid.aos/1351602535. http://dx.doi.org/10.1214/12-AOS999 · Zbl 1257.62060

[10] Cai, T. T.; Zhang, C.-H.; Zhou, H. H., Optimal rates of convergence for covariance matrix estimation, Ann. Statist., 38, 2118-2144, (2010), URL http://projecteuclid.org/euclid.aos/1278861244. http://dx.doi.org/10.1214/09-AOS752 · Zbl 1202.62073

[11] Cai, T. T.; Zhou, H. H., Optimal rates of convergence for sparse covariance matrix estimation, Ann. Statist., 40, 2389-2420, (2012), URL http://projecteuclid.org/euclid.aos/1359987525. http://dx.doi.org/10.1214/12-AOS998 · Zbl 1373.62247

[12] Chan, L., On portfolio optimization: forecasting covariances and choosing the risk model, Rev. Financial Stud., 12, 937-974, (1999)

[13] Demmel, J. W., Applied Numerical Linear Algebra, (1997), SIAM, Philadelphia, PA · Zbl 0879.65017

[14] Fan, J.; Liao, Y.; Mincheva, M., Large covariance estimation by thresholding principal orthogonal complements, J. R. Stat. Soc. Series B Stat. Methodol., 75, 603-680, (2013)

[15] Friedman, J.; Hastie, T.; Tibshirani, R., Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9, 432-441, (2008), URL https://academic.oup.com/biostatistics/article/9/3/432/224260. http://dx.doi.org/10.1093/biostatistics/kxm045 · Zbl 1143.62076

[16] Friedman, J.; Hastie, T.; Tibshirani, R., Applications of the lasso and grouped lasso to the estimation of sparse graphical models, (2010), Stanford University, Stanford, CA, URL statweb.stanford.edu/ tibs/ftp/ggraph.pdf

[17] Golub, G. H.; Van Loan, C. F., Matrix Computations, (2012), Johns Hopkins University Press, Baltimore, MD

[18] Golub, G. H.; Ye, Q., An inverse free preconditioned Krylov subspace method for symmetric generalized eigenvalue problems, SIAM J. Sci. Comput., 24, 312-334, (2002), URL https://doi.org/10.1137/S1064827500382579. http://dx.doi.org/10.1137/S1064827500382579 · Zbl 1016.65017

[19] Jagannathan, R.; Ma, T., Risk reduction in large portfolios: Why imposing the wrong constraints helps, J. Finance, 58, 1651-1683, (2003)

[20] Khare, K.; Oh, S.-Y.; Rajaratnam, B., A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees, J. R. Stat. Soc. Series B Stat. Methodol., 77, 803-825, (2015), http://dx.doi.org/10.1111/rssb.12088. arXiv:1307.5381

[21] Kwon, Y.; Choi, Y.-G.; Park, T.; Ziegler, A.; Paik, M. C., Generalized estimating equations with stabilized working correlation structure, Comput. Statist. Data Anal., 106, 1-11, (2017) · Zbl 06917856

[22] Lam, C.; Fan, J., Sparsistency and rates of convergence in large covariance matrix estimation, Ann. Statist., 37, 4254-4278, (2009), URL https://projecteuclid.org/euclid.aos/1256303543. http://dx.doi.org/10.1214/09-AOS720 · Zbl 1191.62101

[23] Lanckriet, G. R.; El Ghaoui, L.; Bhattacharyya, C.; Jordan, M. I., A robust minimax approach to classification, J. Mach. Learn. Res., 3, 555-582, (2002), URL http://www.jmlr.org/papers/volume3/lanckriet02a/lanckriet02a.pdf. http://dx.doi.org/10.1162/153244303321897726 · Zbl 1084.68657

[24] Ledoit, O.; Wolf, M., A well-conditioned estimator for large-dimensional covariance matrices, J. Multivariate Anal., 88, 365-411, (2004) · Zbl 1032.62050

[25] Lehoucq, R. B.; Sorensen, D. C., Deflation techniques for an implicitly restarted Arnoldi iteration, SIAM J. Matrix Anal. Appl., 17, 789-821, (1996) · Zbl 0863.65016

[26] Little, M.; McSharry, P.; Hunter, E.; Spielman, J.; Ramig, L., Suitability of dysphonia measurements for telemonitoring of Parkinson's disease, IEEE Trans. Biomed. Eng., 56, 1015-1022, (2009), URL http://ieeexplore.ieee.org/document/4636708/. http://dx.doi.org/10.1109/TBME.2008.2005954

[27] Liu, H.; Wang, L.; Zhao, T., Sparse covariance matrix estimation with eigenvalue constraints, J. Comput. Graph. Stat., 23, 439-459, (2014)

[28] Marcenko, V. A.; Pastur, L. A., Distribution of eigenvalues for some sets of random matrices, Math. USSR-Sbornik, 1, 457-483, (1967) · Zbl 0162.22501

[29] Mazumder, R.; Hastie, T., The graphical lasso: New insights and alternatives, Electron. J. Stat., 6, 2125-2149, (2012), URL https://projecteuclid.org/euclid.ejs/1352470831. http://dx.doi.org/10.1214/12-EJS740. arXiv:1111.5479 · Zbl 1295.62066

[30] Meinshausen, N.; Bühlmann, P., High-dimensional graphs and variable selection with the Lasso, Ann. Statist., 34, 1436-1462, (2006), URL http://projecteuclid.org/euclid.aos/1152540754. http://dx.doi.org/10.1214/009053606000000281. arXiv:0608017 · Zbl 1113.62082

[31] Peng, J.; Wang, P.; Zhou, N.; Zhu, J., Partial correlation estimation by joint sparse regression models, J. Amer. Stat. Assoc., 104, 735-746, (2009) · Zbl 1388.62046

[32] Rothman, A. J., Positive definite estimators of large covariance matrices, Biometrika, 99, 733-740, (2012), URL https://academic.oup.com/biomet/article-abstract/99/3/733/359725. http://dx.doi.org/10.1093/biomet/ass025 · Zbl 1437.62595

[33] Rothman, A. J.; Levina, E.; Zhu, J., Generalized thresholding of large covariance matrices, J. Amer. Stat. Assoc., 104, 177-186, (2009) · Zbl 1388.62170

[34] Sorensen, D. C., Implicit application of polynomial filters in a k-step Arnoldi method, SIAM J. Matrix Anal. Appl., 13, 357-385, (1990) · Zbl 0763.65025

[35] Sun, Y.; Todorovic, S.; Goodison, S., Local-learning-based feature selection for high-dimensional data analysis, IEEE Trans. Pattern. Anal. Mach. Intell., 32, 1610-1626, (2010)

[36] Tsanas, A.; Little, M. A.; Fox, C.; Ramig, L. O., Objective automatic assessment of rehabilitative speech treatment in Parkinson's disease, IEEE Trans. Neural. Syst. Rehabil. Eng., 22, 181-190, (2014), URL http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6678640. http://dx.doi.org/10.1109/TNSRE.2013.2293575

[37] Witten, D. M.; Friedman, J. H.; Simon, N., New insights and faster computations for the graphical lasso, J. Comput. Graph. Stat., 20, 892-900, (2011)

[38] Won, J.-H.; Lim, J.; Kim, S.-J.; Rajaratnam, B., Condition number regularized covariance estimation, J. R. Stat. Soc. Series B Stat. Methodol., 75, 427-450, (2013)

[39] Xue, L.; Ma, S.; Zou, H., Positive-definite ℓ1-penalized estimation of large covariance matrices, J. Amer. Stat. Assoc., 107, 1480-1491, (2012) · Zbl 1258.62063

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.