Promote sign consistency in the joint estimation of precision matrices. (English) Zbl 1510.62243

Summary: The Gaussian graphical model is a popular tool for inferring the relationships among random variables, where the precision matrix provides a natural interpretation of conditional independence. With high-dimensional data, sparsity of the precision matrix is often assumed, and various regularization methods have been applied for estimation. In several scenarios, it is desirable to estimate multiple precision matrices jointly. In joint estimation, the entries at the same position across the multiple precision matrices form a group, and group regularization methods have been applied to estimate these entries and identify sparsity structures. In many practical examples, the results can be difficult to interpret when parameters within the same group have conflicting signs. Unfortunately, existing methods lack an explicit mechanism for promoting sign consistency among group parameters. To tackle this problem, a novel regularization method is developed for the joint estimation of multiple precision matrices. It effectively enhances the sign consistency of group parameters and hence can lead to more interpretable results, while still allowing for conflicting signs to retain full flexibility. The method's consistency properties are rigorously established. Simulations show that the proposed method outperforms competing alternatives under a variety of settings. In two data examples, the proposed approach leads to interpretable results that differ from those of the alternatives.
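
To make the notion of a "group" and of "conflicting signs" concrete, the following is a minimal sketch and not the paper's estimator. It assumes the precision matrices have already been estimated by some method, stacks them, and reports the off-diagonal positions whose group contains both a positive and a negative entry. The function name `sign_conflict_groups`, the tolerance `tol`, and the toy matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def sign_conflict_groups(precisions, tol=1e-8):
    """Return (i, j) pairs, i < j, whose group of entries across the
    given precision matrices contains both positive and negative values.
    Entries with absolute value <= tol are treated as zero."""
    stack = np.stack(precisions)          # shape (K, p, p)
    pos = (stack > tol).any(axis=0)       # some matrix has a positive (i, j) entry
    neg = (stack < -tol).any(axis=0)      # some matrix has a negative (i, j) entry
    conflict = np.triu(pos & neg, k=1)    # keep the strict upper triangle only
    return [(int(i), int(j)) for i, j in zip(*np.nonzero(conflict))]

# Toy input, made up for illustration: two 3x3 precision matrices.
omega1 = np.array([[2.0, 0.5, 0.0],
                   [0.5, 2.0, -0.3],
                   [0.0, -0.3, 2.0]])
omega2 = np.array([[2.0, -0.4, 0.0],
                   [-0.4, 2.0, -0.2],
                   [0.0, -0.2, 2.0]])

conflicts = sign_conflict_groups([omega1, omega2])
print(conflicts)
```

In this toy input, only the group at position (0, 1) mixes signs (0.5 and -0.4); the group at (1, 2) is sign-consistent, and the group at (0, 2) is entirely zero.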

MSC:

62H12 Estimation in multivariate analysis
62H22 Probabilistic graphical models
62P10 Applications of statistics to biology and medical sciences; meta analysis
62-08 Computational methods for problems pertaining to statistics

Software:

glasso
