zbMATH — the first resource for mathematics

Bayesian regularization of Gaussian graphical models with measurement error. (English) Zbl 07345810
Summary: A framework is proposed for identifying and estimating the conditional pairwise relationships among variables in high-dimensional settings when the observed samples are contaminated with measurement error. The framework is motivated by the task of establishing gene regulatory networks from microarray studies, in which a large number of genes are measured, often imperfectly, on a small number of samples. When no measurement error is present, this problem is often solved by estimating the precision matrix under sparsity constraints. When measurement error is present, however, failing to correct for it leads to inconsistent estimates of the precision matrix and poor identification of relationships. To this end, a recent iterative imputation technique developed in the context of missing data is used to correct for the bias that the contamination induces in the estimates. This technique is showcased with a recent variant of the spike-and-slab Lasso to obtain a point estimate of the precision matrix. Simulation studies show that the new method outperforms the naïve method that ignores measurement error in both identification and estimation accuracy. The new method is applied to establish a conditional gene network from a microarray dataset.
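The claim that ignoring measurement error gives inconsistent precision matrix estimates can be illustrated with a small simulation. The sketch below is not the authors' imputation-based method. It assumes a classical additive error model W = X + U with independent errors of known variance (the name `error_var`, the 4-variable chain graph, and the function name are all hypothetical choices made for illustration), and it contrasts the naïve inverse sample covariance with a simple moment correction that subtracts the known error variance.

```python
import numpy as np


def naive_vs_corrected_precision(n=20000, error_var=0.5, seed=0):
    """Compare precision estimates with and without an error correction.

    Assumptions (illustrative only): additive measurement error
    W = X + U, with U ~ N(0, error_var * I) independent of X, and
    error_var known in advance.
    """
    rng = np.random.default_rng(seed)
    p = 4

    # True sparse precision matrix: a chain graph (tridiagonal).
    omega = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
    sigma = np.linalg.inv(omega)

    # Error-free signal X and its contaminated observation W.
    x = rng.multivariate_normal(np.zeros(p), sigma, size=n)
    u = rng.normal(scale=np.sqrt(error_var), size=(n, p))
    w = x + u

    # Sample covariance of W estimates sigma + error_var * I, not sigma.
    s_w = np.cov(w, rowvar=False)

    # Naive estimate: invert the contaminated covariance directly.
    naive = np.linalg.inv(s_w)

    # Moment correction: remove the known error variance first.
    corrected = np.linalg.inv(s_w - error_var * np.eye(p))

    return omega, naive, corrected


if __name__ == "__main__":
    omega, naive, corrected = naive_vs_corrected_precision()
    print("naive error:    ", np.linalg.norm(naive - omega))
    print("corrected error:", np.linalg.norm(corrected - omega))
```

Increasing `n` does not shrink the naïve estimate's error, because the estimate converges to the inverse of the contaminated covariance rather than to the true precision matrix. The subtraction used here only works when the error variance is known and the sample covariance stays positive definite after correction, which is generally not guaranteed when the number of variables exceeds the sample size.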
MSC:
62-XX Statistics