Incorporating grouping information in Bayesian variable selection with applications in genomics. (English) Zbl 1327.62156

Summary: In many applications it is of interest to determine a limited number of important explanatory factors (representing groups of potentially overlapping predictors) rather than original predictor variables. The often-imposed requirement that clustered predictors enter the model simultaneously may be limiting, as not all the variables within a group need to be associated with the outcome. Within-group sparsity is often desirable as well. Here we propose a Bayesian variable selection method that uses the grouping information as a means of introducing more equal competition to enter the model within the groups, rather than as a source of strict regularization constraints. This is achieved within the context of the Bayesian LASSO (least absolute shrinkage and selection operator) by allowing each regression coefficient to be penalized differentially and by considering an additional regression layer that relates the individual penalty parameters to a group identification matrix. The proposed hierarchical model therefore enables inference simultaneously on two levels: (1) the regression layer for the continuous outcome in relation to the predictors and (2) the regression layer for the penalty parameters in relation to the grouping information. The method applies to both overlapping and non-overlapping groups. It does not assume within-group homogeneity across the regression coefficients, which is implicit in many structured penalized likelihood approaches; smoothness is enforced at the penalty level rather than within the regression coefficients. To enhance the potential of the proposed method, we develop two rapid computational procedures based on the expectation-maximization (EM) algorithm, which offer substantial time savings in applications where high dimensionality renders Markov chain Monte Carlo (MCMC) approaches less practical.
We demonstrate the usefulness of our method in predicting time to death in glioblastoma patients using pathways of genes.
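
The two-level structure described in the summary can be illustrated with a small generative sketch. It is not the paper's specification: the log link between group membership and penalty, the membership matrix `Z`, the group effects `b`, and the helper names `penalties_from_groups` and `sample_laplace` are all assumptions chosen for illustration. The sketch only shows how a group identification matrix can set a separate penalty for each coefficient while coefficients within a group remain free to differ.

```python
import math
import random


def penalties_from_groups(Z, b, intercept=0.0):
    """Penalty layer (assumed log link): lambda_j = exp(intercept + Z_j . b)."""
    return [
        math.exp(intercept + sum(z * bg for z, bg in zip(row, b)))
        for row in Z
    ]


def sample_laplace(rate, rng):
    """Draw from Laplace(0, 1/rate) as the difference of two exponentials."""
    return rng.expovariate(rate) - rng.expovariate(rate)


# Hypothetical group identification matrix: 5 predictors, 2 overlapping groups.
Z = [
    [1, 0],
    [1, 0],
    [1, 1],  # predictor 3 belongs to both groups
    [0, 1],
    [0, 0],  # predictor 5 belongs to no group
]

# Hypothetical group effects on the log-penalty scale:
# a negative value lowers the penalty, so members are shrunk less.
b = [-1.5, 0.5]

rng = random.Random(0)

# Level 2: penalties are shared through group membership.
lam = penalties_from_groups(Z, b)

# Level 1: each coefficient gets its own Laplace (LASSO-type) prior draw,
# so coefficients in the same group share a penalty but not a value.
beta = [sample_laplace(l, rng) for l in lam]

for j, (l, coef) in enumerate(zip(lam, beta), start=1):
    print(f"predictor {j}: penalty={l:.3f}, coefficient draw={coef:+.3f}")
```

Predictors 1 and 2 share a penalty because they have the same memberships, yet their coefficient draws differ, which is the sense in which smoothness acts on the penalties rather than on the coefficients. Predictor 3 combines the effects of both groups, showing how overlap is handled additively under the assumed link.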


62F15 Bayesian inference
62J07 Ridge regression; shrinkage estimators (Lasso)
62P10 Applications of statistics to biology and medical sciences; meta analysis



