High-dimensional Gaussian graphical models on network-linked data. (English) Zbl 1498.68249

Summary: Graphical models are commonly used to represent conditional dependence relationships between variables. Many methods are available for estimating them from high-dimensional data, but almost all rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and they tend to violate this assumption. Here we develop a Gaussian graphical model for observations connected by a network, allowing potentially different mean vectors that vary smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful and interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under an assumption of network “cohesion”, which refers to the empirically observed phenomenon of network neighbors sharing similar traits.
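The two-step idea sketched in the summary (smooth node-specific means over the network, then estimate a shared sparse precision matrix) can be illustrated with a minimal Python example. This is an assumption-laden sketch, not the authors' exact algorithm: the chain network, the penalty weight `alpha`, and the use of scikit-learn's `GraphicalLasso` are all stand-ins chosen for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Hypothetical setup: n nodes on a chain network, p variables per node.
n, p = 60, 5

# Chain-graph Laplacian L = D - A, used as the "cohesion" penalty.
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Means vary smoothly along the chain; all nodes share one sparse
# tridiagonal precision matrix Omega (so its inverse is the covariance).
M = np.outer(np.linspace(-1.0, 1.0, n), np.ones(p))
Omega = (np.eye(p)
         + 0.4 * np.diag(np.ones(p - 1), 1)
         + 0.4 * np.diag(np.ones(p - 1), -1))
Sigma = np.linalg.inv(Omega)
X = M + rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# Step 1 (illustrative): Laplacian-penalized mean estimation,
#   M_hat = argmin_M ||X - M||_F^2 + alpha * tr(M' L M),
# whose closed-form solution is (I + alpha * L)^{-1} X.
alpha = 5.0
M_hat = np.linalg.solve(np.eye(n) + alpha * L, X)

# Step 2 (illustrative): graphical lasso on the de-meaned data.
model = GraphicalLasso(alpha=0.1).fit(X - M_hat)
Omega_hat = model.precision_
print(Omega_hat.round(2))
```

In this sketch the cohesion penalty shrinks each node's mean toward its network neighbors' means, so the residuals passed to the graphical lasso are approximately centered even though the raw observations are not identically distributed.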

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H22 Probabilistic graphical models

Software:

QUIC; huge; clusterpath
Full Text: arXiv Link

References:

[1] Arash A Amini, Aiyou Chen, Peter J Bickel, and Elizaveta Levina. Pseudo-likelihood methods for community detection in large sparse networks. The Annals of Statistics, 41(4):2097-2122, 2013. · Zbl 1277.62166
[2] Onureena Banerjee, Laurent El Ghaoui, and Alexandre d’Aspremont. Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research, 9(Mar):485-516, 2008. · Zbl 1225.68149
[3] Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373-1396, 2003. · Zbl 1085.68119
[4] N. Binkiewicz, J. T. Vogelstein, and K. Rohe. Covariate-assisted spectral clustering. Biometrika, 104(2):361-377, 2017. doi: 10.1093/biomet/asx008. URL http://dx.doi.org/10.1093/biomet/asx008. · Zbl 1506.62319
[5] Andries E Brouwer and Willem H Haemers. Spectra of Graphs. Springer Science & Business Media, 2011.
[6] Neil A Butler. Optimal and orthogonal Latin hypercube designs for computer experiments. Biometrika, pages 847-857, 2001. · Zbl 0985.62058
[7] T Tony Cai, Hongzhe Li, Weidong Liu, and Jichun Xie. Covariate-adjusted precision matrix estimation with an application in genetical genomics. Biometrika, 100(1):139-156, 2013. · Zbl 1284.62648
[8] Eric C Chi and Kenneth Lange. Splitting methods for convex clustering. Journal of Computational and Graphical Statistics, 24(4):994-1013, 2015.
[9] Nicholas A Christakis and James H Fowler. The spread of obesity in a large social network over 32 years. New England Journal of Medicine, 357(4):370-379, 2007.
[10] Michael B Cohen, Rasmus Kyng, Gary L Miller, Jakub W Pachocki, Richard Peng, Anup B Rao, and Shen Chen Xu. Solving SDD linear systems in nearly m log^{1/2} n time. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pages 343-352. ACM, 2014. · Zbl 1315.65026
[11] Patrick Danaher, Pei Wang, and Daniela M Witten. The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(2):373-397, 2014. · Zbl 07555455
[12] Alexandre d’Aspremont, Onureena Banerjee, and Laurent El Ghaoui. First-order methods for sparse covariance selection. SIAM Journal on Matrix Analysis and Applications, 30(1):56-66, 2008. · Zbl 1156.90423
[13] Thomas Edwards. The discrete Laplacian of a rectangular grid, 2013.
[14] Miroslav Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23(2):298-305, 1973. · Zbl 0265.05119
[15] Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432-441, 2008. doi: 10.1093/biostatistics/kxm045. URL http://biostatistics.oxfordjournals.org/content/9/3/432.abstract. · Zbl 1143.62076
[16] Kayo Fujimoto and Thomas W Valente. Social network influences on adolescent substance use: disentangling structural equivalence from cohesion. Social Science & Medicine, 74(12):1952-1960, 2012.
[17] Gene H Golub, Michael Heath, and Grace Wahba. Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21(2):215-223, 1979. · Zbl 0461.62059
[18] Jian Guo, Elizaveta Levina, George Michailidis, and Ji Zhu. Joint estimation of multiple graphical models. Biometrika, page asq060, 2011. · Zbl 1214.62058
[19] David Hallac, Jure Leskovec, and Stephen Boyd. Network lasso: Clustering and optimization in large graphs. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 387-396. ACM, 2015.
[20] Dana L Haynie. Delinquent peers revisited: Does network structure matter? American Journal of Sociology, 106(4):1013-1057, 2001.
[21] Toby Dylan Hocking, Armand Joulin, Francis Bach, and Jean-Philippe Vert. Clusterpath: an algorithm for clustering using convex fusion penalties. In Proceedings of the 28th International Conference on Machine Learning (ICML), 2011.
[22] Cho-Jui Hsieh, Mátyás A Sustik, Inderjit S Dhillon, Pradeep K Ravikumar, and Russell Poldrack. BIG & QUIC: Sparse inverse covariance estimation for a million variables. In Advances in Neural Information Processing Systems, pages 3165-3173, 2013a.
[23] Cho-Jui Hsieh, Mátyás Sustik, Inderjit S. Dhillon, Pradeep Ravikumar, and Russell A. Poldrack. BIG & QUIC: Sparse inverse covariance estimation for a million variables. In Neural Information Processing Systems (NIPS), December 2013b.
[24] Cho-Jui Hsieh, Mátyás A Sustik, Inderjit S Dhillon, and Pradeep Ravikumar. QUIC: quadratic approximation for sparse inverse covariance estimation. The Journal of Machine Learning Research, 15(1):2911-2947, 2014. · Zbl 1319.65048
[25] Pengsheng Ji and Jiashun Jin. Coauthorship and citation networks for statisticians. The Annals of Applied Statistics, 10(4):1779-1812, 2016. · Zbl 1454.62541
[26] Alexander Jung, Nguyen Tran, and Alexandru Mara. When is network lasso accurate? Frontiers in Applied Mathematics and Statistics, 3:28, 2018.
[27] Ioannis Koutis, Gary L Miller, and Richard Peng. Approaching optimality for solving SDD linear systems. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on, pages 235-244. IEEE, 2010.
[28] Steffen L Lauritzen. Graphical Models, volume 17. Clarendon Press, 1996. · Zbl 0907.62001
[29] Lung-fei Lee. Identification and estimation of econometric models with group interactions, contextual factors and fixed effects. Journal of Econometrics, 140(2):333-374, 2007. · Zbl 1247.91140
[30] Wonyul Lee and Yufeng Liu. Simultaneous multiple response regression and inverse covariance matrix estimation via penalized Gaussian maximum likelihood. Journal of Multivariate Analysis, 111:241-255, 2012. · Zbl 1259.62043
[31] Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. Mining of Massive Datasets. Cambridge University Press, 2014.
[32] Caiyan Li and Hongzhe Li. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics, 24(9):1175-1182, 2008.
[33] Caiyan Li and Hongzhe Li. Variable selection and regression analysis for graph-structured covariates with an application to genomics. The Annals of Applied Statistics, 4(3):1498, 2010. · Zbl 1202.62157
[34] Ker-Chau Li. Asymptotic optimality of C_L and generalized cross-validation in ridge regression with application to spline smoothing. The Annals of Statistics, pages 1101-1112, 1986. · Zbl 0629.62043
[35] Tianxi Li, Sharmodeep Bhattacharyya, Purnamrita Sarkar, Peter J Bickel, and Elizaveta Levina. Hierarchical community detection by recursive bi-partitioning. arXiv preprint arXiv:1810.01509, 2018.
[36] Tianxi Li, Elizaveta Levina, and Ji Zhu. Prediction models for network-linked data. The Annals of Applied Statistics, 13(1):132-164, 2019. · Zbl 1417.62354
[37] Jiahe Lin, Sumanta Basu, Moulinath Banerjee, and George Michailidis. Penalized maximum likelihood estimation of multi-layered Gaussian graphical models. J. Mach. Learn. Res., 17(1):5097-5147, January 2016. ISSN 1532-4435. · Zbl 1392.62155
[38] Fredrik Lindsten, Henrik Ohlsson, and Lennart Ljung. Just relax and come clustering! A convexification of k-means clustering. Linköping University Electronic Press, 2011.
[39] Jianyu Liu, Guan Yu, and Yufeng Liu. Graph-based sparse linear discriminant analysis for high-dimensional classification. Journal of Multivariate Analysis, 171:250-269, 2019. · Zbl 1417.62173
[40] Charles F Manski. Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3):531-542, 1993. · Zbl 0800.90377
[41] Kanti V Mardia. Some properties of classical multi-dimensional scaling. Communications in Statistics - Theory and Methods, 7(13):1233-1241, 1978. · Zbl 0403.62033
[42] Nicolai Meinshausen and Peter Bühlmann. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, pages 1436-1462, 2006. · Zbl 1113.62082
[43] Lynn Michell and Patrick West. Peer pressure to smoke: the meaning depends on the method. Health Education Research, 11(1):39-49, 1996.
[44] Richard A Olshen and Bala Rajaratnam. Successive normalization of rectangular arrays. The Annals of Statistics, 38(3):1638, 2010. · Zbl 1189.62109
[45] Wei Pan, Benhuai Xie, and Xiaotong Shen. Incorporating predictor network in penalized regression with application to microarray data. Biometrics, 66(2):474-484, 2010. · Zbl 1192.62235
[46] Michael Pearson and Patrick West. Drifting smoke rings. Connections, 25(2):59-76, 2003.
[47] Bogdan Raducanu and Fadi Dornaika. A supervised non-linear dimensionality reduction approach for manifold learning. Pattern Recognition, 45(6):2432-2444, 2012. · Zbl 1234.68345
[48] Pradeep Ravikumar, Martin J Wainwright, Garvesh Raskutti, and Bin Yu. High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence. Electronic Journal of Statistics, 5:935-980, 2011. · Zbl 1274.62190
[49] Zhao Ren, Tingni Sun, Cun-Hui Zhang, Harrison H Zhou, et al. Asymptotic normality and optimalities in estimation of large Gaussian graphical models. The Annals of Statistics, 43(3):991-1026, 2015. · Zbl 1328.62342
[50] Adam J Rothman, Peter J Bickel, Elizaveta Levina, and Ji Zhu. Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2:494-515, 2008. · Zbl 1320.62135
[51] Adam J Rothman, Elizaveta Levina, and Ji Zhu. Sparse multivariate regression with covariance estimation. Journal of Computational and Graphical Statistics, 19(4):947-962, 2010.
[52] Veeranjaneyulu Sadhanala, Yu-Xiang Wang, and Ryan J Tibshirani. Graph sparsification approaches for Laplacian smoothing. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, pages 1250-1259, 2016.
[53] Xiaotong Shen, Hsin-Cheng Huang, and Wei Pan. Simultaneous supervised clustering and feature selection over a graph. Biometrika, 99(4):899-914, 2012. · Zbl 1452.62467
[54] Ali Shojaie and George Michailidis. Penalized principal component regression on graphs for analysis of subnetworks. In Advances in Neural Information Processing Systems, pages 2155-2163, 2010.
[55] Martin Slawski, Wolfgang zu Castell, Gerhard Tutz, et al. Feature selection guided by structural information. The Annals of Applied Statistics, 4(2):1056-1080, 2010. · Zbl 1194.62092
[56] Alexander J Smola and Risi Kondor. Kernels and regularization on graphs. In Learning Theory and Kernel Machines, pages 144-158. Springer, 2003.
[57] Daniel A Spielman. Algorithms, graph theory, and linear equations in Laplacian matrices. In Proceedings of the International Congress of Mathematicians, volume 4, pages 2698-2722, 2010. · Zbl 1241.65033
[58] Hokeun Sun, Wei Lin, Rui Feng, and Hongzhe Li. Network-regularized high-dimensional Cox regression for analysis of genomic data. Statistica Sinica, 24(3):1433, 2014. · Zbl 06431838
[60] Minh Tang, Daniel L Sussman, and Carey E Priebe. Universally consistent vertex classification for latent positions graphs. The Annals of Statistics, 41(3):1406-1430, 2013. · Zbl 1273.62147
[61] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267-288, 1996. · Zbl 0850.62538
[62] Robert Tibshirani, Guenther Walther, and Trevor Hastie. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2):411-423, 2001. · Zbl 0979.62046
[63] Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(1):91-108, 2005. · Zbl 1060.62049
[64] Nguyen Tran, Henrik Ambos, and Alexander Jung. A network compatibility condition for compressed sensing over complex networks. In 2018 IEEE Statistical Signal Processing Workshop (SSP), pages 50-54. IEEE, 2018.
[65] Elif Vural and Christine Guillemot. Out-of-sample generalizations for supervised manifold learning for classification. IEEE Transactions on Image Processing, 25(3):1410-1424, 2016. · Zbl 1408.94664
[66] Martin J Wainwright. Sharp thresholds for high-dimensional and noisy recovery of sparsity using ℓ1-constrained quadratic programming. IEEE Transactions on Information Theory, 2009.
[67] Yu-Xiang Wang, James Sharpnack, Alex Smola, and Ryan J Tibshirani. Trend filtering on graphs. arXiv preprint arXiv:1410.7690, 2014.
[68] Daniela M Witten, Jerome H Friedman, and Noah Simon. New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, 20(4):892-900, 2011.
[69] Daniela M Witten, Ali Shojaie, and Fan Zhang. The cluster elastic net for high-dimensional regression with unknown variable grouping. Technometrics, 56(1):112-122, 2014.
[70] Jaewon Yang, Julian McAuley, and Jure Leskovec. Community detection in networks with node attributes. In 2013 IEEE 13th International Conference on Data Mining, pages 1151-1156. IEEE, 2013.
[71] Wankou Yang, Changyin Sun, and Lei Zhang. A multi-manifold discriminant analysis method for image feature extraction. Pattern Recognition, 44(8):1649-1657, 2011. · Zbl 1218.68158
[72] Jianxin Yin and Hongzhe Li. A sparse conditional Gaussian graphical model for analysis of genetical genomics data. The Annals of Applied Statistics, 5(4):2630, 2011. · Zbl 1234.62151
[73] Jianxin Yin and Hongzhe Li. Adjusting for high-dimensional covariates in sparse precision matrix estimation by ℓ1-penalization. Journal of Multivariate Analysis, 116:365-381, 2013. · Zbl 1277.62146
[74] Ming Yuan and Yi Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1):49-67, 2006. · Zbl 1141.62030
[75] Ming Yuan and Yi Lin. Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1):19-35, 2007. · Zbl 1142.62408
[76] Sen Zhao and Ali Shojaie. A significance test for graph-constrained estimation. Biometrics, 72(2):484-493, 2016. · Zbl 1419.62493
[77] Tuo Zhao, Han Liu, Kathryn Roeder, John Lafferty, and Larry Wasserman. The huge package for high-dimensional undirected graph estimation in R. Journal of Machine Learning Research, 13(Apr):1059-1062, 2012. · Zbl 1283.68311
[78] Dengyong Zhou, Jiayuan Huang, and Bernhard Schölkopf. Learning from labeled and unlabeled data on a directed graph. In Proceedings of the 22nd International Conference on Machine Learning, pages 1036-1043. ACM, 2005.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible, without claiming completeness or a perfect matching.