×

Modeling network populations via graph distances. (English) Zbl 1524.62123

Summary: This article introduces a new class of models for multiple networks. The core idea is to parameterize a distribution on labeled graphs in terms of a Fréchet mean graph (which depends on a user-specified choice of metric or graph distance) and a parameter that controls the concentration of this distribution about its mean. Entropy is the natural parameter for such control, varying from a point mass concentrated on the Fréchet mean itself to a uniform distribution over all graphs on a given vertex set. We provide a hierarchical Bayesian approach for exploiting this construction, along with straightforward strategies for sampling from the resultant posterior distribution. We conclude by demonstrating the efficacy of our approach via simulation studies and two multiple-network data analysis examples: one drawn from systems biology and the other from neuroscience.

MSC:

62F15 Bayesian inference
05C12 Distance in graphs
05C22 Signed and weighted graphs
05C80 Random graphs (graph-theoretic aspects)
05C90 Applications of graph theory
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

shallot
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Andrieu, C.; Roberts, G. O., “The Pseudo-Marginal Approach for Efficient Monte Carlo Computations, The Annals of Statistics, 37, 697-725 (2009) · Zbl 1185.60083 · doi:10.1214/07-AOS574
[2] Arroyo, J., Athreya, A., Cape, J., Chen, G., Priebe, C. E., and Vogelstein, J. T. (2019), “Inference for Multiple Heterogeneous Networks With Common Invariant Subspace,” arXiv no. 1906.10026. · Zbl 07415085
[3] Balachandrian, P.; Kolaczyk, E. D.; Viles, W. D., “On the Propagation of Low-Rate Measurement Error to Subgraph Counts in Large Networks,”, Journal of Machine Learning and Research, 18, 1-33 (2017) · Zbl 1440.62226
[4] Bartlett, T. E.; Olhede, S. C.; Zaikin, A., “A DNA Methylation Network Interaction Measure and Detection of Network Oncomarkers, PLoS One, 9, e84573 (2014) · doi:10.1371/journal.pone.0084573
[5] Bhattacharyya, S., and Chatterjee, S. (2018), “Spectral Clustering for Multiple Sparse Networks: I,” arXiv no. 1805.10594.
[6] Biswal, B. B.; Menness, M.; Zuo, X.-N., “Toward Discovering Science of Human Brain Function, Proceedings of the National Academy of Sciences of the United States of America, 107, 4734-4739 (2010) · doi:10.1073/pnas.0911855107
[7] Brooks, S. P.; Giudici, P.; Roberts, G. O., “Efficient Construction of Reversible Jump Markov Chain Monte Carlo Proposal Distributions, Journal of the Royal Statistical Society, Series B, 65, 3-39 (2003) · Zbl 1063.62120 · doi:10.1111/1467-9868.03711
[8] Chang, J., Kolaczyk, E. D., and Yao, Q. (2018), “Estimation of Edge Density in Noisy Networks,” arXiv no. 1803.02488v1.
[9] Chung, F. K. R., Spectral Graph Theory (1997), Providence, RI: American Mathematical Society, Providence, RI · Zbl 0867.05046
[10] Craddock, R. C.; James, G.; Holtzheimer, P. E.; Hu, X. P.; Mayberg, H. S., “A Whole Brain FMRI Atlas Generated via Spatially Constrained Spectral Clustering, Human Brain Mapping, 33, 1914-1928 (2012) · doi:10.1002/hbm.21333
[11] Dahl, D. B.; Day, R.; Tsai, J. W., “Random Partition Distribution Indexed by Pairwise Information, Journal of the American Statistical Association, 112, 721-732 (2017) · doi:10.1080/01621459.2016.1165103
[12] Donnat, C.; Holmes, S., “Tracking Network Dynamics: A Survey Using Graph Distances, The Annals of Applied Statistics, 12, 971-1012 (2018) · Zbl 1405.62171 · doi:10.1214/18-AOAS1176
[13] Dryden, I. L.; Mardia, K. V., Statistical Shape Analysis (1998), Chichester: Wiley, Chichester · Zbl 0901.62072
[14] Durante, D.; Dunson, D. B.; Vogelstein, J. T., “Nonparametric Bayes Modeling of Populations of Networks, Journal of the American Statistical Association, 112, 1516-1530 (2017) · doi:10.1080/01621459.2016.1219260
[15] Fang, K. T.; Kotz, S.; Ng, K. W., Symmetric Multivariate and Related Distributions (1990), London: Chapman and Hall Ltd, London · Zbl 0699.62048
[16] Feragen, A.; Hauberg, S.; Nielsen, M.; Lauze, F., “Means in Space of Tree-Like Shapes,” in 2011 International Conference on Computer Vision (2011)
[17] Fréchet, M., “Les éléments aléatories de nature quelconque dans un espace distancié, Annales de l’Institut Henri Poincaré, 10, 103-111 (1948)
[18] Gelman, A.; Meng, X.-L.; Stern, H., “Posterior Predictive Assessment of Model Fitness via Realized Discrepancies,”, Statistica Sinica, 6, 733-807 (1996) · Zbl 0859.62028
[19] Ghosal, S.; van der Vaart, A., Fundamentals of Nonparametric Bayesian Inference, Cambridge Series in Statistical and Probabilistic Mathematics, 44 (2017), Cambridge: Cambridge University Press, Cambridge · Zbl 1376.62004
[20] Ginestet, C. E.; Li, J.; Balachandrian, P.; Rosenberg, S.; Kolaczyk, E. D., “Hypothesis Testing for Network Data in Functional Neuroimaging, The Annals of Applied Statistics, 11, 725-750 (2017) · Zbl 1391.62217 · doi:10.1214/16-AOAS1015
[21] Gollini, I.; Murphy, T. B., “Joint Modeling of Multiple Network Views, Journal of Computational and Graphical Statistics, 25, 246-265 (2016) · doi:10.1080/10618600.2014.978006
[22] Hajek, B.; Wu, Y.; Xu, J., “Information Limits for Recovering a Hidden Community, IEEE Transactions on Information Theory, 63, 4729-4745 (2017) · Zbl 1372.94364 · doi:10.1109/TIT.2017.2653804
[23] Hammond, D. K., Gur, Y., and Johnson, C. R. (2013), “Graph Diffusion Distance: A Difference Measure for Weighted Graphs Based on the Graph Laplacian Exponential Kernel,” in 2013 Global Conference on Signal and Information Processing (GlobalSIP), IEEE.
[24] Hoff, P. D.; Raftery, A. E.; Handcock, M. S., “Latent Space Approaches to Social Network Analysis, Journal of the American Statistical Association, 97, 1090-1098 (2002) · Zbl 1041.62098 · doi:10.1198/016214502388618906
[25] Johnson, V. E., A Bayesian \(####\) Test for Goodness-of-Fit,, The Annals of Statistics, 32, 29-46 (2004)
[26] Kolaczyk, E. D.; Lin, L.; Rosenberg, S.; Walters, J., “Averages of Unlabeled Networks: Geometric Characterization and Asymptotic Behavior, The Annals of Statistics, 48, 514-538 (2020) · Zbl 1439.62068 · doi:10.1214/19-AOS1820
[27] Le, C. M.; Levin, K.; Levina, E., “Estimating a Network From Multiple Noisy Realizations, Electronic Journal of Statistics, 12, 4697-4740 (2018) · Zbl 1409.62115 · doi:10.1214/18-EJS1521
[28] Lynall, M.-E.; Bassett, D. S.; Kerwin, R., “Functional Connectivity and Brain Networks in Schizophrenia, Journal of Neuroscience, 30, 9477-9487 (2010) · doi:10.1523/JNEUROSCI.0333-10.2010
[29] Mardia, K. V.; Dryden, I. L., Technical Report, Department of Statistics, “The Complex Watson Distribution and Shape Analysis,” (1998), University of Leeds · Zbl 0940.62048
[30] Mézard, M.; Montanari, A., Information, Physics, and Computation (2009), New York: Oxford University Press, New York · Zbl 1163.94001
[31] Mitra, R.; Müller, P.; Liang, S.; Yue, L.; Ji, Y., “A Bayesian Graphical Model for ChiP-Seq Data on Histone Modifications, Journal of the American Statistical Association, 108, 69-80 (2013) · Zbl 1379.62079 · doi:10.1080/01621459.2012.746058
[32] Møller, J.; Pettitt, A. N.; Reeves, R.; Berthelsen, K. K., “An Efficient Markov Chain Monte Carlo Method for Distributions With Intractable Normalising Constants, Biometrika, 93, 451-458 (2006) · Zbl 1158.62020 · doi:10.1093/biomet/93.2.451
[33] Mukherjee, S.; Speed, T. P., “Network Inference Using Informative Priors, Proceedings of the National Academy of Sciences of the United States of America, 105, 14313-14318 (2008) · doi:10.1073/pnas.0802272105
[34] Nelson, B. G.; Bassett, D. S.; Camchong, J.; Bullmore, E. T.; Lim, K. O., “Comparison of Large-Scale Human Brain Functional and Anatomical Networks in Schizophrenia, NeuroImage, 15, 439-448 (2017) · doi:10.1016/j.nicl.2017.05.007
[35] Newman, M. E. J. (2018), “Network Reconstruction and Error Estimation With Noisy Network Data,” arXiv no. 1703.07376v2.
[36] Ni, Y.; Müller, P.; Zhu, Y.; Ji, Y., “Heterogeneous Reciprocal Graphical Models, Biometrics, 74, 606-615 (2018) · Zbl 1414.62466 · doi:10.1111/biom.12791
[37] Nielsen, A., and Witten, D. (2018), “The Multiple Random Dot Product Graph Model,” arXiv no. 1811.12172v1.
[38] Peixoto, T. P. (2018a), “Reconstructing Networks With Unknown and Heterogeneous Errors,” arXiv no. 1806.07956.
[39] Peixoto, T. P., “Network Structure From Rich But Noisy Data, Nature, Physics, 14, 542-545 (2018)
[40] Pennec, X., “Intrinsic Statistics on Riemannian Manifolds: Basic Tools for Geometric Measurements, Journal of Mathematical Imaging and Vision, 25, 127-154 (2006) · Zbl 1478.94072 · doi:10.1007/s10851-006-6228-4
[41] Srivastava, A.; Klassen, E. C., Functional and Shape Data Analysis (2016), New York: Springer-Verlag, New York · Zbl 1376.62003
[42] Tan, L.; Jasra, A.; De Iorio, M.; Ebbels, T. M. D., “Bayesian Inference for Multiple Gaussian Graphical Models With Application to Metabolic Association Networks, The Annals of Applied Statistics, 11, 2222-2251 (2017) · Zbl 1383.62294 · doi:10.1214/17-AOAS1076
[43] Tang, M.; Athreya, A.; Sussman, D. L.; Lyzinski, V.; Park, Y.; Priebe, C., “A Semiparametric Two-Sample Hypothesis Testing Problem for Random Graphs, Journal of Computational and Graphical Statistics, 26, 344-354 (2017) · doi:10.1080/10618600.2016.1193505
[44] Wade, S.; Ghahramani, Z., “Bayesian Cluster Analysis: Point Estimation and Credible Balls, Bayesian Analysis, 13, 559-626 (2018) · Zbl 1407.62241 · doi:10.1214/17-BA1073
[45] Wang, S., Vogelstein, J. T., and Priebe, C. E. (2017), “Joint Embedding of Graphs,” arXiv no. 1703.03862.
[46] Wu, C., and Robert, C. (2017), “Average of Recentered Parallel MCMC for Big Data,” arXiv no. 1706.04780v2.
[47] Zelinka, B., “On a Certain Distance Between Isomorphism Classes of Graphs, Časopis Pro Pěstování Matematiky, 100, 371-373 (1975) · Zbl 0312.05121 · doi:10.21136/CPM.1975.117890
[48] Zuo, X.-N.; Anderson, J. S.; Bellec, P.; Birn, R. M.; Biswal, B. B.; Blautzik, J.; Breitner, J. C.; Buckner, R. L.; Calhoun, V. D.; Castellanos, F. X.; Chen, A.; Chen, B.; Chen, J.; Chen, X.; Colcombe, S. J.; Courtney, W.; Craddock, R. C.; Di Martino, A.; Dong, H.-M.; Fu, X.; Gong, Q.; Gorgolewski, K. J.; Han, Y.; He, Y.; He, Y.; Ho, E.; Holmes, A.; Hou, X.-H.; Huckins, J.; Jiang, T.; Jiang, Y.; Kelley, W.; Kelly, C.; King, M.; LaConte, S. M.; Lainhart, J. E.; Lei, X.; Li, H.-J.; Li, K.; Li, K.; Lin, Q.; Liu, D.; Liu, J.; Liu, X.; Liu, Y.; Lu, G.; Lu, J.; Luna, B.; Luo, J.; Lurie, D.; Mao, Y.; Margulies, D. S.; Mayer, A. R.; Meindl, T.; Meyerand, M. E.; Nan, W.; Nielsen, J. A.; O’Connor, D.; Paulsen, D.; Prabhakaran, V.; Qi, Z.; Qiu, J.; Shao, C.; Shehzad, Z.; Tang, W.; Villringer, A.; Wang, H.; Wang, K.; Wei, D.; Wei, G.-X.; Weng, X.-C.; Wu, X.; Xu, T.; Yang, N.; Yang, Z.; Zang, Y.-F.; Zhang, L.; Zhang, Q.; Zhang, Z.; Zhang, Z.; Zhao, K.; Zhen, Z.; Zhou, Y.; Zhu, X.-T.; Milham, M. P., “An Open Science Resource for Establishing Reliability and Reproducibility in Functional Connectomics, Scientific Data, 1, 140049 (2014) · doi:10.1038/sdata.2014.49
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.