×

Estimating a network from multiple noisy realizations. (English) Zbl 1409.62115

Summary: Complex interactions between entities are often represented as edges in a network. In practice, the network is often constructed from noisy measurements and inevitably contains some errors. In this paper we consider the problem of estimating a network from multiple noisy observations where edges of the original network are recorded with both false positives and false negatives. This problem is motivated by neuroimaging applications where brain networks of a group of patients with a particular brain condition could be viewed as noisy versions of an unobserved true network corresponding to the disease. The key to optimally leveraging these multiple observations is to take advantage of network structure, and here we focus on the case where the true network contains communities. Communities are common in real networks in general and in particular are believed to be presented in brain networks. Under a community structure assumption on the truth, we derive an efficient method to estimate the noise levels and the original network, with theoretical guarantees on the convergence of our estimates. We show on synthetic networks that the performance of our method is close to an oracle method using the true parameter values, and apply our method to fMRI brain data, demonstrating that it constructs stable and plausible estimates of the population network.

MSC:

62H12 Estimation in multivariate analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62F12 Asymptotic properties of parametric estimators
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] E. Abbe. Community detection and stochastic block models: Recent developments., Journal of Machine Learning Research, 18(177):1-86, 2018. · Zbl 1403.62110
[2] A. A. Amini, A. Chen, P. J. Bickel, and E. Levina. Fitting community models to large sparse networks., Annals of Statistics, 41(4) :2097-2122, 2013. · Zbl 1277.62166
[3] S. Balakrishnan, M. J. Wainwright, and B. Yu. Statistical guarantees for the EM algorithm: From population to sample-based analysis., Annals of Statistics, 45(1):77-120, 2017. · Zbl 1367.62052
[4] D. S. Bassett, B. G. Nelson, B. A. Mueller, J. Camchong, and K. O. Lim. Altered resting state complexity in schizophrenia., Neuroimage, 59(3) :2196-2207, 2012.
[5] P. J. Bickel and A. Chen. A nonparametric view of network models and Newman-Girvan and other modularities., Proc. Natl. Acad. Sci. USA, 106 :21068-21073, 2009. · Zbl 1359.62411
[6] P. J. Bickel and K. A. Doksum., Mathematical statistics: Basic ideas and selected topics-2nd edition (updated printing), volume 1. Pearson Prentice Hall, 2007. · Zbl 0403.62001
[7] S. Boucheron, G. Lugosi, and P. Massart., Concentration inequalities: a nonasymptotic theory of independence. Oxford University Press, 2013. · Zbl 1279.60005
[8] R. L. Buckner, J. Sepulcre, T. Talukdar, F. Krienen, H. Liu, T. Hedden, J. R. Andrews-Hanna, R. A. Sperling, and K. A. Johnson. Cortical hubs revealed by intrinsic functional connectivity: Mapping, assessment of stability, and relation to alzheimer’s disease., J Neurosci, 29(6) :1860-1873, 2009.
[9] K. Chen and J. Lei. Network cross-validation for determining the number of communities in network data., arXiv :1411.1715, 2014. · Zbl 1398.62159
[10] P. Chin, A. Rao, and V. Vu. Stochastic block model and community detection in the sparse graphs: A spectral algorithm with optimal rate of recovery., arXiv :1501.05021, 2015.
[11] D. Choi and P. Wolfe. Co-clustering separately exchangeable network data., Annals of Statistics, 42(1):29-63, 2014. · Zbl 1294.62059
[12] A. P. Dawid and A. M. Skene. Maximum likelihood estimation of observer error-rates using the em algorithm., Applied statistics, pages 20-28, 1979.
[13] S. Fortunato. Community detection in graphs., Physics Reports, 486(3-5):75 – 174, 2010.
[14] C. Gao, Z. Ma, A. Y. Zhang, and H. H. Zhou. Achieving optimal misclassification proportion in stochastic block model., arXiv :1505.03772, 2015. · Zbl 1440.62244
[15] A. Ghosh, S. Kale, and R. P. McAfee. Who moderates the moderators?: crowdsourcing abuse detection in user-generated content. In, EC, 2011.
[16] A. Goldenberg, A. X. Zheng, S. E. Fienberg, and E. M. Airoldi. A survey of statistical network models., Foundations and Trends in Machine Learning, 2:129-233, 2010. · Zbl 1184.68030
[17] P. W. Holland, K. B. Laskey, and S. Leinhardt. Stochastic blockmodels: first steps., Social Networks, 5(2):109-137, 1983.
[18] S. H. Hosseini and S. R. Kesler. Comparing connectivity pattern and small-world organization between structural correlation and resting-state networks in healthy adults., Neuroimage, 78:402-414, 2013.
[19] A. Joseph and B. Yu. Impact of regularization on spectral clustering., Ann. Statist., 44(4) :1765-1791, 2016. · Zbl 1357.62229
[20] B. Karrer and M. E. J. Newman. Stochastic blockmodels and community structure in networks., Physical Review E, 83 :016107, 2011.
[21] F. Klimm, D. S. Bassett, J. M. Carlson, and P. J. Mucha. Resolving structural variability in network models and the brain., PLoS Comput Biol, 10(3): e1003491, 2014.
[22] V. Koltchinskii., Oracle inequalities in empirical risk minimization and sparse recovery problems. Springer Verlag, 2011. · Zbl 1223.91002
[23] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang. Spectral redemption in clustering sparse networks., Proceedings of the National Academy of Sciences, 110(52) :20935-20940, 2013. · Zbl 1359.62252
[24] C. M. Le and E. Levina. Estimating the number of communities in networks by spectral methods., arXiv :1507.00827, 2015. · Zbl 1493.62313
[25] C. M. Le, E. Levina, and R. Vershynin. Sparse random graphs: regularization and concentration of the laplacian., arXiv :1502.03049, 2015. · Zbl 1373.05179
[26] C. M. Le, E. Levina, and R. Vershynin. Concentration and regularization of random graphs., Random Struct. Alg., doi: 10.1002/rsa.20713, 2017. · Zbl 1373.05179
[27] M. Ledoux and M. Talagrand., Probability in Banach spaces: Isoperimetry and processes. Springer-Verlag, Berlin, 1991. · Zbl 0748.60004
[28] K. Levin, A. Athreya, M. Tang, V. Lyzinski, and C. E. Priebe. A central limit theorem for an omnibus embedding of random dot product graphs., arXiv :1705.09355, 2017.
[29] Q. Liu, J. Peng, and A. T. Ihler. Variational inference for crowdsourcing., Advances in neural information processing systems, pages 692-700, 2012.
[30] V. Lyzinski, D. L. Sussman, M. Tang, A. Athreya, and C. E. Priebe. Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding., Electronic Journal of Statistics, 8(2) :2905-2922, 2014. · Zbl 1308.62131
[31] M. Narayan and G. I. Allen. Mixed effects models for resampled network statistics improves statistical power to find differences in multi-subject functional connectivity., Frontiers in Neuroscience, 10(108), 2016.
[32] M. Narayan, G. I. Allen, and S. Tomson. Two sample inference for populations of graphical models with applications to functional connectivity., arXiv :1502.03853, 2015.
[33] M. E. J. Newman. Fast algorithm for detecting community structure in networks., Phys. Rev. E, 69(6) :066133, 2004.
[34] M. E. J. Newman. Modularity and community structure in networks., Proc. Natl. Acad. Sci. USA, 103(23) :8577-8582, 2006.
[35] S. C. Olhede and P. J. Wolfe. Network histograms and universality of blockmodel approximation., Proceedings of the National Academy of Sciences of the USA, 111 :14722-14727, 2014.
[36] J. D. Power, A. L. Cohen, S. M. Nelson, G. S. Wig, K. A. Barnes, J. A. Church, A. C. Vogel, T. O. Laumann, F. M. Miezin, B. L. Schlaggar, and S. E. Petersen. Functional network organization of the human brain., Neuron, 72(4):665-78, 2011.
[37] V. C. Raykar, S. Yu, L. H. Zhao, G. H. Valadez, C. Florin, L. Bogoni, and L. Moy. Learning from crowds., Journal of Machine Learning Research, 11 :1297-1322, 2010.
[38] M. Rubinov and O. Sporns. Complex network measures of brain connectivity: uses and interpretations., NeuroImage, 52(3) :1059-1069, 2010.
[39] R. Tang, M. Ketcha, J. T. Vogelstein, C. E. Priebe, and D. L. Sussman. Law of large graphs., arXiv :1609.01672, 2016.
[40] S. Taylor, A. Chen, I. Tso, I. Liberzon, and R. Welsh. Social appraisal in chronic psychosis: Role of medial frontal and occipital networks., J. Psychiatr. Res., 45(4):526-538, 2011.
[41] S. Taylor, E. Demeter, K. Phan, I. Tso, and R. Welsh. Abnormal gabaergic function and negative affect in schizophrenia., Neuropsychopharmacology, 39(4) :1000-1008, 2014.
[42] J. A. Tropp. User-friendly tail bounds for sums of random matrices., Foundations of Computational Mathematics, 12(4):389-434, 2012. · Zbl 1259.60008
[43] L. Wang, Z. Zhang, and D. Dunson. Common and individual structure of multiple networks., arXiv :1707.06360, 2017. · Zbl 1417.62332
[44] R. Wang and P. Bickel. Likelihood-based model selection for stochastic block models., arXiv :1502.02069, 2015.
[45] M. Xia, J. Wang, and Y. He. Brainnet viewer: A network visualization tool for human brain connectomics., PLoS ONE 8: e68910, 2013.
[46] Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan. Spectral methods meet em: A provably optimal algorithm for crowdsourcing., Advances in neural information processing systems, page 1260-1268, 2014. · Zbl 1367.68241
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.