×

zbMATH — the first resource for mathematics

Vertex nomination: the canonical sampling and the extended spectral nomination schemes. (English) Zbl 07178860
Summary: Suppose that one particular block in a stochastic block model is of interest, but block labels are only observed for a few of the vertices in the network. Utilizing a graph realized from the model and the observed block labels, the vertex nomination task is to order the vertices with unobserved block labels into a ranked nomination list with the goal of having an abundance of interesting vertices near the top of the list. There are vertex nomination schemes in the literature, including the optimally precise canonical nomination scheme \(\mathcal{L}^C\) and the consistent spectral partitioning nomination scheme \(\mathcal{L}^P\). While the canonical nomination scheme \(\mathcal{L}^C\) is provably optimally precise, it is computationally intractable, being impractical to implement even on modestly sized graphs. With this in mind, an approximation of the canonical scheme – denoted the canonical sampling nomination scheme\( \mathcal{L}^{C S} \) – is introduced; \( \mathcal{L}^{C S}\) relies on a scalable, Markov chain Monte Carlo-based approximation of \(\mathcal{L}^C\), and converges to \(\mathcal{L}^C\) as the amount of sampling goes to infinity. The spectral partitioning nomination scheme is also extended to the extended spectral partitioning nomination scheme, \( \mathcal{L}^{E P} \), which introduces a novel semisupervised clustering framework to improve upon the precision of \(\mathcal{L}^P\). Real-data and simulation experiments are employed to illustrate the precision of these vertex nomination schemes, as well as their empirical computational complexity.

MSC:
62-XX Statistics
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Agterberg, J.; Park, Y.; Larson, J.; White, C.; Priebe, C. E.; Lyzinski, V., Vertex nomination, consistent estimation, and adversarial modification (2019), arXiv preprint arXiv:1905.01776
[2] Airoldi, E. M.; Blei, D. M.; Fienberg, S. E., Mixed membership stochastic blockmodels, J. Mach. Learn. Res., 9, 1981-2014 (2008) · Zbl 1225.68143
[3] Aldous, D., Fill, J.A., 2002. Reversible Markov Chains and Random Walks on Graphs, Berkeley.
[4] Andersson, J. L.R.; Skare, S.; Ashburner, J., How to correct susceptibility distortions in spin-echo echo-planar images: application to diffusion tensor imaging, NeuroImage, 20, 870-888 (2003)
[5] Athreya, A.; Lyzinski, V.; Marchette, D. J.; Priebe, C. E.; Sussman, D. L.; Tang, M., A limit theorem for scaled eigenvectors of random dot product graphs, Sankhya (2015) · Zbl 1338.62061
[6] Bickel, P.; Choi, D.; Chang, X.; Zhang, H., Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels, Ann. Statist., 41:4, 1922-1943 (2013) · Zbl 1292.62042
[7] Chatterjee, S., Matrix estimation by universal singular value thresholding, Ann. Statist., 43:1, 177-214 (2014) · Zbl 1308.62038
[8] Coppersmith, G., Vertex nomination, Wiley Interdiscip. Rev. Comput. Stat., 6:2, 144-153 (2014)
[9] Coppersmith, G. A.; Priebe, C. E., Vertex nomination via content and context (2012), arXiv preprint arXiv:1201.4118
[10] Desikan, R. S.; Segonne, F.; Fischl, B.; Quinn, B. T.; Dickerson, B. C.; Blacker, D.; Buckner, R. L.; Dale, A. M.; Maquire, R. P.; Hyman, B. T.; Albert, M. S.; Killany, R. J., An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest, NeuroImage (2006)
[11] Diaconis, P.; Saloff-Coste, L., What do we know about the metropolis algorithm?, J. Comput. System Sci., 57, 20-36 (1998) · Zbl 0920.68054
[12] Erdos, P.; Renyi, A., Asymmetric graphs, Acta Math. Acad. Sci. Hung., 14, 295-315 (1963) · Zbl 0118.18901
[13] Feller, W., An Introduction to Probability Theory and Its Applications, Vol. 2 (2008), John Wiley & Sons · Zbl 0158.34902
[14] Fishkind, D. E.; Lyzinski, V.; Pao, H.; Chen, L.; Priebe, C. E., Vertex nomination schemes for membership prediction, Ann. Appl. Stat., 9:3, 1510-1532 (2015) · Zbl 1454.62180
[15] Fishkind, D. E.; Sussman, D. L.; Tang, M.; Vogelstein, J. T.; Priebe, C. E., Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown, SIAM J. Matrix Anal. Appl., 34, 23-39 (2013) · Zbl 1314.05186
[16] Fraley, C.; Raftery, A. E., Model-based clustering, discriminant analysis, and density estimation, J. Amer. Statist. Assoc., 97, 611-631 (2002) · Zbl 1073.62545
[17] Fraley, C.; Raftery, A. E., MCLUST version 3: an R package for normal mixture modeling and model-based clustering, DTIC Document (2006)
[18] Garyfallidis, E.; Brett, M.; Amirbekian, B.; Rokem, A.; Van Der Walt, S.; Descoteaux, M.; Nimmo-Smith, I., Dipy, a library for the analysis of diffusion MRI data, Front. Neuroinformatics, 8, 8 (2014)
[19] Garyfallidis, E.; Brett, M.; Correia, M. M.; Williams, G. B.; Nimmo-Smith, I., Quickbundles, a method for tractography simplification, Front. Neurosci., 6, 175 (2012)
[20] Gelman, A.; Carlin, J. B.; Stern, H. S.; Rubin, D. B., Bayesian Data Analysis, Vol. 2 (2014), Taylor & Francis
[21] Gilks, W. R.; Richardson, S.; Spiegelhalter, D., Markov Chain Monte Carlo in Practice (1995), CRC press · Zbl 0832.00018
[22] Glasser, M. F.; Coalson, T. S.; Robinson, E. C.; Hacker, C. D.; Harwell, J.; Yacoub, E.; Ugurbil, K.; Andersson, J.; Beckmann, C. F.; Jenkinson, M.; Smith, S. M.; Van Essen, D. C., A multi-modal parcellation of human cerebral cortex, Nature, 7615, 171-178 (2016)
[23] Hardy, G. H.; Littlewood, J. E.; Polya, G., Inequalities (1952), Cambridge University Press · Zbl 0047.05302
[24] Holland, P. W.; Laskey, K.; Leinhardt, S., Stochastic blockmodels: First steps, Social Networks, 5:2, 109-137 (1983)
[25] Jenkinson, M.; Beckmann, C. F.; Behrens, T. E.; Woolrich, M. W.; Smith, S. M., FSL, NeuroImage, 2, 782-790 (2012)
[26] Johndrow, J. E.; Smith, A., Fast mixing of metropolis-hastings with unimodal targets, Electron. Commun. Probab., 23, 1-9 (2018) · Zbl 1414.60054
[27] Karrer, B.; Newman, M. E.J., Stochastic blockmodels and community structure in networks, Phys. Rev. E, 83 (2011)
[28] Kaufman, L.; Rousseeuw, P. J., Finding Groups in Data: An Introduction to Cluster Analysis (2009), John Wiley & Sons
[29] Kessler, D.; Angstadt, M.; Welsh, R. C.; Sripada, C., Modality-spanning deficits in attention-deficit/hyperactivity disorder in functional networks, gray matter, and white matter, J. Neurosci., 34, 16555-16566 (2014)
[30] Kiar, G.; Bridgeford, E. W.; Roncal, W. G.; Consortium for Reliability and Reproducibility (CoRR), C.; Chandrashekhar, V.; Mhembere, D.; Ryman, S.; Zuo, X.; Margulies, D. S.; Craddock, R. C.; Priebe, C. E.; Jung, R.; Calhoun, V. D.; Caffo, B.; Burns, R.; Milham, M. P.; Vogelstein, J. T., A principled high-throughput estimation and mega-analysis pipeline for reproducible connectomics (2017), Preprint available at https://www.biorxiv.org/content/early/2017/09/14/188706
[31] Lancaster, J. L., The Talairach Daemon, a database server for Talairach atlas labels, NeuroImage (1997)
[32] Lyzinski, V.; Levin, K.; Fishkind, D. E.; Priebe, C. E., On the consistency of the likelihood maximization vertex nomination scheme: Bridging the gap between maximum likelihood estimation and graph matching, J. Mach. Learn. Res., 17, 1-34 (2016) · Zbl 1392.62189
[33] Lyzinski, V.; Levin, K.; Priebe, C. E., On consistent vertex nomination schemes, J. Mach. Learn. Res., 20, 69, 1-39 (2019) · Zbl 07064049
[34] Lyzinski, V.; Sussman, D. L.; Tang, M.; Athreya, A.; Priebe, C. E., Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding, Electron. J. Stat., 8, 2905-2922 (2014) · Zbl 1308.62131
[35] MacQueen, J., Some methods for classification and analysis of multivariate observations, (Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1 (1967)), 281-297 · Zbl 0214.46201
[36] Makris, N.; Goldstein, J. M.; Kennedy, D.; Hodge, S. M.; Caviness, V. S.; Faraone, S. V.; Tsuang, M. T.; Seidman, L. J., Decreased volume of left and total anterior insular lobule in schizophrenia, Schizophr. Res., 83, 2, 155-171 (2006)
[37] Marchette, D.; Priebe, C. E.; Coppersmith, G., Vertex nomination via attributed random dot product graphs, (Proceedings of the 57th ISI World Statistics Congress, vol. 6 (2011))
[38] Mazziotta, J.; Toga, A.; Evans, A.; Fox, P.; Lancaster, J.; Zilles, K.; Woods, R.; Paus, T.; Simpson, G.; Pike, B.; Holmes, C.; Collins, L.; Thompson, P.; MacDonald, D.; Iacoboni, M.; Schormann, T.; Amunts, K.; Palomero-Gallagher, N.; Geyer, S.; Parsons, L.; Narr, K.; Kabani, N.; Le Goualher, G.; Feidler, J.; Smith, K.; Boomsma, D.; Hulshoff Pol, H.; Cannon, T.; Kawashima, R.; Mazoyer, B., A four-dimensional probabilistic atlas of the human brain, J. Am. Med. Inform. Assoc., 8, 5, 401-430 (2001)
[39] McLachlan, G.; Peel, D., Finite Mixture Models (2004), John Wiley & Sons
[40] Mhembere, D., Roncal, W.G., Sussman, D.L., Priebe, C.E., Jung, R., Ryman, S., Vogelstein, R.J., Vogelstein, J.T., Burns, R., 2013. Computing scalable multivariate glocal invariants of large (brain-) graphs. In: IEEE Global Conference on Signal and Information Processing, GlobalSIP, pp. 297-300.
[41] Newman, M. E.J., Modularity and community structure in networks, Proc. Natl. Acad. Sci., 103:23, 8577-8582 (2006)
[42] Oishi, K.; Faria, A. V.; van Zijl, P. C.M.; Mori, S., MRI Atlas of Human White Matter (2010), Academic Press
[43] Olhede, S. C.; Wolfe, P. J., Network histograms and universality of block model approximation, Proc. Natl. Acad. Sci., 111, 14722-14727 (2014)
[44] Patsolic, H. G.; Park, Y.; Lyzinski, V.; Priebe, C. E., Vertex nomination via seeded graph matching (2017), ArXiv preprint available at arXiv:1705.00674
[45] Polya, G., Kombinatorische anzahlbestimmungen für gruppen, graphen und chemische verbindungen, Acta Math., 68, 145-254 (1937) · JFM 63.0547.04
[46] Qin, T.; Rohe, K., Regularized spectral clustering under the degree-corrected stochastic blockmodel, Adv. Neural Inf. Process. Syst. (2013)
[47] Rohe, K.; Chatterjee, S.; Yu, B., Spectral clustering and the high-dimensional stochastic blockmodel, Ann. Statist., 39, 1878-1915 (2011) · Zbl 1227.62042
[48] Smith, S. M.; Jenkinson, M.; Woolrich, M. W.; Beckmann, C. F.; Behrens, T. E.; Johansen-Berg, H.; Bannister, P. R.; De Luca, M.; Drobnjak, I.; Flitney, D. E.; Niazy, R. K.; Saunders, J.; Vickers, J.; Zhang, Y.; De Stefano, N.; Brady, J. M.; Matthews, P. M., Advances in functional and structural MR image analysis and implementation as FSL, NeuroImage, 23, S208-219 (2004)
[49] Sripada, C. S.; Kessler, D.; Angstadt, M., Lag in maturation of the brain’s intrinsic functional architecture in attention-deficit/hyperactivity disorder, Proc. Natl. Acad. Sci., 111:39, 14259-14264 (2014)
[50] Sun, M., Tang, M., Priebe, C.E., 2012. A comparison of graph embedding methods for vertex nomination. In: 2012 International Conference on Machine Learning and Applications, pp. 398-403.
[51] Sussman, D. L.; Tang, M.; Fishkind, D. E.; Priebe, C. E., A consistent adjacency spectral embedding for stochastic blockmodel graphs, J. Amer. Statist. Assoc., 107, 1119-1128 (2012) · Zbl 1443.62188
[52] Sussman, D. L.; Tang, M.; Priebe, C. E., Consistent latent position estimation and vertex classification for random dot product graphs, IEEE Trans. Pattern Anal. Mach. Intell., 36, 48-57 (2014)
[53] Tzourio-Mazoyer, N.; Landeau, B.; Papathanassiou, D.; Crivello, F.; Etard, O.; Delcroix, N.; Mazoyer, B.; Joliot, M., Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain, Neuroimage, 15:1, 273-289 (2002)
[54] Von Luxburg, U., A tutorial on spectral clustering, Stat. Comput., 17:4, 395-416 (2007)
[55] Wagstaff, K., Cardie, C., Rogers, S., Schrödl, S., 2001. Constrained k-means clustering with background knowledge, In: International Conference on Machine Learning, pp. 577-584.
[56] Wang, Y. X.R.; Bickel, P. J., Likelihood-based model selection for stochastic block models, Ann. Statist., 45:2, 500-528 (2017) · Zbl 1371.62017
[57] Wang, L.; Mruczek, R. E.B.; Michael, J.; Kastner, S., Probabilistic maps of visual topography in human cortex, Cerebral Cortex, 1-21 (2014)
[58] Wang, Y. J.; Wong, G. Y., Stochastic blockmodels for directed graphs, J. Amer. Statist. Assoc., 82, 8-19 (1987) · Zbl 0613.62146
[59] Wolfe, P.; Olhede, S. C., Nonparametric graphon estimation (2013), ArXiv preprint available at
[60] Woolrich, M. W.; Jbabdi, S.; Patenaude, B.; Chappell, M.; Makni, S.; Behrens, T.; Beckmann, C.; Jenkinson, M.; Smith, S. M., Bayesian analysis of neuroimaging data in FSL, NeuroImage, 45, S173-186 (2009)
[61] Yang, C.; Priebe, C. E.; Park, Y.; Marchette, D. J., Simultaneous dimensionality and complexity model selection for spectral graph clustering (2019), ArXiv preprint available at arXiv:1904.02926
[62] Yoder, J., On model-based semi-supervised clustering (2016), Johns Hopkins University, (Ph.D. dissertation)
[63] Yoder, J.; Priebe, C. E., A model-based semi-supervised clustering methodology (2014), arXiv preprint arXiv:1412.4841
[64] Yoder, J.; Priebe, C. E., Semi-supervised k-means++, J. Stat. Comput. Simul., 87, 2597-2608 (2017)
[65] Zhu, M.; Ghodsi, A., Automatic dimensionality selection from the scree plot via the use of profile likelihood, Comput. Statist. Data Anal., 51:2, 918-930 (2006) · Zbl 1157.62429
[66] Zuo, X. N.; Anderson, J. S.; Bellec, P.; Birn, R. M.; Biswal, B. B.; Blautzik, J.; Breitner, J. C.; Buckner, R. L.; Calhoun, V. D.; Castellanos, F. X.; Chen, A.; Chen, B.; Chen, J.; Chen, X.; Colcombe, S. J.; Courtney, W.; Craddock, R. C.; Di Martino, A.; Dong, H. M.; Fu, X.; Gong, Q.; Gorgolewski, K. J.; Han, Y.; He, Y.; He, Y.; Ho, E.; Holmes, A.; Hou, X. H.; Huckins, J.; Jiang, T.; Jiang, Y.; Kelley, W.; Kelly, C.; King, M.; LaConte, S. M.; Lainhart, J. E.; Lei, X.; Li, H. J.; Li, K.; Li, K.; Lin, Q.; Liu, D.; Liu, J.; Liu, X.; Liu, Y.; Lu, G.; Lu, J.; Luna, B.; Luo, J.; Lurie, D.; Mao, Y.; Margulies, D. S.; Mayer, A. R.; Meindl, T.; Meyer, M. E.; Nan, W.; Nielsen, J. A.; O’Connor, D.; Paulsen, D.; Prabhakaran, V.; Qi, Z.; Qiu, J.; Shao, C.; Shehzad, Z.; Tang, W.; Villringer, A.; Wang, H.; Wang, K.; Wei, D.; Wei, G. X.; Weng, X. C.; Wu, X.; Xu, T.; Yang, N.; Yang, Z.; Zang, Y. F.; Zhang, L.; Zhang, Q.; Zhang, Z.; Zhang, Z.; Zhao, K.; Zhen, Z.; Zhou, Y.; Zhu, X. T.; Milham, M. P., An open science resource for establishing reliability and reproducibility in functional connectomics, Sci. Data, 1, Article 140049 pp. (2014)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.