×

Asymptotics in directed exponential random graph models with an increasing bi-degree sequence. (English) Zbl 1331.62110

Summary: Although asymptotic analyses of undirected network models based on degree sequences have started to appear in recent literature, it remains an open problem to study statistical properties of directed network models. In this paper, we provide for the first time a rigorous analysis of directed exponential random graph models using the in-degrees and out-degrees as sufficient statistics with binary as well as continuous weighted edges. We establish the uniform consistency and the asymptotic normality for the maximum likelihood estimate, when the number of parameters grows and only one realized observation of the graph is available. One key technique in the proofs is to approximate the inverse of the Fisher information matrix using a simple matrix with high accuracy. Numerical studies confirm our theoretical findings.

MSC:

62F10 Point estimation
62F12 Asymptotic properties of parametric estimators
62B05 Sufficient statistics and fields
62E20 Asymptotic distribution theory in statistics
05C80 Random graphs (graph-theoretic aspects)

Software:

MCODE
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Adamic, L. A. and Glance, N. (2005). The political blogosphere and the 2004 US Election: Divided they blog. In Proceedings of the 3 rd International Workshop on Link Discovery 36-43. ACM, New York.
[2] Akoglu, L., Vaz de Melo, P. O. S. and Faloutsos, C. (2012). Quantifying reciprocity in large weighted communication networks. Advances in Knowledge Discovery and Data Mining , Lecture Notes in Computer Science 7302 85-96.
[3] Bader, G. D. and Hogue, C. W. V. (2003). An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4 2-27.
[4] Barndorff-Nielsen, O. (1973). Exponential families and conditioning. Ph.D. thesis, Univ. of Copenhagen. · Zbl 0297.62001
[5] Berk, R. H. (1972). Consistency and asymptotic normality of MLE’s for exponential models. Ann. Mat. Statist. 43 193-204. · Zbl 0253.62005 · doi:10.1214/aoms/1177692713
[6] Bickel, P. J., Chen, A. and Levina, E. (2011). The method of moments and degree distributions for network models. Ann. Statist. 39 2280-2301. · Zbl 1232.91577 · doi:10.1214/11-AOS904
[7] Bolla, M. and Elbanna, A. (2014). Estimating parameters of a multipartite loglinear graph model via the EM algorithm. Preprint. Available at . arXiv:1411.7934 · Zbl 1403.62094
[8] Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324-345. · Zbl 0047.12903
[9] Chatterjee, S. and Diaconis, P. (2013). Estimating and understanding exponential random graph models. Ann. Statist. 41 2428-2461. · Zbl 1293.62046 · doi:10.1214/13-AOS1155
[10] Chatterjee, S., Diaconis, P. and Sly, A. (2011). Random graphs with a given degree sequence. Ann. Appl. Probab. 21 1400-1435. · Zbl 1234.05206 · doi:10.1214/10-AAP728
[11] Chen, N. and Olvera-Cravioto, M. (2013). Directed random graphs with given degree distributions. Stoch. Syst. 3 1-40. · Zbl 1297.05212 · doi:10.1214/12-SSY076
[12] Diesner, J. and Carley, K. M. (2005). Exploration of communication networks from the Enron email corpus. In Proceedings of Workshop on Link Analysis , Counterterrorism and Security , SIAM International Conference on Data Mining 3-14. SIAM, Philadelphia, PA. · Zbl 1108.91346
[13] Erdős, P. L., Miklós, I. and Toroczkai, Z. (2010). A simple Havel-Hakimi type algorithm to realize graphical degree sequences of directed graphs. Electron. J. Combin. 17 Research Paper 66, 10. · Zbl 1215.05035
[14] Fienberg, S. E. (2012). A brief history of statistical models for network analysis and open challenges. J. Comput. Graph. Statist. 21 825-839. · doi:10.1080/10618600.2012.738106
[15] Fienberg, S. E., Petrović, S. and Rinaldo, A. (2011). Algebraic statistics for \(p_{1}\) random graph models: Markov bases and their uses. In Looking Back. Lect. Notes Stat. Proc. (N. J. Dorans and S. Sinharay, eds.) 202 21-38. Springer, New York. · doi:10.1007/978-1-4419-9389-2_2
[16] Fienberg, S. E. and Rinaldo, A. (2012). Maximum likelihood estimation in log-linear models. Ann. Statist. 40 996-1023. · Zbl 1274.62389 · doi:10.1214/12-AOS986
[17] Fienberg, S. E. and Wasserman, S. (1981). An exponential family of probability distributions for directed graphs: Comment. J. Amer. Statist. Assoc. 76 54-57. · Zbl 0457.62090 · doi:10.2307/2287037
[18] Fienberg, S. E. and Wasserman, S. S. (1981). Categorical data analysis of single sociometric relations. Sociol. Method. 1981 156-192.
[19] Fischer, G. H. (1981). On the existence and uniqueness of maximum-likelihood estimates in the Rasch model. Psychometrika 46 59-77. · Zbl 0465.62103 · doi:10.1007/BF02293919
[20] Girvan, M. and Newman, M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99 7821-7826 (electronic). · Zbl 1032.91716 · doi:10.1073/pnas.122653799
[21] Haberman, S. J. (1977). Maximum likelihood estimates in exponential response models. Ann. Statist. 5 815-841. · Zbl 0368.62019 · doi:10.1214/aos/1176343941
[22] Haberman, S. J. (1981). An exponential family of probability distributions for directed graphs: Comment. J. Amer. Statist. Assoc. 76 60-61. · Zbl 0457.62090 · doi:10.2307/2287037
[23] Handcock, M. S. (2003). Assessing degeneracy in statistical models of social networks, Working Paper 39. Technical report, Center for Statistics and the Social Sciences, Univ. Washington, Seattle, WA.
[24] Helleringer, S. and Kohler, H.-P. (2007). Sexual network structure and the spread of HIV in Africa: Evidence from Likoma Island, Malawi. AIDS 21 2323-2332.
[25] Hillar, C. and Wibisono, A. (2013). Maximum entropy distributions on graphs. Preprint. Available at . arXiv:1301.3321
[26] Holland, P. W. and Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. J. Amer. Statist. Assoc. 76 33-65. · Zbl 0457.62090 · doi:10.2307/2287037
[27] Hunter, D. R. and Handcock, M. S. (2006). Inference in curved exponential family models for networks. J. Comput. Graph. Statist. 15 565-583. · doi:10.1198/106186006X133069
[28] Kantorovič, L. V. (1948). On Newton’s method for functional equations. Dokl. Akad. Nauk SSSR 59 1237-1240.
[29] Kim, H., Del Genio, C. I., Bassler, K. E. and Toroczkai, Z. (2012). Constructing and sampling directed graphs with given degree sequences. New J. Phys. 14 023012.
[30] Kossinets, G. and Watts, D. J. (2006). Empirical analysis of an evolving social network. Science 311 88-90. · Zbl 1226.91055 · doi:10.1126/science.1116869
[31] Loève, M. (1977). Probability Theory. I , 4th ed. Springer, New York.
[32] Nepusz, T., Yu, H. and Paccanaro, A. (2012). Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods 18 471-472.
[33] Newman, M. E. J. (2002). Spread of epidemic disease on networks. Phys. Rev. E (3) 66 016128, 11.
[34] Olhede, S. C. and Wolfe, P. J. (2012). Degree-based network models. Preprint. Available at . arXiv:1211.6537
[35] Ortega, J. M. (1968). The Newton-Kantorovich theorem. Amer. Math. Monthly 75 658-660. · Zbl 0183.43004 · doi:10.2307/2313800
[36] Ortega, J. M. and Rheinboldt, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables . Academic Press, New York. · Zbl 0241.65046
[37] Petrović, S., Rinaldo, A. and Fienberg, S. E. (2010). Algebraic statistics for a directed random graph model with reciprocation. In Algebraic Methods in Statistics and Probability II. Contemp. Math. 516 (M. A. G. Vianaand and H. P. Wynn, eds.) 261-283. Amer. Math. Soc., Providence, RI. · Zbl 1197.13025 · doi:10.1090/conm/516/10180
[38] Polyak, B. T. (2004). Newton-Kantorovich method and its global convergence. J. Math. Sci. 133 1513-1523. · Zbl 1080.65534 · doi:10.1007/s10958-006-0066-1
[39] Rinaldo, A., Petrović, S. and Fienberg, S. E. (2013). Maximum likelihood estimation in the \(\beta\)-model. Ann. Statist. 41 1085-1110. · Zbl 1292.62052 · doi:10.1214/12-AOS1078
[40] Robins, G. and Pattison, P. (2007). An introduction to exponential random graph (\(p^{*}\)) models for social networks. Soc. Netw. 29 173-191.
[41] Robins, G., Pattison, P. and Wang, P. (2009). Closure, connectivity and degree distributions: Exponential random graph (\(p^{*}\)) models for directed social networks. Soc. Netw. 31 105-117.
[42] Robins, G. L., Snijders, T. A. B., Wang, P., Handcock, M. and Pattison, P. (2007). Recent developments in exponential random graph (\(p^{*}\)) models for social networks. Soc. Netw. 29 192-215.
[43] Salathéa, M., Kazandjievab, M., Leeb, J. W., Levisb, P., Marcus, Feldman, M. W. and Jones, J. H. (2010). A high-resolution human contact network for infectious disease transmission. Proc. Natl. Acad. Sci. USA 107 22020-22025.
[44] Schweinberger, M. (2011). Instability, sensitivity, and degeneracy of discrete exponential families. J. Amer. Statist. Assoc. 106 1361-1370. · Zbl 1233.62020 · doi:10.1198/jasa.2011.tm10747
[45] Shalizi, C. R. and Rinaldo, A. (2013). Consistency under sampling of exponential random graph models. Ann. Statist. 41 508-535. · Zbl 1269.91066 · doi:10.1214/12-AOS1044
[46] Simons, G. and Yao, Y.-C. (1999). Asymptotics when the number of parameters tends to infinity in the Bradley-Terry model for paired comparisons. Ann. Statist. 27 1041-1060. · Zbl 0951.62061 · doi:10.1214/aos/1018031267
[47] Tapia, R. A. (1971). Classroom Notes: The Kantorovich theorem for Newton’s method. Amer. Math. Monthly 78 389-392. · Zbl 0215.27404 · doi:10.2307/2316909
[48] von Mering, C., Krause, R., Snel, B., Cornell, M., Oliver, S. G., Fields, S. and Bork, P. (2002). Comparative assessment of large-scale data sets of protein-protein interactions. Nature 417 399-403.
[49] Wainwright, M. and Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Faund. Trends Mach. Learn. 1 1-305. · Zbl 1193.62107 · doi:10.1561/2200000001
[50] Wu, N. (1997). The Maximum Entropy Method . Springer, Berlin. · Zbl 0872.65117
[51] Yan, T., Leng, C. and Zhu, J. (2015). Supplement to “Asymptotics in directed exponential random graph models with an increasing bi-degree sequence.” . · Zbl 1331.62110 · doi:10.1214/15-AOS1343
[52] Yan, T. and Leng, C. (2015). A simulation study of the \(p_{1}\) model for directed random graphs. Stat. Interface 8 255-266. · Zbl 1386.05179 · doi:10.4310/SII.2015.v8.n3.a1
[53] Yan, T. and Xu, J. (2013). A central limit theorem in the \(\beta\)-model for undirected random graphs with a diverging number of vertices. Biometrika 100 519-524. · Zbl 1452.62214 · doi:10.1093/biomet/ass084
[54] Yan, T., Zhao, Y. and Qin, H. (2015). Asymptotic normality in the maximum entropy models on graphs with an increasing number of parameters. J. Multivariate Anal. 133 61-76. · Zbl 1304.62038 · doi:10.1016/j.jmva.2014.08.013
[55] Zhao, Y., Levina, E. and Zhu, J. (2012). Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist. 40 2266-2292. · Zbl 1257.62095 · doi:10.1214/12-AOS1036
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.