Consistency and asymptotic normality of latent block model estimators. (English) Zbl 1439.62256
Summary: The Latent Block Model (LBM) is a model-based method to cluster simultaneously the \(d\) columns and \(n\) rows of a data matrix. Parameter estimation in LBM is a difficult and multifaceted problem. Although various estimation strategies have been proposed and are now well understood empirically, theoretical guarantees about their asymptotic behavior are rather sparse, and most results are limited to the binary setting. We prove here theoretical guarantees in the valued settings. We show that under some mild conditions on the parameter space, and in an asymptotic regime where \(\log (d)/n\) and \(\log (n)/d\) tend to \(0\) as \(n\) and \(d\) tend to infinity, (1) the maximum-likelihood estimate of the complete model (with known labels) is consistent and (2) the log-likelihood ratios are equivalent under the complete and observed (with unknown labels) models. This equivalence allows us to transfer the asymptotic consistency and, under mild conditions, the asymptotic normality to the maximum-likelihood estimate under the observed model. Moreover, the variational estimator is also consistent and, under the same conditions, asymptotically normal.
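
To make claim (1) concrete, the following is a minimal sketch, not the authors' code, that simulates a binary (Bernoulli) LBM and computes the maximum-likelihood estimate of the complete model, i.e. with the row and column labels known. The Bernoulli special case, the parameter names (`row_props`, `col_props`, `block_means`), and both function names are assumptions made for illustration. The summary itself treats the more general valued settings, and the sketch does not cover the observed-model or variational estimators.

```python
import random


def simulate_lbm(n, d, row_props, col_props, block_means, seed=0):
    """Draw row labels z, column labels w, and a binary n x d matrix x.

    Assumed (hypothetical) parametrization: rows fall in class k with
    probability row_props[k], columns in class l with probability
    col_props[l], and x[i][j] is Bernoulli(block_means[z[i]][w[j]]).
    """
    rng = random.Random(seed)
    z = rng.choices(range(len(row_props)), weights=row_props, k=n)
    w = rng.choices(range(len(col_props)), weights=col_props, k=d)
    x = [[1 if rng.random() < block_means[z[i]][w[j]] else 0
          for j in range(d)] for i in range(n)]
    return z, w, x


def complete_mle(z, w, x, g, m):
    """MLE under the complete model (labels known).

    Class proportions are empirical label frequencies; each block mean
    is the average of the entries falling in that block.
    """
    n, d = len(z), len(w)
    row_counts = [z.count(k) for k in range(g)]
    col_counts = [w.count(l) for l in range(m)]
    sums = [[0] * m for _ in range(g)]
    for i in range(n):
        for j in range(d):
            sums[z[i]][w[j]] += x[i][j]
    alpha_hat = [[sums[k][l] / (row_counts[k] * col_counts[l])
                  if row_counts[k] and col_counts[l] else float("nan")
                  for l in range(m)] for k in range(g)]
    pi_hat = [c / n for c in row_counts]
    rho_hat = [c / d for c in col_counts]
    return pi_hat, rho_hat, alpha_hat


if __name__ == "__main__":
    true_alpha = [[0.9, 0.2], [0.3, 0.7]]  # illustrative values only
    for n, d in [(20, 20), (200, 200)]:
        z, w, x = simulate_lbm(n, d, [0.5, 0.5], [0.4, 0.6], true_alpha, seed=1)
        _, _, alpha_hat = complete_mle(z, w, x, 2, 2)
        err = max(abs(alpha_hat[k][l] - true_alpha[k][l])
                  for k in range(2) for l in range(2))
        print(n, d, "max abs error on block means:", round(err, 3))
```

Running the script for growing \(n\) and \(d\) gives a rough feel for consistency of the complete-model estimate. Transferring it to the observed model (unknown labels) is the part that requires the log-likelihood-ratio equivalence described in the summary.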

62R07 Statistical aspects of big data and data science
62H20 Measures of association (correlation, canonical correlation, etc.)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
[1] Christophe Ambroise and Catherine Matias. New consistent and asymptotically normal parameter estimates for random-graph mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(1):3-35, 2012. · Zbl 1411.62051
[2] Peter Bickel, David Choi, Xiangyu Chang, Hai Zhang, et al. Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels. The Annals of Statistics, 41(4):1922-1943, 2013. · Zbl 1292.62042
[3] Peter J. Bickel and Aiyou Chen. A nonparametric view of network models and Newman-Girvan and other modularities. Proceedings of the National Academy of Sciences, 106(50):21068-21073, 2009. · Zbl 1359.62411
[4] Alain Celisse, Jean-Jacques Daudin, Laurent Pierre, et al. Consistency of maximum-likelihood and variational estimators in the stochastic block model. Electronic Journal of Statistics, 6:1847-1899, 2012. · Zbl 1295.62028
[5] Gérard Govaert and Mohamed Nadif. Clustering with block mixture models. Pattern Recognition, 36(2):463-473, 2003. · Zbl 1452.62444
[6] Gérard Govaert and Mohamed Nadif. Block clustering with Bernoulli mixture models: Comparison of different approaches. Computational Statistics & Data Analysis, 52(6):3233-3245, 2008. · Zbl 1452.62444
[7] Gérard Govaert and Mohamed Nadif. Latent block model for contingency table. Communications in Statistics - Theory and Methods, 39(3):416-425, 2010. · Zbl 1187.62117
[8] Gérard Govaert and Mohamed Nadif. Co-clustering. John Wiley & Sons, 2013.
[9] Christine Keribin, Vincent Brault, Gilles Celeux, and Gérard Govaert. Estimation and selection for the latent block model on categorical data. Statistics and Computing, 25(6):1201-1216, 2015. · Zbl 1331.62149
[10] Mahendra Mariadassou and Catherine Matias. Convergence of the groups posterior distribution in latent or stochastic block models. Bernoulli, 21(1):537-573, 2015. · Zbl 1329.62285
[11] Pascal Massart. Concentration inequalities and model selection, volume 6. Springer, 2007. · Zbl 1170.60006
[12] J. G. Shanthikumar and U. Sumita. A central limit theorem for random sums of random variables. Operations Research Letters, 3(3):153-155, 1984. · Zbl 0546.60023
[13] Tom A. B. Snijders and Krzysztof Nowicki. Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification, 14(1):75-100, Jan 1997. · Zbl 0896.62063
[14] Martin J. Wainwright. High-dimensional statistics: A non-asymptotic viewpoint, volume 48. Cambridge University Press, 2019. · Zbl 1457.62011
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.