
zbMATH — the first resource for mathematics

EM algorithms for multivariate Gaussian mixture models with truncated and censored data. (English) Zbl 1255.62308
Summary: We present expectation-maximization (EM) algorithms for fitting multivariate Gaussian mixture models to data that are truncated, censored, or both. These two types of incomplete measurements are naturally handled together through their relation to the multivariate truncated Gaussian distribution. We illustrate our algorithms on synthetic and flow cytometry data.
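
The link between censoring and the truncated Gaussian mentioned in the summary can be illustrated in a much simpler setting than the paper's. The following is a minimal sketch, not the authors' algorithm: it assumes a single univariate Gaussian (no mixture, no truncation) whose values above a known threshold `c` are right-censored. The function names `upper_tail_moments` and `em_right_censored` are hypothetical. In the E-step each censored point contributes the first and second moments of the Gaussian truncated to `x > c`; the M-step then uses the usual Gaussian updates on these expected sufficient statistics.

```python
import math
import random


def upper_tail_moments(mu, sigma, c):
    """E[x] and E[x^2] for x ~ N(mu, sigma^2) conditioned on x > c."""
    alpha = (c - mu) / sigma
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(alpha / math.sqrt(2.0))  # P(Z > alpha)
    lam = pdf / tail  # inverse Mills ratio
    m1 = mu + sigma * lam
    var = sigma * sigma * (1.0 + alpha * lam - lam * lam)
    return m1, var + m1 * m1


def em_right_censored(observed, n_censored, c, n_iter=200):
    """EM estimates of (mu, sigma) when values above c are only counted."""
    n = len(observed) + n_censored
    s1 = sum(observed)
    s2 = sum(x * x for x in observed)
    # Initialise from the uncensored points alone (biased, but a usable start).
    mu = s1 / len(observed)
    sigma = math.sqrt(max(s2 / len(observed) - mu * mu, 1e-12))
    for _ in range(n_iter):
        # E-step: expected sufficient statistics of each censored point.
        m1, m2 = upper_tail_moments(mu, sigma, c)
        # M-step: Gaussian maximum likelihood on the completed statistics.
        mu = (s1 + n_censored * m1) / n
        var = (s2 + n_censored * m2) / n - mu * mu
        sigma = math.sqrt(max(var, 1e-12))
    return mu, sigma


# Synthetic check: standard normal draws, right-censored at c = 1.
random.seed(0)
c = 1.0
draws = [random.gauss(0.0, 1.0) for _ in range(20000)]
observed = [x for x in draws if x <= c]
mu_hat, sigma_hat = em_right_censored(observed, len(draws) - len(observed), c)
```

In the multivariate mixture setting described in the summary, the same role would be played by moments of the multivariate truncated Gaussian, combined with per-component responsibilities; this sketch omits both.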

MSC:
62N01 Censored data models
62H99 Multivariate analysis
65C60 Computational problems in statistics (MSC2010)