Stochastic EM algorithms for parametric and semiparametric mixture models for right-censored lifetime data.

*(English)* Zbl 1348.65016

Summary: Mixture models in reliability offer a useful compromise between parametric and nonparametric models when several failure modes are suspected. The classical methods for estimation in mixture models rarely handle the additional difficulty that lifetime data are often censored, in a deterministic or random way. In this paper we present several iterative methods, based on EM and Stochastic EM methodologies, that allow us to estimate parametric or semiparametric mixture models for randomly right-censored lifetime data, provided they are identifiable. We consider different levels of completion for the (incomplete) observed data, and provide genuine or EM-like algorithms for several situations. In particular, we show that simulating the missing data coming from the mixture makes it possible to plug a standard R package for survival data analysis into the M-step of an EM algorithm. Moreover, in censored semiparametric situations, a stochastic step is the only practical solution allowing computation of nonparametric estimates of the unknown survival function. The effectiveness of the newly proposed algorithms is demonstrated in simulation studies and on a real dataset from the aeronautic industry.
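The record does not reproduce the paper's algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions: a two-component mixture of exponential lifetimes, independent random right censoring, and a stochastic step that simulates only the missing component labels. Once labels are simulated, the M-step reduces to a standard censored-data maximum likelihood fit per component (here the closed form "number of failures divided by total time at risk"), which is the kind of fit a survival-analysis package would provide. The function names `simulate`, `posterior` and `stochastic_em`, the exponential family, and the averaging after a burn-in period are illustrative choices, not taken from the paper.

```python
import math
import random


def simulate(n, weights, rates, cens_rate, rng):
    """Draw n right-censored lifetimes from a two-component exponential mixture.

    Returns a list of (t, d): t is the observed time, d = 1 for an observed
    failure and d = 0 for a censored observation.
    """
    data = []
    for _ in range(n):
        j = 0 if rng.random() < weights[0] else 1
        x = rng.expovariate(rates[j])    # true lifetime
        c = rng.expovariate(cens_rate)   # independent censoring time
        data.append((min(x, c), 1 if x <= c else 0))
    return data


def posterior(t, d, weights, rates):
    """Posterior component probabilities for one observation.

    An observed failure contributes the density r * exp(-r t); a censored
    observation contributes the survival function exp(-r t).
    """
    terms = [w * (r if d else 1.0) * math.exp(-r * t)
             for w, r in zip(weights, rates)]
    total = sum(terms)
    if total == 0.0:                     # numerical underflow: fall back to prior
        return list(weights)
    return [v / total for v in terms]


def stochastic_em(data, weights, rates, n_iter=150, burn_in=50, seed=0):
    """Stochastic EM sketch: simulate labels, then fit each component by
    censored-data maximum likelihood. Returns estimates averaged after burn-in."""
    rng = random.Random(seed)
    k = len(weights)
    n = len(data)
    sum_w = [0.0] * k
    sum_r = [0.0] * k
    for it in range(n_iter):
        counts = [0] * k       # observations assigned to each component
        events = [0] * k       # observed failures per component
        exposure = [0.0] * k   # total time at risk per component
        for t, d in data:
            p = posterior(t, d, weights, rates)
            z = rng.choices(range(k), weights=p)[0]   # S-step: simulate label
            counts[z] += 1
            events[z] += d
            exposure[z] += t
        # M-step on the completed data; keep the old rate if a component
        # received no observed failure in this iteration.
        weights = [c / n for c in counts]
        rates = [events[j] / exposure[j] if events[j] > 0 else rates[j]
                 for j in range(k)]
        if it >= burn_in:
            for j in range(k):
                sum_w[j] += weights[j]
                sum_r[j] += rates[j]
    m = n_iter - burn_in
    return [s / m for s in sum_w], [s / m for s in sum_r]
```

A call such as `stochastic_em(simulate(1000, [0.4, 0.6], [2.0, 0.2], 0.1, random.Random(1)), [0.5, 0.5], [1.0, 0.5])` returns averaged weight and rate estimates. Averaging the iterates is one common way to summarise a stochastic EM chain, since the sequence fluctuates around a stationary regime rather than converging pointwise.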

##### MSC:

65C60 | Computational problems in statistics (MSC2010) |

62F10 | Point estimation |

62G05 | Nonparametric estimation |

62F12 | Asymptotic properties of parametric estimators |

62N01 | Censored data models |

##### Keywords:

censored data; stochastic EM algorithm; finite mixture; reliability; semiparametric mixtures; survival data

\textit{L. Bordes} and \textit{D. Chauveau}, Comput. Stat. 31, No. 4, 1513--1538 (2016; Zbl 1348.65016)


##### References:

[1] | Andersen P, Borgan O, Gill R, Keiding N (1993) Statistical models based on counting processes. Springer, New York · Zbl 0769.62061 |

[2] | Atkinson, SE, The performance of standard and hybrid EM algorithms for ML estimates of the normal mixture model with censoring, J Stat Comput Simul, 44, 105-115, (1992) |

[3] | Balakrishnan, N; Mitra, D, Likelihood inference for lognormal data with left truncation and right censoring with illustration, J Stat Plan Inference, 141, 3536-3553, (2011) · Zbl 1221.62038 |

[4] | Balakrishnan, N; Mitra, D, EM-based likelihood inference for some lifetime distributions based on left truncated and right censored data and associated model discrimination, S Afr Stat J, 48, 125-171, (2014) · Zbl 1397.62365 |

[5] | Benaglia, T; Chauveau, D; Hunter, DR, An EM-like algorithm for semi- and non-parametric estimation in multivariate mixtures, J Comput Graph Stat, 18, 505-526, (2009) |

[6] | Benaglia, T; Chauveau, D; Hunter, DR; Young, D, Mixtools: an R package for analyzing finite mixture models, J Stat Softw, 32, 1-29, (2009) |

[7] | Beutner, E; Bordes, L, Estimators based on data-driven generalized weighted Cramér-von Mises distances under censoring - with applications to mixture models, Scand J Stat, 38, 108-129, (2011) · Zbl 1246.62080 |

[8] | Bordes, L; Chauveau, D, Comments: EM-based likelihood inference for some lifetime distributions based on left truncated and right censored data and associated model discrimination, S Afr Stat J, 48, 197-200, (2014) · Zbl 1397.62367 |

[9] | Bordes, L; Chauveau, D; Vandekerkhove, P, A stochastic EM algorithm for a semiparametric mixture model, Comput Stat Data Anal, 51, 5429-5443, (2007) · Zbl 1445.62056 |

[10] | Bordes, L; Mottelet, S; Vandekerkhove, P, Semiparametric estimation of a two-component mixture model, Ann Stat, 34, 1204-1232, (2006) · Zbl 1112.62029 |

[11] | Cao, R; Janssen, P; Veraverbeke, N, Relative density estimation and local bandwidth selection for censored data, Comput Stat Data Anal, 36, 497-510, (2001) · Zbl 1030.62027 |

[12] | Castet, J-F; Saleh, JH, Single versus mixture Weibull distributions for nonparametric satellite reliability, Reliab Eng Syst Saf, 95, 295-300, (2010) |

[13] | Cavanaugh, JE; Shumway, RH, An Akaike information criterion for model selection in the presence of incomplete data, J Stat Plan Inference, 67, 45-65, (1998) · Zbl 1067.62504 |

[14] | Celeux, G; Chauveau, D; Diebolt, J, Stochastic versions of the EM algorithm: an experimental study in the mixture case, J Stat Comput Simul, 55, 287-314, (1996) · Zbl 0907.62024 |

[15] | Celeux, G; Diebolt, J, The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem, Comput Stat Q, 2, 73-82, (1986) |

[16] | Chauveau, D, A stochastic EM algorithm for mixtures with censored data, J Stat Plan Inference, 46, 1-25, (1995) · Zbl 0821.62013 |

[17] | Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, J R Stat Soc Ser B (Methodological), 39, 1-38, (1977) · Zbl 0364.62022 |

[18] | Dirick, L; Claeskens, G; Baesens, B, An Akaike information criterion for multiple event mixture cure models, Eur J Oper Res, 241, 449-457, (2015) · Zbl 1341.62076 |

[19] | Dubos, GF; Castet, J-F; Saleh, JH, Statistical reliability analysis of satellites by mass category: does spacecraft size matter?, Acta Astronaut, 67, 584-595, (2010) |

[20] | Hunter, DR; Wang, S; Hettmansperger, TP, Inference for mixtures of symmetric distributions, Ann Stat, 35, 224-251, (2007) · Zbl 1114.62035 |

[21] | Karunamuni, R; Wu, J, Minimum Hellinger distance estimation in a nonparametric mixture model, J Stat Plan Inference, 139, 1118-1133, (2009) · Zbl 1156.62024 |

[22] | Lee, G; Scott, C, EM algorithms for multivariate Gaussian mixture models with truncated and censored data, Comput Stat Data Anal, 56, 2816-2829, (2012) · Zbl 1255.62308 |

[23] | Louis, T, Finding the observed information matrix when using the EM algorithm, J R Stat Soc Ser B, 44, 226-233, (1982) · Zbl 0488.62018 |

[24] | McLachlan G, Peel D (2000) Finite mixture models: Wiley series in probability and statistics: applied probability and statistics. Wiley-Interscience, New York · Zbl 0963.62061 |

[25] | McLachlan GJ, Krishnan T (2008) The EM algorithm and extensions: Wiley series in probability and statistics: applied probability and statistics. Wiley-Interscience, New York |

[26] | Nielsen, SF, The stochastic EM algorithm: estimation and asymptotic results, Bernoulli, 6, 457-489, (2000) · Zbl 0981.62022 |

[27] | R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria |

[28] | Suzukawa, A; Imai, H; Sato, Y, Kullback-Leibler information consistent estimation for censored data, Ann Inst Stat Math, 53, 262-276, (2001) · Zbl 1008.62035 |

[29] | Svensson, I; Sjöstedt-de Luna, S, Asymptotic properties of a stochastic EM algorithm for mixtures with censored data, J Stat Plan Inference, 140, 111-127, (2010) · Zbl 1178.62020 |

[30] | Therneau T, Lumley T (2009) survival: Survival analysis, including penalised likelihood. R package version 2.35-8 |

[31] | Wei, G; Tanner, M, A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithm, J Am Stat Assoc, 85, 699-704, (1990) |

[32] | Yu H (2012) Rmpi: Interface (Wrapper) to MPI (Message-Passing Interface) |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.