Selecting hidden Markov model state number with cross-validated likelihood. (English) Zbl 1224.62039

The paper considers a hidden Markov chain model in which the hidden state chain is stationary and ergodic and the observations are conditionally independent given the realization of the chain. The distributions of the observations are known up to unknown parameters that depend on the hidden states. The authors consider the computation of cross-validated likelihood criteria for selecting the number of hidden states. Two algorithms are proposed. In the first, the training and test subsamples consist of the observations with odd and even indices, respectively, of the original data set. In the second, the elements of the test subsample are chosen at random and treated as missing data in the training subsample. A version of the EM algorithm is used to fit the model to the training subsample. Numerical results are presented for simulated data and for real biological data.
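The two ways of forming training and test subsamples can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian emissions, the parameter values and all function names (`forward_loglik`, `odd_even_split`, `random_missing_split`, `simulate`) are assumptions made for illustration, and the EM fitting step is omitted, so known parameters stand in for fitted ones. Missing observations are handled in the forward recursion by skipping the emission factor. For the odd/even split the sketch uses the fact that every second observation of a hidden Markov chain is again a hidden Markov chain whose transition matrix is the square of the original one.

```python
import math
import random


def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward_loglik(obs, init, trans, means, sds):
    """Scaled forward recursion; entries equal to None are treated as missing."""
    k = len(init)
    alpha = list(init)
    loglik = 0.0
    for t, x in enumerate(obs):
        if t > 0:  # propagate the filtered state distribution one step
            alpha = [sum(alpha[i] * trans[i][j] for i in range(k)) for j in range(k)]
        if x is not None:  # a missing value contributes no emission factor
            alpha = [alpha[j] * gaussian_pdf(x, means[j], sds[j]) for j in range(k)]
        norm = sum(alpha)
        loglik += math.log(norm)
        alpha = [a / norm for a in alpha]
    return loglik


def odd_even_split(data):
    """Observations 1, 3, 5, ... (1-based) for training; 2, 4, 6, ... for testing."""
    return data[0::2], data[1::2]


def random_missing_split(data, n_test, rng):
    """Pick n_test indices at random; they become missing in the training series."""
    test_idx = set(rng.sample(range(len(data)), n_test))
    train = [None if t in test_idx else x for t, x in enumerate(data)]
    test = [data[t] for t in sorted(test_idx)]
    return train, test


def mat_square(p):
    k = len(p)
    return [[sum(p[i][m] * p[m][j] for m in range(k)) for j in range(k)] for i in range(k)]


def simulate(n, init, trans, means, sds, rng):
    states = range(len(init))
    state = rng.choices(states, weights=init)[0]
    out = []
    for _ in range(n):
        out.append(rng.gauss(means[state], sds[state]))
        state = rng.choices(states, weights=trans[state])[0]
    return out


# Hypothetical two-state model; init is the stationary distribution of trans.
trans = [[0.9, 0.1], [0.2, 0.8]]
init = [2.0 / 3.0, 1.0 / 3.0]
means, sds = [0.0, 3.0], [1.0, 1.0]

rng = random.Random(0)
data = simulate(200, init, trans, means, sds, rng)

# First scheme: odd indices for training, even indices for testing.
train_odd, test_even = odd_even_split(data)
trans2 = mat_square(trans)  # transition matrix of the chain observed every second step
score_odd_even = forward_loglik(test_even, init, trans2, means, sds)

# Second scheme: random test points, treated as missing in the training series.
train_masked, test_random = random_missing_split(data, 40, rng)
train_loglik = forward_loglik(train_masked, init, trans, means, sds)

print("held-out log-likelihood (even indices):", score_odd_even)
print("training log-likelihood with missing data:", train_loglik)
```

In a full procedure the parameters would be re-estimated on the training subsample for each candidate number of states, and the held-out log-likelihoods would then be compared across candidates.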

MSC:

62M09 Non-Markovian processes: estimation
62-08 Computational methods for problems pertaining to statistics
