zbMATH — the first resource for mathematics

Exploring the state sequence space for hidden Markov and semi-Markov chains. (English) Zbl 1161.62412
Summary: The knowledge of the state sequences that explain a given observed sequence for a known hidden Markovian model is the basis of various methods that may be divided into three categories: (i) enumeration of state sequences; (ii) summary of the possible state sequences in state profiles; (iii) computation of a global measure of the state sequence uncertainty. Concerning the first category, the generalized Viterbi algorithm for computing the top \(L\) most probable state sequences and the forward-backward algorithm for sampling state sequences are derived for hidden semi-Markov chains and hidden hybrid models combining Markovian and semi-Markovian states. Concerning the second category, a new type of state (and state change) profiles is proposed. The Viterbi forward-backward algorithm for computing these state profiles is derived for hidden semi-Markov chains and hidden hybrid models combining Markovian and semi-Markovian states. Concerning the third category, an algorithm for computing the entropy of the state sequence that explains an observed sequence is proposed. The complementarity and properties of these methods for exploring the state sequence space (including the classical state profiles computed by the forward-backward algorithm) are investigated and illustrated with examples.

62M99 Inference from stochastic processes
65C40 Numerical analysis or methods applied to Markov chains
65C60 Computational problems in statistics (MSC2010)
PDF BibTeX Cite
Full Text: DOI
[1] Brushe, G.D.; Mahony, R.E.; Moore, J.B., A soft output hybrid algorithm for ML/MAP sequence estimation, IEEE trans. inform. theory, 44, 7, 3129-3134, (1998) · Zbl 0932.94007
[2] Burge, C.; Karlin, S., Prediction of complete gene structures in human genomic DNA, J. mol. biol., 268, 78-94, (1997)
[3] Cawley, S.L.; Pachter, L., HMM sampling and applications to gene finding and alternative splicing, Bioinformatics, 19, Suppl. 2, ii36-ii41, (2003)
[4] Chib, S., Calculating posterior distributions and modal estimates in Markov mixture models, J. econometrics, 75, 79-97, (1996) · Zbl 0864.62010
[5] Churchill, G.A., Stochastic models for heterogeneous DNA sequences, Bull. math. biol., 51, 1, 79-94, (1989) · Zbl 0662.92012
[6] Cover, T.M.; Thomas, J.A., Elements of information theory, (1991), Wiley New York · Zbl 0762.94001
[7] Dempster, A.P.; Laird, N.M.; Rubin, D.B., Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. roy. statist. soc. ser. B, 39, 1, 1-38, (1977) · Zbl 0364.62022
[8] Ephraim, Y.; Merhav, N., Hidden Markov processes, IEEE trans. inform. theory, 48, 6, 1518-1569, (2002) · Zbl 1061.94560
[9] Ferguson, J.D., Variable duration models for speech, (), 143-179
[10] Foreman, L.A., Generalization of the viterbi algorithm, IMA J. math. appl. bus. ind., 4, 351-367, (1993) · Zbl 0788.90073
[11] Guédon, Y., Estimating hidden semi-Markov chains from discrete sequences, J. comput. graphical statist., 12, 3, 604-639, (2003)
[12] Guédon, Y., Hidden hybrid Markov/semi-Markov chains, Comput. statist. data anal., 49, 3, 663-688, (2005) · Zbl 1429.62366
[13] Guédon, Y.; Barthélémy, D.; Caraglio, Y.; Costes, E., Pattern analysis in branching and axillary flowering sequences, J. theor. biol., 212, 481-520, (2001)
[14] Hernando, D.; Crespi, V.; Cybenko, G., Efficient computation of the hidden Markov model entropy for a given observation sequence, IEEE trans. inform. theory, 51, 7, 2681-2685, (2005) · Zbl 1310.94033
[15] Heuret, P.; Guédon, Y.; Guérard, N.; Barthélémy, D., Analysing branching pattern in plantations of Young red oak trees (quercus rubra L., fagaceae), Ann. bot., 91, 4, 479-492, (2003)
[16] Kulkarni, V.G., Modeling and analysis of stochastic systems, (1995), Chapman & Hall London · Zbl 0866.60004
[17] Lukashin, A.V.; Borodovsky, M., Genemark.hmm: new solutions for gene finding, Nucleic acids res., 26, 4, 1107-1115, (1998)
[18] McLachlan, G.J.; Krishnan, T., The EM algorithm and extensions, (1997), Wiley New York · Zbl 0882.62012
[19] Sansom, J.; Thomson, P., Fitting hidden semi-Markov models to breakpoint rainfall data, J. appl. probab., 38A, 142-157, (2001) · Zbl 1008.62110
[20] Schmidler, S.C.; Liu, J.S.; Brutlag, D.L., Bayesian segmentation of protein secondary structure, J. comput. biol., 7, 1/2, 233-248, (2000)
[21] Scott, S.L., Bayesian methods for hidden Markov models: recursive computing in the 21st century, J. amer. statist. assoc., 97, 457, 337-351, (2002) · Zbl 1073.65503
[22] Tanner, M.A., Tools for statistical inference: methods for the exploration of posterior distributions and likelihood functions, (1996), Springer New York · Zbl 0846.62001
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.