Higher-order Markov chain models for categorical data sequences. (English) Zbl 1054.62098

Summary: We study higher-order Markov chain models for analyzing categorical data sequences. We propose an efficient estimation method for the model parameters. Data sequences such as DNA and sale demands are used to illustrate the predicting power of our proposed models. In particular, we apply the developed higher-order Markov chain model to the server log data. The objective here is to model the users’ behavior in accessing information and to predict their behavior in the future. Our tests are based on a realistic web log and our model shows an improvement in prediction.


62M05 Markov processes: estimation; hidden Markov models
90C08 Special problems of linear programming (transportation, multi-index, data envelopment analysis, etc.)
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)
62P99 Applications of statistics


Full Text: DOI


[1] and Pre-sending documents on the WWW: A comparative study, Proc Sixteenth Int Joint Conf Artif Intell IJCAI99, 1999, pp. 1274-1279.
[2] Adke, J Roy Statist Soc Ser B 50 pp 105– (1988)
[3] Avery, J Mol Evol 26 pp 335– (1987)
[4] Iterative solution methods, Cambridge University Press, Cambridge, 1996.
[5] and Time series: Theory and methods, Springer-Verlag, New York, 1991. · Zbl 0709.62080
[6] The song of the wood pewee, University of the State of New York, Albany, 1943.
[7] and Linear optimization and extensions, Prentice-Hall, Englewood Cliffs, NJ, 1993.
[8] Gowda, Pattern Recognition 24 pp 567– (1991)
[9] and A cube model for Web access sessions and cluster analysis, WEBKDD 2001, Workshop on Mining Web Log Data Across All Customer Touch Points, Seventh ACM SIGKDD Int Conf Knowledge Discovery Data Mining, August 2001, pp. 47-58.
[10] and Web Watch: A tour guide for the World Wide Web, Proc Fifteenth Int Joint Conf Artif Intell IJCAI 97, 1997, pp. 770-775.
[11] and Some results on the estimation of a higher order Markov chain, Department of Statistics, The University of Hong Kong, Hong Kong, 1989.
[12] Letizia: An agent that assists Web browsing, Proc Fourteenth Int Joint Conf Artif Intell IJCAI 95, 1995, pp. 924-929.
[13] Logan, J Math Sociol 8 pp 75– (1981) · Zbl 0472.92018
[14] and Hidden Markov and other models for discrete-valued time series, Chapman & Hall, London, 1997. · Zbl 0868.60036
[15] Raftery, J Roy Statist Soc Ser B 47 pp 528– (1985)
[16] Raftery, Appl Statist 43 pp 179– (1994)
[17] and INSITE: A tool for real time knowledge discovery from users Web navigation, Proc VLDB2000, Cairo, Egypt, 2000, pp. 635-638.
[18] Introduction to computational biology, Chapman & Hall, Cambridge, 1995.
[19] Yang, J Intell Inform Syst 20 pp 11– (2003)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.