
Accumulative prediction error and the selection of time series models. (English) Zbl 1099.62103

Summary: This article reviews the rationale for using accumulative one-step-ahead prediction errors (APE) as a data-driven method for model selection. Theoretically, APE is closely related to Bayesian model selection and the method of minimum description length (MDL). The sole requirement for using APE is that the models under consideration are capable of generating a prediction for the next, unseen data point. This means that APE may be readily applied to selection problems involving very complex models. APE automatically takes the functional form of parameters into account, and the ‘plug-in’ version of APE does not require the specification of priors.
APE is particularly easy to compute for data that have a natural ordering, such as time series. Here, we explore the possibility of using APE to discriminate the short-range ARMA\((1,1)\) model from the long-range ARFIMA\((0,d,0)\) model. We also illustrate how APE may be used for model meta-selection, allowing one to choose between different model selection methods.
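The accumulation idea in the summary can be illustrated with a minimal Python sketch. It assumes squared-error loss and the plug-in variant: at each step every candidate is refitted to the data seen so far, forecasts the next point, and its loss is added to a running total, so that APE is \(\sum_{i=s+1}^{n} (x_i - \hat{x}_i)^2\) for some start index \(s\). The two candidate predictors (a running mean and a least-squares AR(1) fit), the simulated series, the start index and all function names are assumptions chosen for illustration; they are not the ARMA\((1,1)\) versus ARFIMA\((0,d,0)\) comparison carried out in the article.

```python
import numpy as np


def ape(series, predict, start):
    """Accumulate squared one-step-ahead prediction errors.

    `predict(history)` must return a forecast for the next, unseen
    observation using only the values in `history`.
    """
    total = 0.0
    for i in range(start, len(series)):
        forecast = predict(series[:i])       # fit on x_1..x_i only
        total += (series[i] - forecast) ** 2  # loss on the unseen point
    return total


def predict_mean(history):
    # Illustrative candidate 1: forecast with the running mean.
    return float(np.mean(history))


def predict_ar1(history):
    # Illustrative candidate 2: plug-in AR(1), x_t = c + phi * x_{t-1},
    # with (c, phi) re-estimated by least squares at every step.
    x_prev, x_next = history[:-1], history[1:]
    design = np.column_stack([np.ones_like(x_prev), x_prev])
    coef, *_ = np.linalg.lstsq(design, x_next, rcond=None)
    return float(coef[0] + coef[1] * history[-1])


# Simulated AR(1) data (assumed for the example, not from the article).
rng = np.random.default_rng(0)
n = 300
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal()

ape_mean = ape(x, predict_mean, start=10)
ape_ar1 = ape(x, predict_ar1, start=10)

# The candidate with the smaller accumulated error is selected.
selected = "AR(1)" if ape_ar1 < ape_mean else "mean-only"
print(selected, round(ape_mean, 1), round(ape_ar1, 1))
```

Because `ape` only needs a function that maps past data to a forecast, the same loop would in principle accept any model that can predict the next point, which is the sole requirement the summary names. A different loss (for example logarithmic loss on a predictive density) could be swapped in for the squared error.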

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
65C60 Computational problems in statistics (MSC2010)
62F15 Bayesian inference
62C10 Bayesian problems; characterization of Bayes procedures
62M20 Inference from stochastic processes and prediction
65C05 Monte Carlo methods

Software:

Ox; longmemo; BayesDA
