Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. (English) Zbl 0627.62005

During the last fifteen years, Akaike’s entropy-based information criterion (AIC) has had a fundamental impact in statistical model evaluation problems. This paper studies the general theory of the AIC procedure and provides its analytical extensions in two ways without violating Akaike’s main principles. These extensions make AIC asymptotically consistent and penalize overparameterization more stringently to pick only the simplest of the “true” models. These selection criteria are called CAIC and CAICF. Asymptotic properties of AIC and its extensions are investigated, and empirical performances of these criteria are studied in choosing the correct degree of a polynomial model in two different Monte Carlo experiments under different conditions.
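
To make the abstract's comparison concrete, here is a minimal sketch of polynomial degree selection with AIC and CAIC. It is not taken from the paper. It assumes the commonly quoted forms AIC = -2 log L + 2k and CAIC = -2 log L + k(log n + 1), Gaussian errors with the variance estimated by maximum likelihood, and a synthetic quadratic data set. The function names (`gaussian_loglik`, `criteria`, `select_degree`) are hypothetical. CAICF is omitted because it also needs the log-determinant of the estimated Fisher information matrix.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def gaussian_loglik(y, y_hat):
    """Maximized Gaussian log-likelihood, using the ML estimate of the error variance."""
    n = y.size
    sigma2 = np.mean((y - y_hat) ** 2)
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)


def criteria(y, y_hat, k):
    """Return (AIC, CAIC) for a fitted model with k estimated parameters (assumed forms)."""
    n = y.size
    ll = gaussian_loglik(y, y_hat)
    aic = -2.0 * ll + 2.0 * k
    caic = -2.0 * ll + k * (np.log(n) + 1.0)
    return aic, caic


def select_degree(x, y, max_degree=6):
    """Fit polynomials of degree 0..max_degree and return the degree minimizing each criterion."""
    scores = {}
    for d in range(max_degree + 1):
        coeffs = P.polyfit(x, y, d)      # least-squares fit, lowest order first
        y_hat = P.polyval(x, coeffs)
        k = d + 2                        # d + 1 coefficients plus the error variance
        scores[d] = criteria(y, y_hat, k)
    best_aic = min(scores, key=lambda d: scores[d][0])
    best_caic = min(scores, key=lambda d: scores[d][1])
    return best_aic, best_caic, scores


# Synthetic example (assumption): true degree 2, noise standard deviation 0.3, n = 200.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.3, size=x.size)

best_aic, best_caic, scores = select_degree(x, y)
print("degree chosen by AIC: ", best_aic)
print("degree chosen by CAIC:", best_caic)
```

Because the CAIC penalty per parameter, log n + 1, exceeds AIC's penalty of 2 whenever n is at least 3, CAIC never selects a higher degree than AIC on the same nested fits. This is the more stringent penalization of overparameterization that the abstract describes.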


62B10 Statistical aspects of information-theoretic topics

