Maximum likelihood characterization of distributions. (English) Zbl 1400.62029

Summary: A famous characterization theorem due to C.F. Gauss states that the maximum likelihood estimator (MLE) of the parameter in a location family is the sample mean for all samples of all sample sizes if and only if the family is Gaussian. There exist many extensions of this result in diverse directions, most of them focussing on location and scale families. In this paper, we propose a unified treatment of this literature by providing general MLE characterization theorems for one-parameter group families (with particular attention to location and scale parameters). In doing so, we provide tools for determining whether a given family of this kind is MLE-characterizable and, if it is, we define the fundamental concept of the minimal necessary sample size at which a given characterization holds. Many of the cornerstone references on this topic are retrieved and discussed in the light of our findings, and several new characterization theorems are provided. Of particular interest is that one part of our work, namely the introduction of so-called equivalence classes for MLE characterizations, is a modernized version of Daniel Bernoulli's viewpoint on maximum likelihood estimation.
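
As a rough numerical illustration of the easy direction of Gauss's theorem (not taken from the paper), the sketch below maximizes the location log-likelihood for a standard Gaussian density and, for contrast, for a Laplace density. The function names, the sample, and the use of ternary search are illustrative assumptions; the search presumes a log-likelihood that is concave in the location parameter, which holds for both densities used here. It checks one sample only and says nothing about the converse direction.

```python
import math
import statistics


def location_mle(sample, log_density, tol=1e-9):
    """Maximize sum(log_density(x - theta)) over theta by ternary search.

    Hypothetical helper: assumes the log-likelihood is concave in theta,
    so a maximizer lies between the smallest and largest observation.
    """
    lo, hi = min(sample), max(sample)

    def loglik(theta):
        return sum(log_density(x - theta) for x in sample)

    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if loglik(m1) < loglik(m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    return (lo + hi) / 2


def gaussian_log_density(z):
    # Standard normal log-density (unit scale is an assumption).
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)


def laplace_log_density(z):
    # Standard Laplace log-density (unit scale is an assumption).
    return -abs(z) - math.log(2)


# Arbitrary illustrative sample of odd size, so the median is unique.
sample = [0.3, 1.7, 2.2, 4.9, 10.4]

gauss_mle = location_mle(sample, gaussian_log_density)
laplace_mle = location_mle(sample, laplace_log_density)

print("Gaussian MLE:", gauss_mle, "sample mean:", statistics.mean(sample))
print("Laplace MLE:", laplace_mle, "sample median:", statistics.median(sample))
```

For this sample, the Gaussian location MLE agrees with the sample mean up to numerical tolerance, whereas the Laplace location MLE lands on the sample median, which is consistent with the sample mean being the MLE only in the Gaussian case.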

MSC:

62E10 Characterization and structure theory of statistical distributions
62F10 Point estimation
