zbMATH — the first resource for mathematics

The epic story of maximum likelihood. (English) Zbl 1246.01016
Summary: At a superficial level, the idea of maximum likelihood must be prehistoric: early hunters and gatherers may not have used the words “method of maximum likelihood” to describe their choice of where and how to hunt and gather, but it is hard to believe they would have been surprised if their method had been described in those terms. It seems a simple, even unassailable idea: Who would rise to argue in favor of a method of minimum likelihood, or even mediocre likelihood? And yet the mathematical history of the topic shows this “simple idea” is really anything but simple. Joseph Louis Lagrange, Daniel Bernoulli, Leonhard Euler, Pierre Simon Laplace and Carl Friedrich Gauss are only some of those who explored the topic, not always in ways we would sanction today. In this article, that history is reviewed from back well before Fisher to the time of Lucien Le Cam’s dissertation. In the process Fisher’s unpublished 1930 characterization of conditions for the consistency and efficiency of maximum likelihood estimates is presented, and the mathematical basis of his three proofs discussed. In particular, Fisher’s derivation of the information inequality is seen to be derived from his work on the analysis of variance, and his later approach via estimating functions was derived from Euler’s Relation for homogeneous functions. The reaction to Fisher’s work is reviewed, and some lessons drawn.

MSC:
01A60 History of mathematics in the 20th century
62-03 History of statistics
References:
[1] Aldrich, J. (1997). R. A. Fisher and the making of maximum likelihood 1912-1922. Statist. Sci. 12 162-176. · Zbl 0955.62525 · doi:10.1214/ss/1030037906
[2] Arrow, K. J. and Lehmann, E. L. (2005). Harold Hotelling 1895-1973. Biographical Memoirs of the National Academy of Sciences 87 3-15.
[3] Bahadur, R. R. (1964). On Fisher’s bound for asymptotic variances. Ann. Math. Statist. 35 1545-1552. · Zbl 0218.62034 · doi:10.1214/aoms/1177700378
[4] Bahadur, R. R. (1983). Hodges superefficiency. In Encyclopedia of Statistical Sciences (S. Kotz and N. L. Johnson, eds.) 3 645-646.
[5] Bennett, J. H., ed. (1990). Statistical Inference and Analysis: Selected Correspondence of R. A. Fisher. Clarendon Press, Oxford. · Zbl 0712.01007
[6] Bernoulli, D. (1769). Dijudicatio maxime probabilis plurium observationum discrepantium atque verisimillima inductio inde formanda. Manuscript; Bernoulli MSS f.299-305, University of Basel. English translation in Stigler (1997).
[7] Bernoulli, D. (1778). Dijudicatio maxime probabilis plurium observationum discrepantium atque verisimillima inductio inde formanda. Acta Academiae Scientiarum Imperialis Petropolitanae for 1777, pars prior 3-23. Reprinted in Bernoulli (1982). English translation in Kendall (1961) 3-13, reprinted 1970 in Pearson, Egon S. and Kendall, M. G. (eds.), Studies in the History of Statistics and Probability, pp. 157-167. Charles Griffin, London.
[8] Bernoulli, D. (1982). Die Werke von Daniel Bernoulli. Band 2. Analysis. Wahrscheinlichkeitsrechnung. Birkhäuser, Basel. · Zbl 0491.01008
[9] Bickel, P. J. and Doksum, K. (2001). Mathematical Statistics. Basic Ideas and Selected Topics, 2nd ed. 1. Prentice Hall, Upper Saddle River, NJ. · Zbl 0403.62001
[10] Biometrics (1951). News and Notes. Biometrics 7 449-450.
[11] Bowley, A. L. (1928). F. Y. Edgeworth’s Contributions to Mathematical Statistics. Royal Statistical Society, London. (Reprinted 1972 by Augustus M. Kelley, Clifton, NJ.) · JFM 54.0573.06
[12] Box, J. F. (1978). R. A. Fisher. The Life of a Scientist. Wiley, New York. · Zbl 0666.01016
[13] Courant, R. (1936). Differential and Integral Calculus. Nordeman, New York. · JFM 62.1165.04
[14] Cox, D. R. (2006). Principles of Statistical Inference. Cambridge Univ. Press. · Zbl 1102.62002
[15] Cramér, H. (1946). Mathematical Methods of Statistics. Princeton Univ. Press. · Zbl 0063.01014
[16] Cramér, H. (1946a). A contribution to the theory of statistical estimation. Skand. Aktuarietidskr. 29 85-94. Reprinted in H. Cramér, Collected Works 2 948-957. Springer, Berlin (1994). · Zbl 0060.30513
[17] Darnell, A. C. (1988). Harold Hotelling 1895-1973. Statist. Sci. 3 57-62. · Zbl 0955.01520
[18] Doob, J. L. (1934). Probability and statistics. Trans. Amer. Math. Soc. 36 759-775. JSTOR: · Zbl 0010.17303 · doi:10.2307/1989822 · links.jstor.org
[19] Doob, J. L. (1936). Statistical estimation. Trans. Amer. Math. Soc. 39 410-421. JSTOR: · Zbl 0014.16901 · doi:10.2307/1989759 · links.jstor.org
[20] Dugué, D. (1937). Application des propriétés de la limite au sens du calcul des probabilités à l’étude de diverses questions d’estimation. J. l’École Polytechnique 3e série (n. 4) 305-373. · Zbl 0018.03401 · eudml:192873
[21] Edwards, A. W. F. (1974). The history of likelihood. Internat. Statist. Rev. 42 9-15. JSTOR: · Zbl 0289.62006 · doi:10.2307/1402681 · links.jstor.org
[22] Edwards, A. W. F. (1997). Three early papers on efficient parametric estimation. Statist. Sci. 12 35-47. · Zbl 0955.62507 · doi:10.1214/ss/1029963260
[23] Edwards, A. W. F. (1997a). What did Fisher mean by “inverse probability” in 1912-1922? Statist. Sci. 12 177-184. · Zbl 0955.62526 · doi:10.1214/ss/1030037907
[24] Efron, B. (1975). Defining the curvature of a statistical problem (with applications to second order efficiency). Ann. Statist. 3 1189-1242. · Zbl 0321.62013 · doi:10.1214/aos/1176343282
[25] Efron, B. (1978). The geometry of exponential families. Ann. Statist. 6 362-376. · Zbl 0436.62027 · doi:10.1214/aos/1176344130
[26] Efron, B. (1982). Maximum likelihood and decision theory (The 1981 Wald Memorial Lectures). Ann. Statist. 10 340-356. · Zbl 0494.62004 · doi:10.1214/aos/1176345778
[27] Efron, B. (1998). R. A. Fisher in the 21st century (with discussion). Statist. Sci. 13 95-122. · Zbl 1074.01536 · doi:10.1214/ss/1028905930
[28] Efron, B. and Hinkley, D. V. (1978). Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information. Biometrika 65 457-482. JSTOR: · Zbl 0401.62002 · doi:10.1093/biomet/65.3.457 · links.jstor.org
[29] Fienberg, S. E. and Hinkley, D. V. eds. (1980). R. A. Fisher: An Appreciation. Springer, New York. · Zbl 0436.62002
[30] Fisher, R. A. (1912). On an absolute criterion for fitting frequency curves. Messenger of Mathematics 41 155-160; reprinted as Paper 1 in Fisher (1974); reprinted in Edwards (1997). · JFM 43.0302.01
[31] Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 10 507-521; reprinted as Paper 4 in Fisher (1974). · Zbl 0070.37304
[32] Fisher, R. A. (1920). A mathematical examination of the methods of determining the accuracy of an observation by the mean error, and by the mean square error. Mon. Notices Roy. Astron. Soc. 80 758-770; reprinted as Paper 12 in Fisher (1974).
[33] Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philos. Trans. Roy. Soc. London Ser. A 222 309-368; reprinted as Paper 18 in Fisher (1974). · JFM 48.1280.02
[34] Fisher, R. A. (1922a). On the interpretation of \(\chi^2\) from contingency tables, and the calculation of P. J. Roy. Statist. Soc. 85 87-94; reprinted as Paper 19 in Fisher (1974).
[35] Fisher, R. A. (1924). The Influence of Rainfall on the Yield of Wheat at Rothamsted. Philos. Trans. Roy. Soc. London Ser. B 213 89-142; reprinted as Paper 37 in Fisher (1974).
[36] Fisher, R. A. (1924a). Conditions under which \(\chi^2\) measures the discrepancy between observation and hypothesis. J. Roy. Statist. Soc. 87 442-450; reprinted as Paper 34 in Fisher (1974).
[37] Fisher, R. A. (1925). Theory of statistical estimation. Proc. Cambridge Philos. Soc. 22 700-725; reprinted as Paper 42 in Fisher (1974). · JFM 51.0385.01
[38] Fisher, R. A. (1931). Letter to the Editor. Amer. Math. Monthly 38 335-338.
[39] Fisher, R. A. (1935). The logic of inductive inference. J. Roy. Statist. Soc. 98 39-54; reprinted as Paper 124 in Fisher (1974). · JFM 61.1308.06
[40] Fisher, R. A. (1938). Statistical Theory of Estimation. Univ. Calcutta. · Zbl 0019.35703
[41] Fisher, R. A. (1938-1939). Review of “Lectures and Conferences on Mathematical Statistics” by J. Neyman. Science Progress 33 577.
[42] Fisher, R. A. (1950). Contributions to Mathematical Statistics. Wiley, New York. · Zbl 0040.36201
[43] Fisher, R. A. (1956). Statistical Methods and Scientific Inference. Oliver and Boyd, Edinburgh. · Zbl 0070.36903
[44] Fisher, R. A. (1974). The Collected Papers of R. A. Fisher. U. of Adelaide Press.
[45] Galton, F. (1908). Memories of my Life. Methuen, London.
[46] Gauss, C. F. (1809). Theoria Motus Corporum Coelestium. Perthes et Besser, Hamburg. Translated, 1857, as Theory of Motion of the Heavenly Bodies Moving about the Sun in Conic Sections, trans. C. H. Davis. Little, Brown; Boston. Reprinted, 1963, Dover, New York.
[47] Grove, C. C. (1930). Review of “Statistical Methods for Research Workers.” Amer. Math. Monthly 37 547-550.
[48] Hald, A. (1998). A History of Mathematical Statistics from 1750 to 1930. Wiley, New York. · Zbl 0979.01012
[49] Hald, A. (2007). A History of Parametric Statistical Inference from Bernoulli to Fisher, 1713 to 1935. Springer, New York. · Zbl 1107.01006
[50] Hinkley, D. V. (1980). Theory of statistical estimation: The 1925 paper. Pp. 85-94 in Fienberg and Hinkley (1980).
[51] Hotelling, H. (1930). The consistency and ultimate distribution of optimum statistics. Trans. Amer. Math. Soc. 32 847-859. JSTOR: · JFM 56.0451.05 · doi:10.2307/1989353 · links.jstor.org
[52] Hotelling, H. (1930a). Spaces of statistical parameters (Abstract). Bull. Amer. Math. Soc. 36 191. · JFM 56.0463.24
[53] Hotelling, H. (1951). The impact of R. A. Fisher on statistics. J. Amer. Statist. Assoc. 46 35-46. · Zbl 0042.14001 · doi:10.2307/2280091
[54] Hotelling, H. (1990). The Collected Economic Articles of Harold Hotelling. Springer, New York.
[55] Jeffreys, H. (1946). An invariant form for the prior probability in estimation problems. Proc. Roy. Soc. London Ser. A 186 453-461. · Zbl 0063.03050 · doi:10.1098/rspa.1946.0056
[56] Kass, R. E. (1989). The geometry of asymptotic inference. Statist. Sci. 4 188-219. · Zbl 0955.62513 · doi:10.1214/ss/1177012480
[57] Kass, R. E. and Vos, P. W. (1997). Geometrical Foundations of Asymptotic Inference. Wiley, New York. · Zbl 0880.62005
[58] Kendall, M. G. (1961). Daniel Bernoulli on maximum likelihood. Biometrika 48 1-18. Reprinted in 1970 in Pearson, Egon S. and Kendall, M. G. (eds.), Studies in the History of Statistics and Probability. Charles Griffin, London, pages 155-172. JSTOR: · links.jstor.org
[59] Kiefer, J. and Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many parameters. Ann. Math. Statist. 27 887-906. · Zbl 0073.14701 · doi:10.1214/aoms/1177728066
[60] Kruskal, W. H. (1980). The significance of Fisher: A review of “R. A. Fisher: The Life of a Scientist” by Joan Fisher Box. J. Amer. Statist. Assoc. 75 1019-1030.
[61] Lagrange, J.-L. (1776). Mémoire sur l’utilité de la méthode de prendre le milieu entre les résultats de plusieurs observations; dans lequel on examine les avantages de cette méthode par le calcul des probabilités, & ou l’on resoud differens problèmes relatifs à cette matière. Miscellanea Taurinensia 5 167-232. Reprinted in Lagrange (1868) 2 173-236.
[62] Lagrange, J.-L. (1868). Oeuvres de Lagrange, 2. Gauthier-Villars, Paris.
[63] Lambert, J. H. (1760). Photometria, sive de Mensura et Gradibus Luminis, Colorum et Umbrae. Detleffsen, Augsburg. (French translation 1997, L’Harmattan, Paris; English translation 2001, by David L. DiLaura, for The Illuminating Engineering Society of North America).
[64] Laplace, P. S. (1774). Mémoire sur la probabilité des causes par les évènemens. Mémoires de mathématique et de physique, presentés à l’Académie Royale des Sciences, par divers savans, & lû dans ses assemblées 6 621-656. Translated in Stigler (1986a).
[65] Lauritzen, S. L. (2002). Thiele: Pioneer in Statistics. Oxford Univ. Press. · Zbl 1027.01013 · doi:10.1093/acprof:oso/9780198509721.001.0001
[66] Le Cam, L. (1953). On some asymptotic properties of maximum likelihood estimates and related Bayes estimates. University of California Publications in Statistics 1 277-330.
[67] Le Cam, L. (1990). Maximum likelihood: An introduction. Internat. Statist. Rev. 58 153-171 [Previously issued in 1979 by the Statistics Branch of the Department of Mathematics, University of Maryland, as Lecture Notes No. 18]. · Zbl 0715.62045
[68] Littauer, S. B. and Mode, E. B. (1952). Report of the Boston Meeting of the Institute. Ann. Math. Statist. 23 155-159.
[69] Neyman, J. (1937). Outline of a theory of statistical estimation based upon the classical theory of probability. Phil. Trans. Royal Soc. London Ser. A 236 333-380. · Zbl 0017.12403 · doi:10.1098/rsta.1937.0005
[70] Neyman, J. (1938). Lectures and Conferences on Mathematical Statistics (edited by W. Edwards Deming). The Graduate School of the USDA, Washington DC. · Zbl 0018.26503
[71] Neyman, J. and Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica 16 1-32. JSTOR: · Zbl 0034.07602 · doi:10.2307/1914288 · links.jstor.org
[72] Neyman, J. (1951). Review of R. A. Fisher “Contributions to Mathematical Statistics.” The Scientific Monthly 72 406-408.
[73] Norden, R. H. (1972-1973). A survey of maximum likelihood estimation. Internat. Statist. Rev. 40 329-354, 41 39-58. · Zbl 0252.62023
[74] Pearson, K. (1896). Mathematical contributions to the theory of evolution, III: regression, heredity and panmixia. Philos. Trans. Roy. Soc. London Ser. A 187 253-318. Reprinted in Karl Pearson’s Early Statistical Papers, Cambridge: Cambridge University Press, 1956, pp. 113-178. · JFM 27.0185.01
[75] Pearson, K. and Filon, L. N. G. (1898). Mathematical contributions to the theory of evolution IV. On the probable errors of frequency constants and on the influence of random selection on variation and correlation. Philos. Trans. Roy. Soc. London Ser. A 191 229-311. Reprinted in Karl Pearson’s Early Statistical Papers, Cambridge: Cambridge University Press, 1956, pp. 179-261. · JFM 29.0192.01
[76] Porter, T. M. (2004). Karl Pearson: The Scientific Life in a Statistical Age. Princeton Univ. Press. · Zbl 1069.62001
[77] Pratt, J. W. (1976). F. Y. Edgeworth and R. A. Fisher on the efficiency of maximum likelihood estimation. Ann. Statist. 4 501-514. · Zbl 0328.62001 · doi:10.1214/aos/1176343457
[78] Rao, C. R. (1961). Asymptotic efficiency and limiting information. Proc. Fourth Berkeley Symp. Math. Statist. Probab. 1 531-546. Univ. California Press, Berkeley. · Zbl 0156.39802
[79] Rao, C. R. (1962). Efficient estimates and optimum inference procedures in large samples, with discussion. J. Roy. Statist. Soc. Ser. B 24 46-72. JSTOR: · Zbl 0138.13103 · links.jstor.org
[80] Savage, L. J. (1976). On rereading R. A. Fisher. Ann. Statist. 4 441-500. · Zbl 0325.62008 · doi:10.1214/aos/1176343456
[81] Sheynin, O. B. (1971). J. H. Lambert’s work on probability. Archive for History of Exact Sciences 7 244-256. · Zbl 0263.01017 · doi:10.1007/BF00357218
[82] Smith, K. (1916). On the ‘best’ values of the constants in frequency distributions. Biometrika 11 262-276. · Zbl 1033.62012 · doi:10.1093/biomet/88.1.167
[83] Smith, W. L. (1978). Harold Hotelling 1895-1973. Ann. Statist. 6 1173-1183. JSTOR: · doi:10.1214/aos/1176344369 · links.jstor.org
[84] Stigler, S. M. (1973). Laplace, Fisher, and the discovery of the concept of sufficiency. Biometrika 60 439-445. Reprinted in 1977 in Kendall, Maurice G. and Robin L. Plackett, eds., Studies in the History of Statistics and Probability, Vol. 2. Griffin, London, pp. 271-277. JSTOR: · Zbl 0286.01010 · links.jstor.org
[85] Stigler, S. M. (1986). The History of Statistics: The Measurement of Uncertainty Before 1900. Harvard Univ. Press, Cambridge, MA. · Zbl 0656.62005
[86] Stigler, S. M. (1986a). Laplace’s 1774 memoir on inverse probability. Statist. Sci. 1 359-378. · Zbl 0618.62002 · doi:10.1214/ss/1177013620
[87] Stigler, S. M. (1997). Daniel Bernoulli, Leonhard Euler, and Maximum Likelihood. In Festschrift for Lucien Le Cam (D. Pollard, E. Torgersen and G. L. Yang, eds.) 345-367. Springer, New York. Extensively revised and reprinted as Chapter 16 of Stigler (1999). · Zbl 0884.01015
[88] Stigler, S. M. (1999). Statistics on the Table. Harvard Univ. Press, Cambridge, MA. · Zbl 0997.62506
[89] Stigler, S. M. (1999a). The Foundations of Statistics at Stanford. Amer. Statist. 53 263-266. JSTOR: · doi:10.2307/2686107 · links.jstor.org
[90] Stigler, S. M. (2001). Ancillary history. In State of the Art in Probability and Statistics (C. M. de Gunst, C. A. J. Klaassen and A. W. van der Vaart, eds.). IMS Lecture Notes Monogr. Ser. 36 555-567. IMS, Beachwood, OH. · Zbl 1373.62013 · doi:10.1214/lnms/1215090089
[91] Stigler, S. M. (2005). Fisher in 1921. Statist. Sci. 20 32-49. · Zbl 1100.01511 · doi:10.1214/088342305000000025
[92] Stigler, S. M. (2007). Karl Pearson’s theoretical errors and the advances they inspired. · Zbl 1327.62013 · doi:10.1214/08-STS256
[93] van der Vaart, A. W. (1997). Superefficiency. In Festschrift for Lucien Le Cam (D. Pollard, E. Torgersen and G. L. Yang, eds.) 397-410. Springer, New York. · Zbl 0897.62025
[94] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press. · Zbl 0910.62001 · doi:10.1017/CBO9780511802256
[95] Wald, A. (1940). The fitting of straight lines if both variables are subject to error. Ann. Math. Statist. 11 284-300. [A summary of the main results of this article, as presented in a talk July 6, 1939, was published pp. 25-28 in Report of the Fifth Annual Research Conference on Economics and Statistics Held at Colorado Springs July 3 to 28, 1939, Cowles Commission, University of Chicago, 1939.]
[96] Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Amer. Math. Soc. 54 426-482. JSTOR: · Zbl 0063.08120 · doi:10.2307/1990256 · links.jstor.org
[97] Wald, A. (1949). Note on the consistency of the maximum likelihood estimate. Ann. Math. Statist. 20 595-601. · Zbl 0034.22902 · doi:10.1214/aoms/1177729952
[98] Yule, G. U. (1936). An Introduction to the Theory of Statistics, 10th ed. Charles Griffin, London. [This was the last edition revised by Yule himself; subsequent revisions from 1937 by M. G. Kendall were not greatly changed in emphasis.]
[99] Zabell, S. L. (1992). R. A. Fisher and the fiducial argument. Statist. Sci. 7 369-387. Reprinted in 2005 in S. L. Zabell, Symmetry and its Discontents: Essays on the History of Inductive Philosophy. Cambridge Univ. Press. · Zbl 0955.62521 · doi:10.1214/ss/1177011233
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.