zbMATH — the first resource for mathematics

The emperor’s new tests. (With comments and a rejoinder). (English) Zbl 1059.62515
Summary: In the past two decades, striking examples of allegedly inferior likelihood ratio tests (LRT) have appeared in the statistical literature. These examples, which arise in multiparameter hypothesis testing problems, have several common features. In each case the null hypothesis is composite, the size α LRT is not similar and hence biased, and competing size α tests can be constructed that are less biased, or even unbiased, and that dominate the LRT in the sense of being everywhere more powerful. It is therefore asserted that in these examples and, by implication, many other testing problems, the LR criterion produces "inferior," "deficient," "undesirable," or "flawed" statistical procedures.
This message, which appears to be proliferating, is wrong. In each example it is the allegedly superior test that is flawed, not the LRT. At worst, the "superior" tests provide unwarranted and inappropriate inferences and have been deemed scientifically unacceptable by applied statisticians. This reinforces the well-documented but oft-neglected fact that the Neyman-Pearson theory desideratum of a more or most powerful size α test may be scientifically inappropriate; the same is true for the criteria of unbiasedness and α-admissibility. Although the LR criterion is not infallible, we believe that it remains a generally reasonable first option for non-Bayesian parametric hypothesis-testing problems.
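
To make the terms "not similar" and "biased" concrete, here is a minimal Python sketch of one standard textbook setting. It is an assumption chosen for illustration, not an example quoted from the summary. Take independent X1 ~ N(mu1, 1) and X2 ~ N(mu2, 1) and test the composite null "mu1 <= 0 or mu2 <= 0" against "mu1 > 0 and mu2 > 0". In this setting the size α LRT rejects when min(X1, X2) exceeds the upper-α normal quantile. The function name `lrt_rejection_probability` and the default `alpha=0.05` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1


def lrt_rejection_probability(mu1, mu2, alpha=0.05):
    """Exact rejection probability of the test that rejects when
    min(X1, X2) > z_alpha, for independent X1 ~ N(mu1, 1), X2 ~ N(mu2, 1).

    Illustrative assumption: this is the size-alpha LRT for the null
    "mu1 <= 0 or mu2 <= 0" in the known-variance normal model.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    # By independence, P(X1 > z and X2 > z) factorises into two normal tails.
    p1 = 1 - STD_NORMAL.cdf(z_alpha - mu1)
    p2 = 1 - STD_NORMAL.cdf(z_alpha - mu2)
    return p1 * p2


if __name__ == "__main__":
    # Boundary points of the null, then one point in the alternative.
    for mu in [(0.0, 0.0), (0.0, 50.0), (0.1, 0.1)]:
        print(mu, lrt_rejection_probability(*mu))
```

Under these assumptions the rejection probability equals α² at the origin but approaches α far out along the boundary of the null, so it is not constant on that boundary (the test is not similar). It also stays below α at alternatives near the origin (the test is biased). This is the property that the competing tests described in the summary are constructed to reduce.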

MSC:
62F03 Parametric hypothesis testing
62A01 Foundations and philosophical topics in statistics