Hypothesis testing near singularities and boundaries. (English) Zbl 1425.62033

Typically, the asymptotic \(\chi^2\) distribution of the likelihood ratio statistic is used as an approximation in hypothesis testing. However, this approximation breaks down at, and near, boundaries and singularities of the relevant parameter space. In the present paper, the authors propose an alternative approximation to the distribution of the likelihood ratio statistic and investigate it in detail for two models arising in the setting of evolutionary trees. This includes identifying the regions of the parameter space in which the standard \(\chi^2\) approximation is sufficiently accurate. The proposed distribution depends on both the sample size and the true parameter values; in light of this, the authors also use simulations to investigate hypothesis testing in the presence of nuisance parameters.
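
The breakdown of the \(\chi^2\) approximation near a boundary can be illustrated with a toy example that is not taken from the paper. It is a minimal sketch under the following assumptions: i.i.d. observations from \(N(\mu,1)\), the parameter space restricted to \(\mu \ge 0\), and a likelihood ratio test of \(H_0\colon \mu=\mu_0\). The function names and the quantity \(d=\sqrt{n}\,\mu_0\) are introduced here for illustration only. The sketch estimates by simulation how often the nominal 5% \(\chi^2_1\) critical value is exceeded under \(H_0\).

```python
import math
import random

# 95th percentile of the chi-square distribution with 1 degree of freedom.
CHI2_1_CRIT_5PCT = 3.841458820694124


def lrt_statistic(z, d):
    """Likelihood ratio statistic for H0: mu = mu0 when mu is restricted to mu >= 0.

    z = sqrt(n) * (xbar - mu0) is standard normal under H0 (toy N(mu, 1) model).
    d = sqrt(n) * mu0 is the scaled distance of mu0 from the boundary.
    """
    if z >= -d:
        # The sample mean is non-negative, so it is the constrained MLE.
        return z * z
    # The sample mean is negative, so the constrained MLE sits at mu = 0.
    return -d * d - 2.0 * z * d


def rejection_rate(n, mu0, reps=200_000, seed=0):
    """Monte Carlo estimate of P(LRT > chi2_1 critical value) under H0."""
    rng = random.Random(seed)
    d = math.sqrt(n) * mu0
    hits = sum(
        lrt_statistic(rng.gauss(0.0, 1.0), d) > CHI2_1_CRIT_5PCT
        for _ in range(reps)
    )
    return hits / reps


if __name__ == "__main__":
    for n, mu0 in [(100, 0.0), (100, 0.1), (100, 0.5), (10_000, 0.1)]:
        rate = rejection_rate(n, mu0)
        print(f"n={n:>6}, mu0={mu0:.2f}: rejection rate {rate:.4f}")
```

In this toy model the rejection rate is about half the nominal level when \(\mu_0\) lies on the boundary and approaches 5% as \(\sqrt{n}\,\mu_0\) grows, so the null distribution of the statistic depends on both the sample size and the true parameter value.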

MSC:

62E17 Approximations to statistical distributions (nonasymptotic)
62F03 Parametric hypothesis testing
62F05 Asymptotic properties of parametric tests
92D15 Problems related to evolution
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

DLMF
