Confidence interval construction for disease prevalence based on partial validation series.

*(English)* Zbl 1241.62070

Summary: It is desirable to estimate disease prevalence based on data collected by a gold standard test, but such a test is often limited due to cost and ethical considerations. Data with partial validation series thus become an alternative. The construction of confidence intervals for disease prevalence with such data is considered. A total of 12 methods, which are based on two Wald-type test statistics, score test statistic, and likelihood ratio test statistic, are developed. Both asymptotic and approximate unconditional confidence intervals are constructed. Two methods are employed to construct the unconditional confidence intervals: one involves inverting two one-sided tests and the other involves inverting one two-sided test. Moreover, the bootstrapping method is used. Two real data sets are used to illustrate the proposed methods. Empirical results suggest that the 12 methods largely produce satisfactory results, and the confidence intervals derived from the score test statistic and the Wald test statistic with nuisance parameters appropriately evaluated generally outperform the others in terms of coverage. If the interval location or the non-coverage at the two ends of the interval is also of concern, then the aforementioned interval based on the Wald test becomes the best choice.
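To make the setting concrete, the following is a minimal sketch of a percentile bootstrap interval for prevalence from a partial validation series. It is not the authors' procedure and does not reproduce any of the 12 methods. It assumes a double-sampling layout in which a validated subsample is classified by both the gold standard and a fallible test, the remaining subjects are classified by the fallible test only, and the validated subjects are a random subsample. The function names and all counts are hypothetical.

```python
import numpy as np


def prevalence_mle(n11, n10, n01, n00, m1, m0):
    """Point estimate of prevalence under the assumed double-sampling layout.

    n11, n10: validated, gold standard positive, fallible test +/-.
    n01, n00: validated, gold standard negative, fallible test +/-.
    m1, m0:   unvalidated, fallible test +/- only.
    """
    total = n11 + n10 + n01 + n00 + m1 + m0
    test_pos = n11 + n01 + m1
    test_neg = n10 + n00 + m0
    # P(diseased | fallible test result), estimated from the validated part.
    # A zero denominator is mapped to 0.0 here, which is a simplifying choice.
    p_dis_given_pos = n11 / (n11 + n01) if (n11 + n01) > 0 else 0.0
    p_dis_given_neg = n10 / (n10 + n00) if (n10 + n00) > 0 else 0.0
    return (test_pos / total) * p_dis_given_pos + (test_neg / total) * p_dis_given_neg


def bootstrap_percentile_ci(counts, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap interval; resamples each part with its size fixed."""
    n11, n10, n01, n00, m1, m0 = counts
    n = n11 + n10 + n01 + n00  # size of the validated subsample
    m = m1 + m0                # size of the unvalidated remainder
    rng = np.random.default_rng(seed)
    validated = rng.multinomial(n, np.array([n11, n10, n01, n00]) / n, size=n_boot)
    unvalidated_pos = rng.binomial(m, m1 / m, size=n_boot)
    estimates = [
        prevalence_mle(*row, u, m - u) for row, u in zip(validated, unvalidated_pos)
    ]
    alpha = 1.0 - level
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Hypothetical counts, chosen only for illustration.
counts = (30, 5, 10, 55, 80, 120)
estimate = prevalence_mle(*counts)
lower, upper = bootstrap_percentile_ci(counts)
print(estimate, lower, upper)
```

The percentile interval is the simplest bootstrap variant; the summary does not say which bootstrap interval the paper uses, so this choice is an assumption made for brevity.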

##### MSC:

62G15 | Nonparametric tolerance and confidence regions |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

62G09 | Nonparametric statistical resampling methods |

92C50 | Medical applications (general) |

65C60 | Computational problems in statistics (MSC2010) |

##### Software:

bootstrap

M.-L. Tang et al., Comput. Stat. Data Anal. 56, No. 5, 1200-1220 (2012; Zbl 1241.62070)

