zbMATH — the first resource for mathematics

Efficient methods for the estimation of the multinomial parameter for the two-trait group testing model. (English) Zbl 1429.62186
Summary: Estimation of a single Bernoulli parameter using pooled sampling is among the oldest problems in the group testing literature. To carry out such estimation, an array of efficient estimators has been introduced, covering a wide range of situations routinely encountered in applications. More recently, there has been growing interest in using group testing to simultaneously estimate the joint probabilities of two correlated traits using a multinomial model. Unfortunately, basic estimation results, such as the maximum likelihood estimator (MLE), have not been adequately addressed in the literature for such cases. In this paper, we show that finding the MLE for this problem is equivalent to maximizing a multinomial likelihood with a restricted parameter space. A solution using the EM algorithm is presented which is guaranteed to converge to the global maximizer, even on the boundary of the parameter space. Two additional closed-form estimators are presented with the goal of minimizing the bias and/or mean squared error. The methods are illustrated by an application to the joint estimation of transmission prevalence for two strains of Potato virus Y by the aphid Myzus persicae.
62H12 Estimation in multivariate analysis
62P10 Applications of statistics to biology and medical sciences; meta analysis
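The restricted parameter space mentioned in the summary can be illustrated with a minimal sketch. It is not the paper's EM algorithm, nor either of its closed-form estimators. It assumes pools of equal size k, error-free tests, and a pool that is positive for a trait whenever at least one member carries it; the function names and the cell labels (p00, p10, p01, p11 for the individual-level joint probabilities) are hypothetical. Under these assumptions the pool-level cell probabilities can be inverted directly. By invariance of maximum likelihood, this plug-in value coincides with the MLE whenever it lies inside the parameter space, but it can also fall outside it (for example, with a negative p11), which is the situation a restricted maximization has to handle.

```python
def pool_cell_probs(p00, p10, p01, k):
    """Pool-level outcome probabilities for pools of size k (assumed model).

    Returns the probabilities that a pool is negative for both traits,
    positive for the first only, positive for the second only, and
    positive for both.
    """
    neg_both = p00 ** k                        # every member is in cell 00
    only_first = (p00 + p10) ** k - neg_both   # nobody has trait 2, someone has trait 1
    only_second = (p00 + p01) ** k - neg_both  # nobody has trait 1, someone has trait 2
    both = 1.0 - neg_both - only_first - only_second
    return neg_both, only_first, only_second, both


def plug_in_estimate(x00, x10, x01, x11, k):
    """Invert the pool-level proportions to individual-level probabilities.

    x00, x10, x01, x11 are counts of pools in each outcome cell.
    The result is NOT constrained to be a valid probability vector.
    """
    n = x00 + x10 + x01 + x11
    c = (x00 / n) ** (1.0 / k)           # estimate of p00
    a = ((x00 + x10) / n) ** (1.0 / k)   # estimate of p00 + p10
    b = ((x00 + x01) / n) ** (1.0 / k)   # estimate of p00 + p01
    return c, a - c, b - c, 1.0 - a - b + c


if __name__ == "__main__":
    # Counts chosen only for illustration; they are not data from the paper.
    print(plug_in_estimate(70, 12, 15, 3, k=5))
    # A configuration where the unrestricted inversion leaves the parameter space.
    print(plug_in_estimate(0, 50, 50, 0, k=2))
```

When the last component comes out negative, the unrestricted inversion is not a valid estimate, so the likelihood has to be maximized subject to the constraints on the individual-level probabilities.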