## Maximum likelihood estimation in the logistic regression model with a cure fraction. (English) Zbl 1274.62480

Summary: Logistic regression is widely used in medical studies to investigate the relationship between a binary response variable $$Y$$ and a set of potential predictors $$\mathbf X$$. The binary response may represent, for example, the occurrence of some outcome of interest ($$Y=1$$ if the outcome occurred and $$Y=0$$ otherwise). In this paper, we consider the problem of estimating the logistic regression model with a cure fraction. A sample of observations is said to contain a cure fraction when a proportion of the study subjects (the so-called cured individuals, as opposed to the susceptibles) cannot experience the outcome of interest. One problem that then arises is that it is usually unknown which subjects are cured and which are susceptible, unless the outcome of interest has been observed. In this setting, a logistic regression analysis of the relationship between $$\mathbf X$$ and $$Y$$ among the susceptibles is no longer straightforward. We develop a maximum likelihood estimation procedure for this problem, based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model. Then, we establish the consistency and asymptotic normality of the proposed estimator, and we conduct a simulation study to investigate its finite-sample behavior.
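To make the joint-modeling idea concrete, here is a minimal sketch of one way such a likelihood could be written. The summary does not state the model equations, so the following are assumptions made for illustration only: a latent susceptibility indicator whose probability follows a logistic model in a covariate vector `z` with coefficients `theta`, a logistic model in `x` with coefficients `beta` for the outcome among susceptibles, and $$Y=0$$ for every cured subject. Under these assumptions $$P(Y=1)$$ is the product of the two logistic probabilities. The function and parameter names are hypothetical and are not taken from the paper.

```python
import math


def expit(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cure_logistic_loglik(beta, theta, X, Z, y):
    """Log-likelihood under the ASSUMED mixture (not confirmed by the summary):

        P(susceptible | z)        = expit(theta . z)
        P(Y = 1 | x, susceptible) = expit(beta . x)
        P(Y = 1 | x, cured)       = 0

    so that P(Y = 1 | x, z) = expit(beta . x) * expit(theta . z).
    """
    eps = 1e-12  # guard against log(0) for extreme linear predictors
    total = 0.0
    for xi, zi, yi in zip(X, Z, y):
        p_event = expit(dot(beta, xi)) * expit(dot(theta, zi))
        p_event = min(max(p_event, eps), 1.0 - eps)
        # An observed event (y = 1) implies the subject is susceptible;
        # a non-event (y = 0) mixes cured and event-free susceptible subjects.
        total += math.log(p_event) if yi == 1 else math.log1p(-p_event)
    return total


# Toy data with an intercept column; the values are made up for illustration.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]]
Z = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
y = [1, 0, 1]
print(cure_logistic_loglik([0.2, 1.0], [0.5, -0.3], X, Z, y))
```

Maximizing this function over `beta` and `theta` with a generic numerical optimizer would give a maximum likelihood estimate under the assumed model. The sketch also hints at why identifiability needs to be checked: if `X` and `Z` contain the same covariates, the product of the two logistic terms is unchanged when `beta` and `theta` are swapped, so the data alone cannot tell the two parameter vectors apart without further conditions.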

### MSC:

62J12 Generalized linear models (logistic models)

62F12 Asymptotic properties of parametric estimators
