Mean and cell-mean imputation resulting in the use of a semicontinuous distribution and a mixture of semicontinuous distributions. (English) Zbl 1384.62069

Summary: Zero-inflated models are commonly used for modeling count and continuous data with excess zeros. For continuous data, inflation at one or two points other than zero has received less attention than zero inflation. In this article, inflation at an arbitrary point \(\alpha\) is presented as a semicontinuous distribution, and mean imputation of a continuous response is discussed as one cause of semicontinuous data. Inflation at two points, and more generally at \(k\) arbitrary points, and its relation to cell-mean imputation in a mixture of continuous distributions are also studied. To analyze the imputed data, a mixture of semicontinuous distributions is used. The effects of covariates on the dependent variable in a mixture of \(k\) semicontinuous distributions with inflation at \(k\) points are also investigated. Parameter estimates are obtained with the expectation-maximization (EM) algorithm. Using real data from the Iranian Households Income and Expenditure Survey (IHIES), it is shown how to obtain a proper estimate of the population variance when continuous responses that are missing at random are mean imputed.
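
As an illustration of why mean imputation yields semicontinuous data, the following minimal sketch simulates a continuous response, deletes values completely at random (a simplifying assumption, stronger than missing at random), and mean-imputes them. It is not the article's EM-based method: the helper `mean_impute`, the normal data, and the 30% missingness rate are hypothetical choices made only for this example. The imputed values form a point mass at \(\alpha\) (the observed mean), and the naive sample variance shrinks by the factor \((n_{\text{obs}}-1)/(n-1)\), which can be undone by rescaling.

```python
import random
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed entries (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    return [mu if v is None else v for v in values], mu, observed


random.seed(0)
n = 1000

# Simulated continuous response; the distribution is an arbitrary choice.
y = [random.gauss(50.0, 10.0) for _ in range(n)]

# Delete roughly 30% of values completely at random (illustrative assumption).
y_missing = [None if random.random() < 0.3 else v for v in y]

imputed, alpha, observed = mean_impute(y_missing)
n_obs = len(observed)

# Share of the imputed sample sitting exactly at alpha: the inflation at one point.
point_mass = sum(1 for v in imputed if v == alpha) / n

# The naive variance treats imputed values as real data and is too small.
naive_var = statistics.variance(imputed)
observed_var = statistics.variance(observed)

# Imputed values add nothing to the sum of squares, so rescaling the
# denominator from (n - 1) to (n_obs - 1) recovers the observed-data variance.
corrected_var = naive_var * (n - 1) / (n_obs - 1)

print(f"point mass at alpha: {point_mass:.3f}")
print(f"naive variance:      {naive_var:.2f}")
print(f"corrected variance:  {corrected_var:.2f}")
print(f"observed variance:   {observed_var:.2f}")
```

The rescaling works here only because every imputed value equals the single overall mean; with cell-mean imputation the same argument would have to be applied within each cell, which is where a mixture of semicontinuous distributions becomes the natural model.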

MSC:

62F10 Point estimation
62P20 Applications of statistics to economics
