
Consistent estimation of mixture complexity. (English) Zbl 1043.62023

Summary: The consistent estimation of mixture complexity is of fundamental importance in many applications of finite mixture models. An enormous body of literature exists regarding the application, computational issues and theoretical aspects of mixture models when the number of components is known, but estimating the unknown number of components remains an area of intense research effort. This article presents a semiparametric methodology yielding almost sure convergence of the estimated number of components to the true but unknown number of components. The scope of application is vast, as mixture models are routinely employed across the entire diverse application range of statistics, including nearly all of the social and experimental sciences.

MSC:

62G05 Nonparametric estimation
65C60 Computational problems in statistics (MSC2010)
62G07 Density estimation
60F05 Central limit and other weak theorems

References:

[1] Barron, A. R. and Cover, T. M. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37 1034-1054. · Zbl 0743.62003 · doi:10.1109/18.86996
[2] Beran, R. (1977). Minimum Hellinger distance estimates for parametric models. Ann. Statist. 5 445-463. · Zbl 0381.62028 · doi:10.1214/aos/1176343842
[3] Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ. Press. · Zbl 0786.62001
[4] Cao, R., Cuevas, A. and Fraiman, R. (1995). Minimum distance density-based estimation. Comput. Statist. Data Anal. 20 611-631. · Zbl 0875.62157 · doi:10.1016/0167-9473(94)00065-4
[5] Cao, R. and Devroye, L. (1996). The consistency of a smoothed minimum distance estimate. Scand. J. Statist. 23 405-418. · Zbl 0898.62045
[6] Chen, J. and Kalbfleisch, J. D. (1996). Penalized minimum distance estimates in finite mixture models. Canad. J. Statist. 24 167-175. · Zbl 0858.62019 · doi:10.2307/3315623
[7] Clarke, B. R. and Heathcote, C. R. (1994). Robust estimation of k-component univariate normal mixtures. Ann. Inst. Statist. Math. 46 83-93. · Zbl 0802.62039 · doi:10.1007/BF00773595
[8] Cordero-Braña, O. I. and Cutler, A. (2001). On the asymptotic properties of the minimum Hellinger estimation in the case of a mixture model. Research Report 7/01/104, Dept. Mathematics and Statistics, Utah State Univ.
[9] Cutler, A. and Cordero-Braña, O. I. (1996). Minimum Hellinger distance estimation for finite mixture models. J. Amer. Statist. Assoc. 91 1716-1723. · Zbl 0881.62035 · doi:10.2307/2291601
[10] Dacunha-Castelle, D. and Gassiat, E. (1997). The estimation of the order of a mixture model. Bernoulli 3 279-299. · Zbl 0889.62012 · doi:10.2307/3318593
[11] Dacunha-Castelle, D. and Gassiat, E. (1999). Testing the order of a model using locally conic parameterization: population mixtures and stationary ARMA processes. Ann. Statist. 27 1178-1209. · Zbl 0957.62073 · doi:10.1214/aos/1017938921
[12] Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. J. Amer. Statist. Assoc. 90 577-588. · Zbl 0826.62021 · doi:10.2307/2291069
[13] Henna, J. (1985). On estimating of the number of constituents of a finite mixture of continuous distributions. Ann. Inst. Statist. Math. 37 235-240. · Zbl 0577.62031 · doi:10.1007/BF02481094
[14] Keribin, C. (2000). Consistent estimation of the order of mixture models. Sankhyā Ser. A 62 49-62. · Zbl 1081.62516
[15] Kiefer, J. and Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Ann. Math. Statist. 27 887-906. · Zbl 0073.14701 · doi:10.1214/aoms/1177728066
[16] Leroux, B. G. (1992). Consistent estimation of a mixing distribution. Ann. Statist. 20 1350-1360. · Zbl 0763.62015 · doi:10.1214/aos/1176348772
[17] Marchette, D. J., Priebe, C. E., Rogers, G. W. and Solka, J. L. (1996). The filtered kernel estimator. Comp. Statist. 11 95-112. · Zbl 0933.62027
[18] Marron, J. S. and Schmitz, H.-P. (1992). Simultaneous density estimation of several income distributions. Econometric Theory 8 476-488.
[19] Marron, J. S. and Wand, M. P. (1992). Exact mean integrated squared error. Ann. Statist. 20 712-736. · Zbl 0746.62040 · doi:10.1214/aos/1176348653
[20] McLachlan, G. J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Appl. Statist. 36 318-324.
[21] Nolan, D. and Marron, J. S. (1989). Uniform consistency of automatic and location adaptive delta sequence estimators. Probab. Theory Related Fields 80 619-632. · Zbl 0644.62041 · doi:10.1007/BF00318909
[22] Pfanzagl, J. (1988). Consistency of maximum likelihood estimators for certain nonparametric families, in particular: mixtures. J. Statist. Plann. Inference 19 137-158. · Zbl 0656.62044 · doi:10.1016/0378-3758(88)90069-9
[23] Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York. · Zbl 0544.60045
[24] Priebe, C. E. and Marchette, D. J. (2000). Alternating kernel and mixture density estimates. Comput. Statist. Data Anal. 35 43-65. · Zbl 1142.62338 · doi:10.1016/S0167-9473(00)00003-7
[25] Redner, R. A. (1981). Note on the consistency of the maximum likelihood estimate for nonidentifiable distributions. Ann. Statist. 9 225-228. · Zbl 0453.62021 · doi:10.1214/aos/1176345353
[26] Redner, R. A. and Walker, H. F. (1984). Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev. 26 195-239. · Zbl 0536.62021 · doi:10.1137/1026034
[27] Rissanen, J. (1978). Modeling by shortest data description. Automatica 14 465-471. · Zbl 0418.93079 · doi:10.1016/0005-1098(78)90005-5
[28] Ritov, Y. and Bickel, P. J. (1990). Achieving information bounds in non- and semiparametric models. Ann. Statist. 18 925-938. · Zbl 0722.62025 · doi:10.1214/aos/1176347633
[29] Roeder, K. and Wasserman, L. (1997). Practical Bayesian density estimation using mixtures of normals. J. Amer. Statist. Assoc. 92 894-902. · Zbl 0889.62021 · doi:10.2307/2965553
[30] Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, New York. · Zbl 0617.62042
[31] Tamura, R. N. and Boos, D. D. (1986). Minimum Hellinger distance estimation for multivariate location and covariance. J. Amer. Statist. Assoc. 81 223-229. · Zbl 0601.62051 · doi:10.2307/2287994