zbMATH — the first resource for mathematics

An overview of semiparametric extensions of finite mixture models. (English) Zbl 1429.62272
Summary: Finite mixture models are an important tool for exploring complex data structures in many scientific areas, such as economics, epidemiology, and finance. Semiparametric mixture models, introduced as extensions of traditional finite mixture models over the past decade, have brought exciting developments in methodology, theory, and applications. In this article, we provide a selective overview of newly developed semiparametric mixture models and discuss their estimation methodologies, their theoretical properties where applicable, and some open questions. Recent developments are also discussed.

MSC:
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G08 Nonparametric regression and quantile regression
Software:
logcondens.mode
References:
[1] Al Mohamad, D. and Boumahdaf, A. (2018). Semiparametric two-component mixture models when one component is defined through linear constraints. IEEE Trans. Inform. Theory 64 795-830. · Zbl 1464.62245
[2] Allman, E. S., Matias, C. and Rhodes, J. A. (2009). Identifiability of parameters in latent structure models with many observed variables. Ann. Statist. 37 3099-3132. · Zbl 1191.62003
[3] Balabdaoui, F. (2017). Revisiting the Hodges-Lehmann estimator in a location mixture model: Is asymptotic normality good enough? Electron. J. Stat. 11 4563-4595. · Zbl 1380.62230
[4] Balabdaoui, F. and Doss, C. R. (2018). Inference for a two-component mixture of symmetric distributions under log-concavity. Bernoulli 24 1053-1071. · Zbl 1419.62059
[5] Benaglia, T., Chauveau, D. and Hunter, D. R. (2009). An EM-like algorithm for semi- and nonparametric estimation in multivariate mixtures. J. Comput. Graph. Statist. 18 505-526. · Zbl 1414.62119
[6] Bordes, L., Chauveau, D. and Vandekerkhove, P. (2007). A stochastic EM algorithm for a semiparametric mixture model. Comput. Statist. Data Anal. 51 5429-5443. · Zbl 1445.62056
[7] Bordes, L., Delmas, C. and Vandekerkhove, P. (2006). Semiparametric estimation of a two-component mixture model where one component is known. Scand. J. Stat. 33 733-752. · Zbl 1164.62331
[8] Bordes, L., Kojadinovic, I. and Vandekerkhove, P. (2013). Semiparametric estimation of a two-component mixture of linear regressions in which one component is known. Electron. J. Stat. 7 2603-2644. · Zbl 1294.62151
[9] Bordes, L., Mottelet, S. and Vandekerkhove, P. (2006). Semiparametric estimation of a two-component mixture model. Ann. Statist. 34 1204-1232. · Zbl 1112.62029
[10] Bordes, L. and Vandekerkhove, P. (2010). Semiparametric two-component mixture model with a known component: An asymptotically normal estimator. Math. Methods Statist. 19 22-41. · Zbl 1282.62068
[11] Butucea, C., Ngueyep Tzoumpe, R. and Vandekerkhove, P. (2017). Semiparametric topographical mixture models with symmetric errors. Bernoulli 23 825-862. · Zbl 1384.62211
[12] Butucea, C. and Vandekerkhove, P. (2014). Semiparametric mixtures of symmetric distributions. Scand. J. Stat. 41 227-239. · Zbl 1349.62094
[13] Cao, J. and Yao, W. (2012). Semiparametric mixture of binomial regression with a degenerate component. Statist. Sinica 22 27-46. · Zbl 1417.62054
[14] Chang, G. T. and Walther, G. (2007). Clustering with mixtures of log-concave distributions. Comput. Statist. Data Anal. 51 6242-6251. · Zbl 1445.62141
[15] Chauveau, D., Hunter, D. R. and Levine, M. (2015). Estimation for conditional independence multivariate finite mixture models. Stat. Surv. 9 1-31. · Zbl 1307.62090
[16] Chee, C.-S. and Wang, Y. (2013). Estimation of finite mixtures with symmetric components. Stat. Comput. 23 233-249. · Zbl 1322.62013
[17] Chen, H., Chen, J. and Kalbfleisch, J. D. (2004). Testing for a finite mixture model with two components. J. R. Stat. Soc. Ser. B. Stat. Methodol. 66 95-115. · Zbl 1061.62025
[18] Chen, J. and Li, P. (2009). Hypothesis test for normal mixture models: The EM approach. Ann. Statist. 37 2523-2542. · Zbl 1173.62007
[19] Dacunha-Castelle, D. and Gassiat, E. (1999). Testing the order of a model using locally conic parametrization: Population mixtures and stationary ARMA processes. Ann. Statist. 27 1178-1209. · Zbl 0957.62073
[20] Dannemann, J., Holzmann, H. and Leister, A. (2014). Semiparametric hidden Markov models: Identifiability and estimation. Comput. Statist. 6 418-425.
[21] De Castro, Y., Gassiat, É. and Le Corff, S. (2017). Consistent estimation of the filtering and marginal smoothing distributions in nonparametric hidden Markov models. IEEE Trans. Inform. Theory 63 4758-4777. · Zbl 1372.94362
[22] Dziak, J. J., Li, R., Tan, X., Shiffman, S. and Shiyko, M. P. (2015). Modeling intensive longitudinal data with mixtures of nonparametric trajectories and time-varying effects. Psychol. Methods 20 444-469.
[23] Faicel, C. (2016). Unsupervised learning of regression mixture models with unknown number of components. J. Stat. Comput. Simul. 86 2308-2334.
[24] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Monographs on Statistics and Applied Probability 66. CRC Press, London. · Zbl 0873.62037
[25] Fan, J., Zhang, C. and Zhang, J. (2001). Generalized likelihood ratio statistics and Wilks phenomenon. Ann. Statist. 29 153-193. · Zbl 1029.62042
[26] Gassiat, E. (2017). Mixtures of nonparametric components and hidden Markov models. In Handbook of Mixture Analysis (S. Frühwirth-Schnatter, G. Celeux and C. P. Robert, eds.) 343-360. CRC Press, Boca Raton, FL. · Zbl 1428.62272
[27] Gassiat, E., Cleynen, A. and Robin, S. (2016). Inference in finite state space non parametric hidden Markov models and applications. Stat. Comput. 26 61-71. · Zbl 1342.62141
[28] Gassiat, E. and Rousseau, J. (2016). Nonparametric finite translation hidden Markov models and extensions. Bernoulli 22 193-212. · Zbl 1388.62243
[29] Gassiat, E., Rousseau, J. and Vernet, E. (2018). Efficient semiparametric estimation and model selection for multidimensional mixtures. Electron. J. Stat. 12 703-740. · Zbl 06864474
[30] Hall, P. and Zhou, X.-H. (2003). Nonparametric estimation of component distributions in a multivariate mixture. Ann. Statist. 31 201-224. · Zbl 1018.62021
[31] Härdle, W., Hall, P. and Ichimura, H. (1993). Optimal smoothing in single-index models. Ann. Statist. 21 157-178. · Zbl 0770.62049
[32] Hohmann, D. and Holzmann, H. (2013). Semiparametric location mixtures with distinct components. Statistics 47 348-362. · Zbl 1440.62108
[33] Hu, H., Wu, Y. and Yao, W. (2016). Maximum likelihood estimation of the mixture of log-concave densities. Comput. Statist. Data Anal. 101 137-147. · Zbl 1466.62105
[34] Hu, H., Yao, W. and Wu, Y. (2017). The robust EM-type algorithms for log-concave mixtures of regression models. Comput. Statist. Data Anal. 111 14-26. · Zbl 1464.62092
[35] Huang, M., Li, R. and Wang, S. (2013). Nonparametric mixture of regression models. J. Amer. Statist. Assoc. 108 929-941. · Zbl 06224977
[36] Huang, M. and Yao, W. (2012). Mixture of regression models with varying mixing proportions: A semiparametric approach. J. Amer. Statist. Assoc. 107 711-724. · Zbl 1261.62036
[37] Huang, M., Li, R., Wang, H. and Yao, W. (2014). Estimating mixture of Gaussian processes by kernel smoothing. J. Bus. Econom. Statist. 32 259-270.
[38] Huang, M., Wang, S., Wang, H. and Jin, T. (2018a). Maximum smoothed likelihood estimation for a class of semiparametric Pareto mixture densities. Stat. Interface 11 31-40. · Zbl 06938678
[39] Huang, M., Wang, S., Yao, W. and Chen, Y. (2018b). Statistical inference and applications of mixture of varying coefficient models. Scand. J. Stat. 45 618-643. · Zbl 1402.62140
[40] Hunter, D. R., Wang, S. and Hettmansperger, T. P. (2007). Inference for mixtures of symmetric distributions. Ann. Statist. 35 224-251. · Zbl 1114.62035
[41] Hunter, D. R. and Young, D. S. (2012). Semiparametric mixtures of regressions. J. Nonparametr. Stat. 24 19-38. · Zbl 1241.62055
[42] Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. J. Econometrics 58 71-120. · Zbl 0816.62079
[43] Jacobs, R. A., Peng, F. and Tanner, M. A. (1997). A Bayesian approach to model selection in hierarchical mixtures-of-experts architectures. Neural Netw. 10 231-241.
[44] Lemdani, M. and Pons, O. (1999). Likelihood ratio tests in contamination models. Bernoulli 5 705-719. · Zbl 0929.62015
[45] Leroux, B. G. (1992). Consistent estimation of a mixing distribution. Ann. Statist. 20 1350-1360. · Zbl 0763.62015
[46] Levine, M., Hunter, D. R. and Chauveau, D. (2011). Maximum smoothed likelihood for multivariate mixtures. Biometrika 98 403-416. · Zbl 1215.62055
[47] Lindsay, B. G. (1983). The geometry of mixture likelihoods: A general theory. Ann. Statist. 11 86-94. · Zbl 0512.62005
[48] Ma, Y. and Yao, W. (2015). Flexible estimation of a semiparametric two-component mixture model with one parametric component. Electron. J. Stat. 9 444-474. · Zbl 1312.62044
[49] Ma, Y., Wang, S., Xu, L. and Yao, W. (2018). Semiparametric mixture regression with unspecified error distributions. Available at arXiv:1811.01117.
[50] Maiboroda, R. and Sugakova, O. (2011). Generalized estimating equations for symmetric distributions observed with admixture. Comm. Statist. Theory Methods 40 96-116. · Zbl 1208.62059
[51] McLachlan, G. J., Bean, R. W. and Jones, L. B.-T. (2006). A simple implementation of a normal mixture approach to differential gene expression in multiclass microarrays. Bioinformatics 22 1608-1615.
[52] McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley Interscience, New York. · Zbl 0963.62061
[53] Montuelle, L. and Le Pennec, E. (2014). Mixture of Gaussian regressions model with logistic weights, a penalized maximum likelihood approach. Electron. J. Stat. 8 1661-1695. · Zbl 1297.62091
[54] Nguyen, V. H. and Matias, C. (2014). On efficient estimators of the proportion of true null hypotheses in a multiple testing setup. Scand. J. Stat. 41 1167-1194. · Zbl 1305.62272
[55] Patra, R. K. and Sen, B. (2016). Estimation of a two-component mixture model with applications to multiple testing. J. R. Stat. Soc. Ser. B. Stat. Methodol. 78 869-893. · Zbl 1414.62111
[56] Pommeret, D. and Vandekerkhove, P. (2018). Semiparametric false discovery rate model Gaussianity test. Available at https://hal.archives-ouvertes.fr/hal-01868272.
[57] Roger, G. and Pol, S. (1991). Stochastic Finite Elements: A Spectral Approach. Springer, Berlin. · Zbl 0722.73080
[58] Rufibach, K. (2007). Computing maximum likelihood estimators of a log-concave density function. J. Stat. Comput. Simul. 77 561-574. · Zbl 1146.62027
[59] Song, S., Nicolae, D. L. and Song, J. (2010). Estimating the mixing proportion in a semiparametric mixture model. Comput. Statist. Data Anal. 54 2276-2283. · Zbl 1284.62395
[60] Tan, X., Shiyko, M. P., Li, R., Li, Y. and Dierker, L. (2012). A time-varying effect model for intensive longitudinal data. Psychol. Methods 17 61-77.
[61] Vandekerkhove, P. (2013). Estimation of a semiparametric mixture of regressions model. J. Nonparametr. Stat. 25 181-208. · Zbl 1297.62076
[62] von Neumann, J. (1931). Die Eindeutigkeit der Schrödingerschen Operatoren. Math. Ann. 104 570-578. · Zbl 0001.24703
[63] Walther, G. (2002). Detecting the presence of mixing with multiscale maximum likelihood. J. Amer. Statist. Assoc. 97 508-513. · Zbl 1073.62533
[64] Wang, Y. (2010). Maximum likelihood computation for fitting semiparametric mixture models. Stat. Comput. 20 75-86.
[65] Wang, S., Yao, W. and Huang, M. (2014). A note on the identifiability of nonparametric and semiparametric mixtures of GLMs. Statist. Probab. Lett. 93 41-45. · Zbl 1400.62159
[66] Wang, S., Huang, M., Wu, X. and Yao, W. (2016). Mixture of functional linear models and its application to \(\text{CO}_2 \)-GDP functional data. Comput. Statist. Data Anal. 97 1-15. · Zbl 06918488
[67] Wu, Q. and Yao, W. (2016). Mixtures of quantile regressions. Comput. Statist. Data Anal. 93 162-176. · Zbl 06918695
[68] Wu, J., Yao, W. and Xiang, S. (2017). Computation of an efficient and robust estimator in a semiparametric mixture model. J. Stat. Comput. Simul. 87 2128-2137.
[69] Xiang, S. and Yao, W. (2017). Semiparametric mixtures of regressions with single-index for model based clustering. Available at arXiv:1708.04142.
[70] Xiang, S. and Yao, W. (2018). Semiparametric mixtures of nonparametric regressions. Ann. Inst. Statist. Math. 70 131-154. · Zbl 1385.62014
[71] Xiang, S., Yao, W. and Seo, B. (2016). Semiparametric mixture: Continuous scale mixture approach. Comput. Statist. Data Anal. 103 413-425. · Zbl 1466.62218
[72] Xiang, S., Yao, W. and Wu, J. (2014). Minimum profile Hellinger distance estimation for a semiparametric mixture model. Canad. J. Statist. 42 246-267. · Zbl 1349.62108
[73] Yao, F., Fu, Y. and Lee, T. C. M. (2011). Functional mixture regression. Biostatistics 12 341-353.
[74] Young, D. S. (2014). Mixtures of regressions with changepoints. Stat. Comput. 24 265-281. · Zbl 1325.62128
[75] Young, D. S. and Hunter, D. R. (2010). Mixtures of regressions with predictor-dependent mixing proportions. Comput. Statist. Data Anal. 54 2253-2266. · Zbl 1284.62467
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.