## A consistent procedure for determining the number of clusters in regression clustering. (English) Zbl 1074.62042

Summary: An information-based criterion for determining the number of clusters in the problem of regression clustering is proposed. It is shown that, under a probabilistically structured population, the proposed criterion selects the true number of regression hyperplanes with probability one among all class-growing sequences of classifications, when the number of observations $n$ from the population increases to infinity. Results from a simulation study are also presented.
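To make the idea of an information-based criterion concrete, the following is a minimal sketch and not the criterion defined in the paper. It assumes a generic "residual sum of squares plus penalty" form, with a penalty of $\log n$ per fitted coefficient; both the form and the penalty are assumptions chosen for illustration. The names `penalized_rss` and `select_partition` are hypothetical, and the candidate classifications are supplied by the caller rather than searched for.

```python
import numpy as np

def penalized_rss(X, y, labels, penalty):
    """Per-cluster least-squares RSS, summed, plus penalty * (clusters * coefficients).

    The 'RSS + penalty' form is an illustrative assumption, not the paper's criterion.
    """
    n, p = X.shape
    clusters = np.unique(labels)
    rss = 0.0
    for c in clusters:
        mask = labels == c
        # Ordinary least-squares fit of one regression hyperplane per cluster.
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        rss += float(np.sum((y[mask] - X[mask] @ beta) ** 2))
    return rss + penalty * len(clusters) * p

def select_partition(X, y, candidates, penalty=None):
    """Return the index of the candidate labelling with the smallest score."""
    if penalty is None:
        penalty = np.log(len(y))  # assumed penalty rate, for illustration only
    scores = [penalized_rss(X, y, lab, penalty) for lab in candidates]
    return int(np.argmin(scores)), scores

# Synthetic data: two regression lines with known cluster membership.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])            # intercept and slope columns
true_labels = np.arange(n) % 2                  # alternate between the two lines
coefs = np.array([[1.0, 2.0], [5.0, -3.0]])     # (intercept, slope) per line
y = np.sum(X * coefs[true_labels], axis=1) + 0.1 * rng.normal(size=n)

# Candidate classifications with 1, 2 and 3 clusters.
one_cluster = np.zeros(n, dtype=int)
three_clusters = true_labels.copy()
three_clusters[(true_labels == 0) & (np.arange(n) % 4 == 0)] = 2  # split one true cluster

candidates = [one_cluster, true_labels, three_clusters]
best, scores = select_partition(X, y, candidates)
print("selected number of clusters:", len(np.unique(candidates[best])))
```

In this sketch the one-cluster candidate is penalized by its large residual term, while the three-cluster candidate gains almost nothing in fit and pays for the extra coefficients, so the two-cluster candidate should score lowest.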

### MSC:

- 62H30 Classification and discrimination; cluster analysis (statistical aspects)
- 62J05 Linear regression; mixed models
- 62F12 Asymptotic properties of parametric estimators

### Keywords:

Clustering; Multiple regression; Model selection; Consistency

### Software:

Algorithm 39
