Centre and range method for fitting a linear regression model to symbolic interval data. (English) Zbl 1452.62493

Summary: This paper introduces a new approach to fitting a linear regression model to symbolic interval data. Each example of the learning set is described by a feature vector, for which each feature value is an interval. The new method fits a linear regression model on the mid-points and ranges of the interval values assumed by the variables in the learning set. The prediction of the lower and upper bounds of the interval value of the dependent variable is accomplished from its mid-point and range, which are estimated from the fitted linear regression model applied to the mid-point and range of each interval value of the independent variables. The assessment of the proposed prediction method is based on the estimation of the average behaviour of both the root mean square error and the square of the correlation coefficient in the framework of a Monte Carlo experiment. Finally, the approaches presented in this paper are applied to a real data set and their performance is compared.
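The procedure described in the summary can be sketched as follows. This is only an illustrative sketch, not the authors' implementation. It assumes that two separate ordinary least-squares models are fitted, one on the interval mid-points and one on the half-ranges, and that the predicted bounds are recovered as mid-point minus and plus half-range. The function names (`fit_centre_range`, `predict_centre_range`, `rmse`, `r_squared`) and the synthetic data are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict_ols(beta, X):
    """Apply fitted coefficients (intercept first) to the rows of X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def to_centre_range(low, up):
    """Convert interval bounds to mid-points and half-ranges (assumed convention)."""
    return (low + up) / 2.0, (up - low) / 2.0

def fit_centre_range(X_low, X_up, y_low, y_up):
    """Fit one OLS model on mid-points and a second, separate one on half-ranges."""
    Xc, Xr = to_centre_range(X_low, X_up)
    yc, yr = to_centre_range(y_low, y_up)
    return fit_ols(Xc, yc), fit_ols(Xr, yr)

def predict_centre_range(model, X_low, X_up):
    """Predict lower and upper bounds from predicted mid-point and half-range."""
    beta_c, beta_r = model
    Xc, Xr = to_centre_range(X_low, X_up)
    yc_hat = predict_ols(beta_c, Xc)
    yr_hat = predict_ols(beta_r, Xr)
    return yc_hat - yr_hat, yc_hat + yr_hat

def rmse(actual, predicted):
    """Root mean square error between observed and predicted bounds."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def r_squared(actual, predicted):
    """Square of the correlation coefficient between observed and predicted bounds."""
    return float(np.corrcoef(actual, predicted)[0, 1] ** 2)

# Synthetic interval data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 50
Xc = rng.normal(size=(n, 2))                 # mid-points of two predictors
Xr = rng.uniform(0.5, 2.0, size=(n, 2))      # half-ranges of two predictors
yc = 1.0 + 2.0 * Xc[:, 0] - Xc[:, 1] + rng.normal(scale=0.1, size=n)
yr = 0.5 + 0.3 * Xr[:, 0] + 0.1 * Xr[:, 1] + rng.normal(scale=0.01, size=n)

X_low, X_up = Xc - Xr, Xc + Xr
y_low, y_up = yc - yr, yc + yr

model = fit_centre_range(X_low, X_up, y_low, y_up)
low_hat, up_hat = predict_centre_range(model, X_low, X_up)

print("RMSE lower:", rmse(y_low, low_hat), "RMSE upper:", rmse(y_up, up_hat))
print("r^2 lower:", r_squared(y_low, low_hat), "r^2 upper:", r_squared(y_up, up_hat))
```

In this sketch the half-range model is unconstrained, so a predicted half-range could in principle be negative and give a lower bound above the upper bound; whether and how the paper guards against that is not stated in the summary. The two error measures mirror the root mean square error and squared correlation coefficient named in the summary, computed here separately for the lower and upper bounds as an assumption.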

MSC:

62J05 Linear regression; mixed models
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62-08 Computational methods for problems pertaining to statistics
