Generalized regression trees. (English) Zbl 0825.62610
Summary: Trees are powerful tools to describe and organize knowledge. As a method of data analysis, trees are used in many disciplines to structure a priori knowledge as well as to formulate and test hypotheses. The author outlines in this paper a general approach to tree-growing whose goal is to predict a parameter of a statistical model, i.e., to develop a general framework for constructing regression trees from data in the context of a generalized regression model (GLIM). Problems of overfit bias and suggestions for further work are discussed.

62J12 Generalized linear models (logistic models)
65C99 Probabilistic methods, stochastic differential equations
62P10 Applications of statistics to biology and medical sciences; meta analysis
[1] Sonquist, J. A.; Morgan, J. N.: The detection of interaction effects. (1964) · Zbl 0114.10103
[2] Breiman, L.; Friedman, J. H.; Olshen, R. A.; Stone, C. J.: Classification and regression trees. (1984) · Zbl 0541.62042
[3] Ciampi, A.; Chang, C. H.; Hogg, S.; McKinney, S.: Recursive partition: A versatile method for exploratory data analysis in biostatistics. Biostatistics, 23-50 (1987)
[4] Ciampi, A.; Hogg, S. A.; McKinney, A.; Thiffault, J.: RECPAM: A computer program for recursive partition and amalgamation for censored survival data and other situations frequently occurring in biostatistics. I. Methods and program features. Computer Methods and Programs in Biomedicine 26, 239-256 (1988)
[5] Gueguen, A.; Nakache, J. P.: Méthode de discrimination basée sur la construction d’un arbre de décision binaire. Revue de statistique appliquée 36, 19-38 (1988)
[6] Loh, W. Y.; Vanichsetakul, N.: Tree-structured classification via generalized discriminant analysis. JASA 83, 715-728 (1988) · Zbl 0649.62055
[7] Segal, M. R.: Regression trees for censored data. Biometrics 44, 35-48 (1988) · Zbl 0707.62224
[8] Kullback, S.; Leibler, A.: On information and sufficiency. Ann. Math. Stat. 22, 79-86 (1951) · Zbl 0042.38403
[9] Matusita, K.: Decision rule, based on the distance, for the classification problem. Ann. Inst. Statist. Math. 8, 67-77 (1956) · Zbl 0073.14903
[10] Beran, R.: Minimum Hellinger distance estimator for parametric models. Ann. Statist. 5, 445-463 (1977) · Zbl 0381.62028
[11] Simpson, D. G.: Hellinger deviance tests: efficiency, breakdown points, and examples. JASA 84, 107-113 (1989)
[12] Ali, S. M.; Silvey, S. D.: A general class of coefficients of divergence of one distribution from another. J. R. Statist. Soc. 28, 131-142 (1966) · Zbl 0203.19902
[13] Luong, A.; Thompson, M. E.: Minimum distance methods based on quadratic distances for transforms. Canad. J. Statist. 15, 239-251 (1987) · Zbl 0645.62037
[14] Linhart, H.; Zucchini, W.: Model selection. (1986) · Zbl 0665.62003
[15] Rao, C. R.: On the distance between two populations. Sankhya 9, 246-248 (1948)
[16] Krzanowski, W. J.: Distance between populations using mixed continuous and categorical variables. Biometrika 70, 235-243 (1983) · Zbl 0514.62067
[17] Gabriel, K. R.: Simultaneous test procedures – some theory of multiple comparisons. Ann. Math. Stat. 40, 224-250 (1969) · Zbl 0198.23602
[18] McCullagh, P.; Nelder, J. A.: Generalized linear models. (1983) · Zbl 0588.62104
[19] Pregibon, D.: Score tests in GLIM with applications. Lecture Notes in Statistics No. 14 (1982) · Zbl 0493.62063
[20] Lawless, J. F.; Singhal, K.: Efficient screening of nonnormal regression models. Biometrics 34, 318-327 (1978)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.