
A review of survival trees. (English) Zbl 1274.62648

Summary: This paper presents a non-technical account of the developments in tree-based methods for the analysis of survival data with censoring. The review describes the initial developments, which mainly extended the existing basic tree methodologies to censored data, as well as more recent work. It also covers more complex models, more specialized methods, and more specific problems such as multivariate data, time-varying covariates, discrete-scale survival data, and ensemble methods applied to survival trees. A data example illustrates some of the methods that are implemented in R.

MSC:

62N01 Censored data models
62-02 Research exposition (monographs, survey articles) pertaining to statistics

References:

[1] Ahn, H. and Loh, W.-Y. (1994). Tree-Structured Proportional Hazards Regression Modeling. Biometrics 50 , 471-485. · Zbl 0825.62772
[2] Bacchetti, P. and Segal, M. (1995). Survival Trees with Time-Dependent Covariates: Application to Estimating Changes in the Incubation Period of AIDS. Lifetime Data Analysis 1 , 35-47. · Zbl 0825.62904
[3] Benner, A. (2002). Application of “Aggregated Classifiers” in Survival Time Studies. COMPSTAT 2002 - Proceedings in Computational Statistics: 15th Symposium Held in Berlin, Germany, 2002
[4] Bou-Hamad, I., Larocque, D., Ben-Ameur, H., Mâsse, L., Vitaro, F. and Tremblay, R. (2009). Discrete-Time Survival Trees. Canadian Journal of Statistics 37 , 17-32. · Zbl 1170.62074
[5] Bou-Hamad, I., Larocque, D. and Ben-Ameur, H. (2011). Discrete-Time Survival Trees and Forests with Time-Varying Covariates: Application to Bankruptcy Data. To appear in Statistical Modeling . · Zbl 1274.62648
[6] Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California. · Zbl 0541.62042
[7] Breiman, L. (1996). Bagging Predictors. Machine Learning 24 , 123-140. · Zbl 0858.68080
[8] Breiman, L. (2001). Random Forests. Machine Learning 45 , 5-32. · Zbl 1007.68152
[9] Cho, H. and Hong, S-M. (2008). Median Regression Tree for Analysis of Censored Survival Data. Systems, Man and Cybernetics, Part A, IEEE Transactions on 38 , 715-726.
[10] Ciampi, A., Bush, R. S., Gospodarowicz, M. and Till, J. E. (1981). An Approach to Classifying Prognostic Factors Related to Survival Experience for Non-Hodgkin’s Lymphoma Patients: Based on a Series of 982 Patients: 1967-1975. Cancer 47 , 621-627.
[11] Ciampi, A., Thiffault, J., Nakache, J.-P. and Asselain, B. (1986). Stratification by Stepwise Regression, Correspondence Analysis and Recursive Partition: A Comparison of Three Methods of Analysis for Survival Data with Covariates. Computational Statistics & Data Analysis 4 , 185-204. · Zbl 0649.62106
[12] Ciampi, A., Chang, C. H., Hogg, S. and McKinney, S. (1987). Recursive Partition: A Versatile Method for Exploratory Data Analysis in Biostatistics. Biostatistics 23-50
[13] Ciampi, A., Hogg, S. A., McKinney, S. and Thiffault, J. (1988). RECPAM: A Computer Program for Recursive Partition and Amalgamation for Censored Survival Data and Other Situations Frequently Occurring in Biostatistics. I. Methods and Program Features. Computer Methods and Programs in Biomedicine 26 , 239-256.
[14] Ciampi, A., Hogg, S. A., McKinney, S. and Thiffault, J. (1989). RECPAM: A Computer Program for Recursive Partition and Amalgamation for Censored Survival Data and Other Situations Frequently Occurring in Biostatistics. II. Applications to Data on Small Cell Carcinoma of the Lung (SCCL). Computer Methods and Programs in Biomedicine 30 , 283-296.
[15] Ciampi, A., Negassa, A. and Lou, Z. (1995). Tree-Structured Prediction for Censored Survival Data and the Cox Model. Journal of Clinical Epidemiology 48 , 675-689.
[16] Ciampi, A., Thiffault, J. and Sagman, U. (1989). RECPAM: A Computer Program for Recursive Partition and Amalgamation for Censored Survival Data and Other Situations Frequently Occurring in Biostatistics. II. Applications to Data on Small Cell Carcinoma of the Lung (SCCL). Computer Methods and Programs in Biomedicine 30 , 283-296.
[17] Dannegger, F. (2000). Tree Stability Diagnostics and Some Remedies for Instability. Statistics in Medicine 19 , 475-491.
[18] Davis, R. B. and Anderson, J. R. (1989). Exponential Survival Trees. Statistics in Medicine 8 , 947-961.
[19] Ding, Y. and Simonoff, J. S. (2010). An Investigation of Missing Data Methods for Classification Trees Applied to Binary Response Data. Journal of Machine Learning Research , 11 , 131-170. · Zbl 1242.62052
[20] Eckel, K. T., Pfahlberg, A., Gefeller, O. and Hothorn, T. (2008). Flexible Modeling of Malignant Melanoma Survival. Methods of Information in Medicine 47 , 47-55.
[21] Fan, J., Nunn, M. E. and Su, X. (2009). Multivariate Exponential Survival Trees and Their Application to Tooth Prognosis. Computational Statistics and Data Analysis 53 , 1110-1121. · Zbl 1452.62805
[22] Fan, J., Su, X.-G., Levine, R., Nunn, M. and Leblanc, M. (2006). Trees for Censored Survival Data by Goodness of Split, with Application to Tooth Prognosis. Journal of the American Statistical Association 101 , 959-967. · Zbl 1120.62328
[23] Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. Wiley, New Jersey. · Zbl 0727.62096
[24] Gao, F., Manatunga, A. K. and Chen, S. (2004). Identification of Prognostic Factors with Multivariate Survival Data. Computational Statistics & Data Analysis 45 , 813-824. · Zbl 1429.62525
[25] Gao, F., Manatunga, A. K. and Chen, S. (2006). Developing Multivariate Survival Trees with a Proportional Hazards Structure. Journal of Data Science 4 , 343-356.
[26] Gordon, L. and Olshen, R. A. (1985). Tree-structured Survival Analysis. Cancer Treatment Reports 69 , 1065-1069.
[27] Graf, E., Schmoor, C., Sauerbrei, W. and Schumacher, M. (1999). Assessment and Comparisons of Prognostic Classification Schemes for Survival Data. Statistics in Medicine 18 , 2529-2545.
[28] Hammer, P. L. and Bonates, T. O. (2006). Logical Analysis of Data-An Overview: From Combinatorial Optimization to Medical Applications. Annals of Operations Research 148 , 203-225. · Zbl 1104.92034
[29] Harrell, F., Califf, R., Pryor, D., Lee, K. and Rosati, R. (1982). Evaluating the Yield of Medical Tests. Journal of the American Medical Association 247 , 2543-2546.
[30] Hothorn, T., Lausen, B., Benner, A. and Radespiel-Tröger, M. (2004). Bagging Survival Trees. Statistics in Medicine 23 , 77-91.
[31] Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A. M. and van der Laan, M. J. (2006). Survival Ensembles. Biostatistics 7 , 355-373. · Zbl 1170.62385
[32] Huang, X., Chen, S. and Soong, S. (1998). Piecewise Exponential Survival Trees with Time-Dependent Covariates. Biometrics , 54 , 1420-1433. · Zbl 1058.62558
[33] Ishwaran, H., Blackstone, E. H., Pothier, C. E. and Lauer, M. S. (2004). Relative Risk Forests for Exercise Heart Rate Recovery as a Predictor of Mortality. Journal of the American Statistical Association 99 , 591-600. · Zbl 1117.62362
[34] Ishwaran, H., Kogalur, U. B., Blackstone, E. H. and Lauer, M. S. (2008). Random Survival Forests. Annals of Applied Statistics 2 , 841-860. · Zbl 1149.62331
[35] Ishwaran, H. and Kogalur, U. B. (2010a). Consistency of Random Survival Forests. Statistics and Probability Letters 80 , 1056-1064. · Zbl 1190.62177
[36] Ishwaran, H. and Kogalur, U. B. (2010b). Random Survival Forests, R package version 3.6.3. · Zbl 1190.62177
[37] Ishwaran, H., Kogalur, U. B., Gorodeski, E. Z., Minn, A. J. and Lauer, M. S. (2010). High Dimensional Variable Selection for Survival Data. Journal of the American Statistical Association 105 , 205-217. · Zbl 1397.62220
[38] Jin, H., Lu, Y., Stone, K. and Black, D. M. (2004). Alternative Tree-Structured Survival Analysis Based on Variance of Survival Time. Medical Decision Making 24 , 670-680.
[39] Keles, S. and Segal, M. R. (2002). Residual-Based Tree-Structured Survival Analysis. Statistics in Medicine 21 , 313-326.
[40] Krętowska, M. (2004). Dipolar Regression Trees in Survival Analysis. Biocybernetics and Biomedical Engineering 24 , 25-33.
[41] Krętowska, M. (2006). Random Forests of Dipolar Trees for Survival Prediction. Artificial Intelligence and Soft Computing - ICAISC 2006, Proceedings. Lecture Notes in Computer Science 4029 , 909-918.
[42] Krętowska, M. (2010). The Influence of Censoring for the Performance of Survival Tree Ensemble. Artificial Intelligence and Soft Computing, Pt II - ICAISC 2010, Proceedings. Lecture Notes in Artificial Intelligence 6114 , 524-531.
[43] Kronek, L. P., and Reddy, A. (2008). Logical Analysis of Survival Data: Prognostic Survival Models by Detecting High-Degree Interactions in Right-Censored Data. Bioinformatics 24 , 248-253. · Zbl 1149.62331
[44] Lausen, B., Hothorn, T., Bretz, F. and Schumacher, M. (2004). Assessment of Optimal Selected Prognostic Factors. Biometrical Journal 46 , 364-374.
[45] LeBlanc, M. and Crowley, J. (1992). Relative Risk Trees for Censored Survival Data. Biometrics 48 , 411-425.
[46] LeBlanc, M. and Crowley, J. (1993). Survival Trees by Goodness of Split. Journal of the American Statistical Association 88 , 457-467. · Zbl 0773.62071
[47] LeBlanc, M. and Crowley, J. (1995). A Review of Tree-Based Prognostic Models. Journal of Cancer Treatment and Research 75 , 113-124.
[48] Loh, W-Y. (1991). Survival Modeling Through Recursive Stratification. Computational Statistics and Data Analysis 12 , 295-313. · Zbl 0825.62855
[49] Marubini, E., Morabito, A. and Valsecchi, M. G. (1983). Prognostic Factors and Risk Groups: Some Results Given by Using an Algorithm Suitable for Censored Survival Data. Statistics in Medicine 2 , 295-303.
[50] Molinaro, A. M., Dudoit, S. and van der Laan, M. J. (2004). Tree-based Multivariate Regression and Density Estimation with Right-censored Data. Journal of Multivariate Analysis 90 , 154-177. · Zbl 1048.62046
[51] Morgan, J. and Sonquist, J. (1963). Problems in the Analysis of Survey Data and a Proposal. Journal of the American Statistical Association 58 , 415-434. · Zbl 0114.10103
[52] Negassa, A., Ciampi, A., Abrahamowicz, M., Shapiro, S. and Boivin, J.-F. (2000). Tree-Structured Prognostic Classification for Censored Survival Data: Validation of Computationally Inexpensive Model Selection Criteria. Journal of Statistical Computation and Simulation 67 , 289-318. · Zbl 0961.62086
[53] Negassa, A., Ciampi, A., Abrahamowicz, M., Shapiro, S. and Boivin, J.-F. (2005). Tree-Structured Subgroup Analysis for Censored Survival Data: Validation of Computationally Inexpensive Model Selection Criteria. Statistics and Computing 15 , 231-239. · Zbl 0961.62086
[54] Peters, A. and Hothorn, T. (2009). ipred: Improved Predictors. R package version 0.8-8.
[55] R Development Core Team (2010). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
[56] Radespiel-Tröger, M., Rabenstein, T., Schneider, H. T. and Lausen, B. (2003). Comparison of Tree-based Methods for Prognostic Stratification of Survival Data. Artificial Intelligence in Medicine 28 , 323-341.
[57] Radespiel-Tröger, M., Gefeller, O., Rabenstein, T. and Hothorn, T. (2006). Association Between Split Selection Instability and Predictive Error in Survival Trees. Methods of Information in Medicine 45 , 548-556.
[58] Ridgeway, G. (1999). The State of Boosting. Computing Science and Statistics 31 , 172-181.
[59] Rokach, L. (2008). Taxonomy for Characterizing Ensemble Methods in Classification Tasks: A Review and Annotated Bibliography. Computational Statistics and Data Analysis 53 , 4046-4072. · Zbl 1453.62185
[60] Schlittgen, R. (1999). Regression Trees for Survival Data - an Approach to Select Discontinuous Split Points by Rank Statistics. Biometrical Journal 41 , 943-954. · Zbl 1109.62356
[61] Segal, M. R. (1988). Regression Trees for Censored Data. Biometrics 44 , 35-48. · Zbl 0707.62224
[62] Segal, M. R. (1992). Tree-Structured Methods for Longitudinal Data. Journal of the American Statistical Association 87 , 407-418.
[63] Siroky, D.S. (2009). Navigating Random Forests and Related Advances in Algorithmic Modeling. Statistics Surveys 3 , 147-163. · Zbl 1190.62100
[64] Su, X. and Fan, J. (2004). Multivariate Survival Trees: A Maximum Likelihood Approach Based on Frailty Models. Biometrics 60 , 93-99. · Zbl 1130.62386
[65] Su, X. and Tsai, C.-L. (2005). Tree-augmented Cox Proportional Hazards Models. Biostatistics 6 , 486-499. · Zbl 1071.62111
[66] Therneau, T., Grambsch, P. and Fleming, T. (1990). Martingale-Based Residuals for Survival Models. Biometrika 77 , 147-160. · Zbl 0692.62082
[67] Therneau, T. M. and Atkinson, B. (2010). R port by Brian Ripley. rpart: Recursive Partitioning. R package version 3.1-46. http://CRAN.R-project.org/package=rpart.
[68] Tsai, C., Chen, D.-T., Chen, J., Balch, C. M., Thompson, J. and Soong, S.-J. (2007). An Integrated Tree-Based Classification Approach to Prognostic Grouping with Application to Localized Melanoma Patients. Journal of Biopharmaceutical Statistics 17 , 445-460.
[69] Verikas, A., Gelzinis, A. and Bacauskiene, M. (2011). Mining Data With Random Forests: A Survey and Results of New Tests. Pattern Recognition 44 , 330-349.
[70] Wallace, M. L., Anderson, S. J. and Mazumdar, S. (2010). A Stochastic Multiple Imputation Algorithm for Missing Covariate Data in Tree-Structured Survival Analysis. Statistics in Medicine 29 , 3004-3016.
[71] Xu, R. and Adak, S. (2001). Survival Analysis with Time-Varying Relative Risks: A Tree-Based Approach. Methods of Information in Medicine 40 , 141-147.
[72] Xu, R. and Adak, S. (2002). Survival Analysis with Time-Varying Regression Effects Using a Tree-Based Approach. Biometrics 58 , 305-315. · Zbl 1209.62341
[73] Yin, Y. and Anderson, J. (2002). Nonparametric Tree-Structured Modeling for Interval-Censored Survival Data. Joint Statistical Meeting, August 2002. 6 pages.
[74] Zhang, H.P. (1995). Splitting Criteria in Survival Trees. In Statistical Modelling: Proceedings of the 10th International Workshop on Statistical Modeling , 305-314, Springer.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.