
Consistency of random survival forests. (English) Zbl 1190.62177
Summary: We prove uniform consistency of Random Survival Forests (RSF), a newly introduced forest ensemble learner for the analysis of right-censored survival data. Consistency is proved under general splitting rules, bootstrapping, and random selection of variables, that is, under a true implementation of the methodology. In this setting we show that the forest ensemble survival function converges uniformly to the true population survival function. To prove this result we make one key assumption about the feature space: all variables are factors. This ensures that the feature space has finite cardinality and enables us to exploit counting-process theory and the uniform consistency of the Kaplan-Meier estimator.
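The object of the theorem, the forest ensemble survival function, is an average of Kaplan-Meier-type estimators computed over bootstrap samples. The following Python sketch illustrates only that averaging structure; it is not the authors' implementation (the real RSF additionally grows a survival tree on each bootstrap sample, with random variable selection and splitting, and computes a Kaplan-Meier curve within each terminal node, all of which this toy omits). The names `kaplan_meier` and `ensemble_survival` are hypothetical.

```python
import random

def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.
    times[i] is the observed time; events[i] is 1 for a death, 0 for
    censoring. (Simplified sketch: assumes no tied observation times.)"""
    data = sorted(zip(times, events))
    n = len(data)
    s = 1.0
    for i, (ti, di) in enumerate(data):
        if ti > t:
            break
        if di:                          # an observed death at time ti
            s *= 1.0 - 1.0 / (n - i)    # n - i subjects still at risk
    return s

def ensemble_survival(times, events, t, n_boot=200, seed=0):
    """Average Kaplan-Meier curves over bootstrap resamples.
    In the actual RSF methodology each resample would also be partitioned
    by a randomly grown survival tree; here each 'tree' is a single
    root node, so the sketch reduces to a bagged Kaplan-Meier estimator."""
    rng = random.Random(seed)
    n = len(times)
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        total += kaplan_meier([times[i] for i in idx],
                              [events[i] for i in idx], t)
    return total / n_boot
```

Roughly, the proof strategy mirrors this structure: each bootstrapped Kaplan-Meier estimator is uniformly consistent, and the finite cardinality of an all-factor feature space keeps the number of distinct terminal-node populations finite, so the averaged ensemble inherits uniform consistency.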

MSC:
62N01 Censored data models
62G20 Asymptotic properties of nonparametric inference
62P10 Applications of statistics to biology and medical sciences; meta analysis
Full Text: DOI arXiv