A unified framework for fitting Bayesian semiparametric models to arbitrarily censored survival data, including spatially referenced data. (English) Zbl 1398.62266

Summary: A comprehensive, unified approach to modeling arbitrarily censored spatial survival data is presented for the three most commonly used semiparametric models: proportional hazards, proportional odds, and accelerated failure time. Unlike many other approaches, all manner of censored survival times are simultaneously accommodated, including uncensored, interval-censored, current-status, left- and right-censored, and mixtures of these. Left-truncated data are also accommodated, leading to models for time-dependent covariates. Both georeferenced (location exactly observed) and areally observed (location known up to a geographic unit such as a county) spatial locations are handled; formal variable selection makes model selection especially easy. Model fit is assessed with conditional Cox–Snell residual plots, and model choice is carried out via log pseudo marginal likelihood (LPML) and deviance information criterion (DIC). Baseline survival is modeled with a novel transformed Bernstein polynomial prior. All models are fit via a new function which calls efficient compiled C++ in the R package. The methodology is broadly illustrated with simulations and real data applications. An important finding is that proportional odds and accelerated failure time models often fit significantly better than the commonly used proportional hazards model. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
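
For orientation, the three model classes and the LPML criterion named in the summary can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code of the R package mentioned above: the function names are hypothetical, the sign convention for the linear predictor `eta` (standing for x'beta) is one common choice and may differ from the article's, and the LPML is computed with the usual harmonic-mean estimate of the conditional predictive ordinate (CPO) from posterior draws.

```python
import numpy as np

def survival_ph(s0, eta):
    """Proportional hazards: S(t|x) = S0(t) ** exp(eta)."""
    return s0 ** np.exp(eta)

def survival_po(s0, eta):
    """Proportional odds: survival odds S/(1-S) equal exp(-eta) times baseline odds."""
    w = np.exp(-eta)
    return w * s0 / (1.0 + (w - 1.0) * s0)

def survival_aft(baseline, t, eta):
    """Accelerated failure time: S(t|x) = S0(exp(eta) * t); baseline is a callable."""
    return baseline(np.exp(eta) * t)

def lpml(lik):
    """LPML from an (n_draws, n_obs) array of positive likelihood contributions.

    CPO_i is estimated by the harmonic mean over posterior draws of the
    likelihood contribution of observation i; LPML is the sum of log CPO_i.
    """
    lik = np.asarray(lik, dtype=float)
    cpo = 1.0 / np.mean(1.0 / lik, axis=0)
    return float(np.sum(np.log(cpo)))
```

With `eta = 0` each function returns the baseline survival unchanged. When comparing fitted models, a larger LPML indicates better predictive performance.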


62N01 Censored data models
62-04 Software, source code, etc. for problems pertaining to statistics
62F15 Bayesian inference
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62M30 Inference from spatial processes
62N05 Reliability and life testing


[1] Antoniak, C. E., Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems, The Annals of Statistics, 2, 1152-1174, (1974) · Zbl 0335.60034
[2] Arbia, G.; Espa, G.; Giuliani, D., A spatial analysis of health and pharmaceutical firm survival, Journal of Applied Statistics, 44, 1560-1575, (2017)
[3] Assunção, R.; Krainski, E., Neighborhood dependence in Bayesian spatial models, Biometrical Journal, 51, 851-869, (2009)
[4] Baltazar-Aban, I.; Pena, E. A., Properties of hazard-based residuals and implications in model diagnostics, Journal of the American Statistical Association, 90, 185-197, (1995) · Zbl 0820.62085
[5] Banerjee, S.; Carlin, B. P.; Gelfand, A. E., Hierarchical Modeling and Analysis for Spatial Data, (2014), Chapman and Hall/CRC Press, Boca Raton, FL
[6] Banerjee, S.; Gelfand, A. E.; Finley, A. O.; Sang, H., Gaussian predictive process models for large spatial data sets, Journal of the Royal Statistical Society, 70, 825-848, (2008) · Zbl 05563371
[7] Banerjee, S.; Wall, M. M.; Carlin, B. P., Frailty modeling for spatially correlated survival data, with application to infant mortality in Minnesota, Biostatistics, 4, 123-142, (2003) · Zbl 1142.62420
[8] Belitz, C.; Brezger, A.; Klein, N.; Kneib, T.; Lang, S.; Umlauf, N., BayesX - Software for Bayesian Inference in Structured Additive Regression Models, Version 3.0, (2015)
[9] Besag, J., Spatial interaction and the statistical analysis of lattice systems, Journal of the Royal Statistical Society, 36, 192-236, (1974) · Zbl 0327.60067
[10] Cai, B.; Lin, X.; Wang, L., Bayesian proportional hazards model for current status data with monotone splines, Computational Statistics & Data Analysis, 55, 2644-2651, (2011) · Zbl 1464.62034
[11] Chen, Y.; Hanson, T.; Zhang, J., Accelerated hazards model based on parametric families generalized with Bernstein polynomials, Biometrics, 70, 192-201, (2014) · Zbl 1419.62324
[12] Cox, D. R., Regression models and life-tables (with discussion), Journal of the Royal Statistical Society, 34, 187-220, (1972) · Zbl 0243.62041
[13] Cox, D. R.; Snell, E. J., A general definition of residuals, Journal of the Royal Statistical Society, 30, 248-275, (1968) · Zbl 0164.48903
[14] Cressie, N.; Johannesson, G., Fixed rank Kriging for very large spatial data sets, Journal of the Royal Statistical Society, 70, 209-226, (2008) · Zbl 05563351
[15] Darmofal, D., Bayesian spatial survival models for political event processes, American Journal of Political Science, 53, 241-257, (2009)
[16] Diva, U.; Dey, D. K.; Banerjee, S., Parametric models for spatially correlated survival data for individuals with multiple cancers, Statistics in Medicine, 27, 2127-2144, (2008)
[17] Ferguson, T. S., A Bayesian analysis of some nonparametric problems, The Annals of Statistics, 1, 209-230, (1973) · Zbl 0255.62037
[18] Flegal, J. M.; Hughes, J.; Vats, D., mcmcse: Monte Carlo Standard Errors for MCMC, R Package Version 1.2-1, (2016), Riverside, CA and Minneapolis, MN
[19] Geisser, S.; Eddy, W. F., A predictive approach to model selection, Journal of the American Statistical Association, 74, 153-160, (1979) · Zbl 0401.62036
[20] Ghosal, S., Convergence rates for density estimation with Bernstein polynomials, The Annals of Statistics, 29, 1264-1280, (2001) · Zbl 1043.62024
[21] Gilks, W. R.; Wild, P., Adaptive rejection sampling for Gibbs sampling, Applied Statistics, 41, 337-348, (1992) · Zbl 0825.62407
[22] Haario, H.; Saksman, E.; Tamminen, J., An adaptive Metropolis algorithm, Bernoulli, 7, 223-242, (2001) · Zbl 0989.65004
[23] Hanson, T.; Johnson, W.; Laud, P., Semiparametric inference for survival models with step process covariates, Canadian Journal of Statistics, 37, 60-79, (2009) · Zbl 1170.62078
[24] Hanson, T. E.; Jara, A.; Zhao, L., A Bayesian semiparametric temporally-stratified proportional hazards model with spatial frailties, Bayesian Analysis, 7, 147-188, (2012) · Zbl 1330.62368
[25] Henderson, R.; Shimakura, S.; Gorst, D., Modeling spatial variation in leukemia survival data, Journal of the American Statistical Association, 97, 965-972, (2002) · Zbl 1048.62102
[26] Hennerfeind, A.; Brezger, A.; Fahrmeir, L., Geoadditive survival models, Journal of the American Statistical Association, 101, 1065-1075, (2006) · Zbl 1120.62331
[27] Higdon, D.; Anderson, C. W.; Barnett, V.; Chatwin, P. C.; El-Shaarawi, A. H., Quantitative Methods for Current Environmental Issues, Space and space-time modeling using process convolutions, 37-56, (2002), Springer, New York
[28] Jerrett, M.; Burnett, R. T.; Beckerman, B. S.; Turner, M. C.; Krewski, D.; Thurston, G.; Martin, R. V.; van Donkelaar, A.; Hughes, E.; Shi, Y.; Gapstur, S. M.; Thun, M. J.; Pope, C. A., Spatial analysis of air pollution and mortality in California, American Journal of Respiratory and Critical Care Medicine, 188, 593-599, (2013)
[29] Kalbfleisch, J., Non-parametric Bayesian analysis of survival time data, Journal of the Royal Statistical Society, 40, 214-221, (1978) · Zbl 0387.62030
[30] Kammann, E. E.; Wand, M. P., Geoadditive models, Applied Statistics, 52, 1-18, (2003) · Zbl 1111.62346
[31] Kaufman, C. G.; Schervish, M. J.; Nychka, D. W., Covariance tapering for likelihood-based estimation in large spatial data sets, Journal of the American Statistical Association, 103, 1545-1555, (2008) · Zbl 1286.62072
[32] Kneib, T., Mixed model-based inference in geoadditive hazard regression for interval-censored survival times, Computational Statistics & Data Analysis, 51, 777-792, (2006) · Zbl 1157.62506
[33] Kneib, T.; Fahrmeir, L., A mixed model approach for geoadditive hazard regression, Scandinavian Journal of Statistics, 34, 207-228, (2007) · Zbl 1142.62073
[34] Komárek, A., Accelerated failure time models for multivariate interval-censored data with flexible distributional assumptions, (2006)
[35] Komárek, A.; Lesaffre, E., Bayesian accelerated failure time model with multivariate doubly-interval-censored data and flexible distributional assumptions, Journal of the American Statistical Association, 103, 523-533, (2008) · Zbl 1469.62373
[36] The regression analysis of correlated interval-censored data illustration using accelerated failure time models with flexible distributional assumptions, Statistical Modelling, 9, 299-319, (2009)
[37] Lavine, M., Some aspects of Polya tree distributions for statistical modelling, The Annals of Statistics, 20, 1222-1235, (1992) · Zbl 0765.62005
[38] Lavine, M. L.; Hodges, J. S., On rigorous specification of ICAR models, The American Statistician, 66, 42-49, (2012)
[39] Li, J.; Hong, Y.; Thapa, R.; Burkhart, H. E., Survival analysis of loblolly pine trees with spatially correlated random effects, Journal of the American Statistical Association, 110, 486-502, (2015)
[40] Li, L.; Hanson, T.; Zhang, J., Spatial extended hazard model with application to prostate cancer survival, Biometrics, 71, 313-322, (2015) · Zbl 1390.62283
[41] Li, Y.; Lin, X., Semiparametric normal transformation models for spatially correlated survival data, Journal of the American Statistical Association, 101, 591-603, (2006) · Zbl 1119.62376
[42] Li, Y.; Ryan, L., Modeling spatial survival data using semiparametric frailty models, Biometrics, 58, 287-297, (2002) · Zbl 1209.62222
[43] Lin, X.; Cai, B.; Wang, L.; Zhang, Z., A Bayesian proportional hazards model for general interval-censored data, Lifetime Data Analysis, 21, 470-490, (2015) · Zbl 1322.62133
[44] Lin, X.; Wang, L., Bayesian proportional odds models for analyzing current status data: univariate, clustered, and multivariate, Communications in Statistics-Simulation and Computation, 40, 1171-1181, (2011) · Zbl 1227.62015
[45] Liu, Y.; Sun, D.; He, C. Z., A hierarchical conditional autoregressive model for colorectal cancer survival data, Wiley Interdisciplinary Reviews: Computational Statistics, 6, 37-44, (2014)
[46] Martins, R.; Silva, G. L.; Andreozzi, V., Bayesian joint modeling of longitudinal and spatial survival AIDS data, Statistics in Medicine, 35, 3368-3384, (2016)
[47] Martins, T. G.; Simpson, D.; Lindgren, F.; Rue, H., Bayesian computing with INLA: new features, Computational Statistics & Data Analysis, 67, 68-83, (2013) · Zbl 1471.62135
[48] Morin, A. A., A spatial analysis of forest fire survival and a marked cluster process for simulating fire load, (2014)
[49] Müller, P.; Quintana, F.; Jara, A.; Hanson, T., Bayesian Nonparametric Data Analysis, (2015), Springer-Verlag, New York · Zbl 1333.62003
[50] Murray, R. P.; Anthonisen, N. R.; Connett, J. E.; Wise, R. A.; Lindgren, P. G.; Greene, P. G.; Nides, M. A., Effects of multiple attempts to quit smoking and relapses to smoking on pulmonary function, Journal of Clinical Epidemiology, 51, 1317-1326, (1998)
[51] Ojiambo, P.; Kang, E., Modeling spatial frailties in survival analysis of cucurbit downy mildew epidemics, Phytopathology, 103, 216-227, (2013)
[52] O’Quigley, J.; Xu, R.; Armitage, P.; Colton, T., Encyclopedia of Biostatistics, Goodness of fit in survival analysis, 1-14, (2005), John Wiley & Sons, Ltd., New York
[53] Paciorek, C., Technical vignette 5: understanding intrinsic Gaussian Markov random field spatial models, including intrinsic conditional autoregressive models, (2009)
[54] Pan, C.; Cai, B.; Wang, L.; Lin, X., Bayesian semiparametric model for spatially correlated interval-censored survival data, Computational Statistics & Data Analysis, 74, 198-209, (2014) · Zbl 06983937
[55] ICBayes: Bayesian Semiparametric Models for Interval-Censored Data, R Package Version 1.0, (2015)
[56] Petrone, S., Random Bernstein polynomials, Scandinavian Journal of Statistics, 26, 373-393, (1999) · Zbl 0939.62046
[57] Petrone, S.; Wasserman, L., Consistency of Bernstein polynomial posteriors, Journal of the Royal Statistical Society, 64, 79-100, (2002) · Zbl 1015.62033
[58] Plummer, M.; Best, N.; Cowles, K.; Vines, K., CODA: convergence diagnosis and output analysis for MCMC, R News, 6, 7-11, (2006)
[59] Prentice, R. L.; Kalbfleisch, J. D., Hazard rate models with covariates, Biometrics, 35, 25-39, (1979) · Zbl 0414.62052
[60] Sang, H.; Huang, J. Z., A full scale approximation of covariance functions for large spatial data sets, Journal of the Royal Statistical Society, 74, 111-132, (2012) · Zbl 1411.62274
[61] Sargent, D. J.; Hodges, J. S.; Carlin, B. P., Structured Markov chain Monte Carlo, Journal of Computational and Graphical Statistics, 9, 217-234, (2000)
[62] Schnell, P.; Bandyopadhyay, D.; Reich, B. J.; Nunn, M., A marginal cure rate proportional hazards model for spatial survival data, Journal of the Royal Statistical Society, 64, 673-691, (2015)
[63] Spiegelhalter, D. J.; Best, N. G.; Carlin, B. P.; Van Der Linde, A., Bayesian measures of model complexity and fit, Journal of the Royal Statistical Society, 64, 583-639, (2002) · Zbl 1067.62010
[64] Sun, J., The Statistical Analysis of Interval-censored Failure Time Data, (2006), Springer-Verlag, New York · Zbl 1127.62090
[65] Taylor, B. M., Spatial modelling of emergency service response times, Journal of the Royal Statistical Society, 180, 433-453, (2017)
[66] Turnbull, B. W., Nonparametric estimation of a survivorship function with doubly censored data, Journal of the American Statistical Association, 69, 169-173, (1974) · Zbl 0281.62044
[67] Umlauf, N.; Adler, D.; Kneib, T.; Lang, S.; Zeileis, A., Structured additive regression models: an R interface to BayesX, Journal of Statistical Software, 63, 1-46, (2015)
[68] Wang, L.; Lin, X., A Bayesian approach for analyzing case 2 interval-censored data under the semiparametric proportional odds model, Statistics & Probability Letters, 81, 876-883, (2011) · Zbl 1219.62054
[69] Wang, L.; McMahan, C. S.; Hudgens, M. G.; Qureshi, Z. P., A flexible, computationally efficient method for fitting the proportional hazards model to interval-censored data, Biometrics, 72, 222-231, (2016) · Zbl 1393.62105
[70] Wang, S.; Zhang, J.; Lawson, A. B., A Bayesian normal mixture accelerated failure time spatial model and its application to prostate cancer, Statistical Methods in Medical Research, 25, 793-806, (2016)
[71] Zhao, L.; Hanson, T. E., Spatially dependent Polya tree modeling for survival data, Biometrics, 67, 391-403, (2011) · Zbl 1217.62197
[72] Zhao, L.; Hanson, T. E.; Carlin, B. P., Mixtures of Polya trees for flexible spatial frailty survival modelling, Biometrika, 96, 263-276, (2009) · Zbl 1163.62079
[73] Zhou, H.; Hanson, T.; Mitra, R.; Müller, P., Nonparametric Bayesian Inference in Biostatistics, Bayesian spatial survival models, 215-246, (2015), Springer, Cham
[74] spBayesSurv: Bayesian Modeling and Analysis of Spatially Correlated Survival Data, R Package Version 1.1.3, (2018)
[75] Zhou, H.; Hanson, T.; Jara, A.; Zhang, J., Modeling county level breast cancer survival data using a covariate-adjusted frailty proportional hazards model, The Annals of Applied Statistics, 9, 43-68, (2015) · Zbl 1454.62430
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.