Adaptive-modal Bayesian nonparametric regression. (English) Zbl 1335.62051

Summary: We introduce a novel, Bayesian nonparametric, infinite-mixture regression model. The model has unimodal kernel (component) densities, and has covariate-dependent mixture weights that are defined by an infinite ordered-category probit regression. Based on these mixture weights, the regression model predicts a probability density that becomes increasingly unimodal as the explanatory power of the covariate (vector) increases, and increasingly multimodal as this explanatory power decreases, while allowing the explanatory power to vary from one covariate (vector) value to another. The model is illustrated and compared against many other regression models in terms of predictive performance, through the analysis of many real and simulated data sets.
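
The mechanism described in the summary can be illustrated with a minimal sketch. The summary does not give formulas, so everything concrete below is an assumption made for illustration: the weight of component j is taken to be the ordered-probit cell probability Phi((j - eta)/sigma) - Phi((j - 1 - eta)/sigma), where eta and sigma stand for covariate-dependent location and scale values; the infinite index set is truncated to j = -j_max, ..., j_max; and the kernels are normal densities with hypothetical means 3j and a common standard deviation of 0.5. Under these assumptions a small sigma (high explanatory power) puts nearly all weight on one kernel, while a large sigma spreads weight over many kernels.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(y, mean, sd):
    """Normal density with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_weights(eta, sigma, j_max=30):
    """Assumed ordered-probit weights for components j = -j_max..j_max.

    The truncation at j_max approximates the infinite index set.
    """
    return {
        j: normal_cdf((j - eta) / sigma) - normal_cdf((j - 1 - eta) / sigma)
        for j in range(-j_max, j_max + 1)
    }


def predictive_density(y, eta, sigma, spacing=3.0, kernel_sd=0.5, j_max=30):
    """Mixture density at y; kernel j is normal with hypothetical mean spacing * j."""
    weights = mixture_weights(eta, sigma, j_max)
    return sum(w * normal_pdf(y, spacing * j, kernel_sd) for j, w in weights.items())


def count_modes(eta, sigma, rel_threshold=0.01):
    """Count grid-level local maxima that exceed rel_threshold times the peak density."""
    grid = [-30.0 + 0.05 * k for k in range(1201)]
    dens = [predictive_density(y, eta, sigma) for y in grid]
    floor = rel_threshold * max(dens)
    return sum(
        1
        for i in range(1, len(dens) - 1)
        if dens[i] > dens[i - 1] and dens[i] > dens[i + 1] and dens[i] > floor
    )


# Two hypothetical covariate values: one with a small scale, one with a large scale.
for label, sigma in [("small sigma", 0.1), ("large sigma", 2.0)]:
    w = mixture_weights(eta=0.5, sigma=sigma)
    print(label, "largest weight:", round(max(w.values()), 4),
          "modes:", count_modes(eta=0.5, sigma=sigma))
```

The relative threshold in `count_modes` is a design choice of this sketch: it ignores numerically tiny bumps produced by components with negligible weight, so the count reflects visible modes only.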


62F15 Bayesian inference
62C10 Bayesian problems; characterization of Bayes procedures
62G08 Nonparametric regression and quantile regression
62J12 Generalized linear models (logistic models)
65C60 Computational problems in statistics (MSC2010)

