Continuous-time birth-death MCMC for Bayesian regression tree models.

*(English)*Zbl 07306891Summary: Decision trees are flexible models that are well suited for many statistical regression problems. In the Bayesian framework for regression trees, Markov Chain Monte Carlo (MCMC) search algorithms are required to generate samples of tree models according to their posterior probabilities. The critical component of such MCMC algorithms is to construct “good” Metropolis-Hastings steps to update the tree topology. Such algorithms frequently suffer from poor mixing and local mode stickiness; therefore, the algorithms are slow to converge. Hitherto, authors have primarily used discrete-time birth/death mechanisms for Bayesian (sums of) regression tree models to explore the tree-model space. These algorithms are efficient, in terms of computation and convergence, only if the rejection rate is low which is not always the case. We overcome this issue by developing a novel search algorithm which is based on a continuous-time birth-death Markov process. The search algorithm explores the tree-model space by jumping between parameter spaces corresponding to different tree structures. The jumps occur in continuous time corresponding to the birth-death events which are modeled as independent Poisson processes. In the proposed algorithm, the moves between models are always accepted which can dramatically improve the convergence and mixing properties of the search algorithm. We provide theoretical support of the algorithm for Bayesian regression tree models and demonstrate its performance in a simulated example.

Reviewer: Reviewer (Berlin)

##### MSC:

68T05 | Learning and adaptive systems in artificial intelligence |

##### Keywords:

Bayesian regression trees; decision trees; continuous-time MCMC; Bayesian structure learning; birth-death process; Bayesian model averaging; Bayesian model selection##### Software:

BDgraph
PDF
BibTeX
XML
Cite

\textit{R. Mohammadi} et al., J. Mach. Learn. Res. 21, Paper No. 201, 26 p. (2020; Zbl 07306891)

Full Text:
Link

##### References:

[1] | Shipra Agrawal and Navin Goyal. Analysis of thompson sampling for the multi-armed bandit problem. InConference on Learning Theory, pages 39-1, 2012. |

[2] | Timothy C Au. Random forests, decision trees, and categorical predictors: the absent levels problem.The Journal of Machine Learning Research, 19(1):1737-1766, 2018. · Zbl 06982336 |

[3] | GAŠrard Biau. Analysis of a random forests model.Journal of Machine Learning Research, 13(Apr):1063-1095, 2012. · Zbl 1283.62127 |

[4] | GÃŠrard Biau, Luc Devroye, and GÃĄbor Lugosi. Consistency of random forests and other averaging classifiers.Journal of Machine Learning Research, 9(Sep):2015-2033, 2008. · Zbl 1225.62081 |

[5] | Leo Breiman, Jerome H Friedman, Richard A Olshen, and Charles J Stone. Classification and regression trees. 1984. · Zbl 0541.62042 |

[6] | Olivier Cappé, Christian P Robert, and Tobias Rydén. Reversible jump, birth-and-death and more general continuous time markov chain monte carlo samplers.Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(3):679-700, 2003. · Zbl 1063.62133 |

[7] | Hugh A Chipman, Edward I George, and Robert E McCulloch. Bayesian cart model search. Journal of the American Statistical Association, 93(443):935-948, 1998. |

[8] | Hugh A Chipman, Edward I George, and Robert E McCulloch. Bayesian treed models. Machine Learning, 48(1):299-320, 2002. · Zbl 0998.68072 |

[9] | Hugh A Chipman, Edward I George, Robert E McCulloch, et al. Bart: Bayesian additive regression trees.The Annals of Applied Statistics, 4(1):266-298, 2010. · Zbl 1189.62066 |

[10] | David A Cohn, Zoubin Ghahramani, and Michael I Jordan. Active learning with statistical models.Journal of artificial intelligence research, 1996. |

[11] | Glenn De’ath and Katharina E Fabricius. Classification and regression trees: a powerful yet simple technique for ecological data analysis.Ecology, 81(11):3178-3192, 2000. |

[12] | David GT Denison, Bani K Mallick, and Adrian FM Smith. A bayesian cart algorithm. Biometrika, 85(2):363-377, 1998. · Zbl 1048.62502 |

[13] | Adrian Dobra and Reza Mohammadi. Loglinear model selection and human mobility.The Annals of Applied Statistics, 12(2):815-845, 2018. · Zbl 1405.62065 |

[14] | Dean Eckles and Maurits Kaptein. Thompson sampling with the online bootstrap.arXiv preprint arXiv:1410.4009, 2014. |

[15] | Dean Eckles and Maurits Kaptein. Bootstrap thompson sampling and sequential decision problems in the behavioral sciences.Sage Open, 9(2):1-12, 2019. |

[16] | John Gittins, Kevin Glazebrook, and Richard Weber.Multi-armed bandit allocation indices. John Wiley & Sons, 2011. · Zbl 1401.90257 |

[17] | Robert B Gramacy and Herbert K H Lee. Bayesian treed gaussian process models with an application to computer modeling.Journal of the American Statistical Association, 103 (483):1119-1130, 2008. · Zbl 1205.62218 |

[18] | Peter J Green. Reversible jump markov chain monte carlo computation and bayesian model determination.Biometrika, 82(4):711-732, 1995. · Zbl 0861.62023 |

[19] | Max Hinne, Alex Lenkoski, Tom Heskes, and Marcel van Gerven. Efficient sampling of gaussian graphical models using conditional bayes factors.Stat, 3(1):326-336, 2014. |

[20] | Balaji Lakshminarayanan, Daniel Roy, and Yee Whye Teh. Top-down particle filtering for bayesian decision trees. InInternational Conference on Machine Learning, pages 280-288, 2013. |

[21] | Antonio R Linero. Bayesian regression trees for high-dimensional prediction and variable selection.Journal of the American Statistical Association, 113(522):626-636, 2018. · Zbl 1398.62065 |

[22] | Brent R Logan, Rodney Sparapani, Robert E McCulloch, and Purushottam W Laud. Decision making and uncertainty quantification for individualized treatments using bayesian additive regression trees.Statistical methods in medical research, 28(4):1079-1093, 2019. |

[23] | Abdolreza Mohammadi and Ernst C Wit. Bayesian structure learning in sparse Gaussian graphical models.Bayesian Analysis, 10(1):109-138, 2015. · Zbl 1335.62056 |

[24] | Abdolreza Mohammadi, Mohammadi-Rreza Salehi-Rad, and Ernst C Wit. Using mixture of Gamma distributions for Bayesian analysis in an M/G/1 queue with optional second service.Computational Statistics, 28(2):683-700, 2013. · Zbl 1305.65064 |

[25] | Abdolreza Mohammadi, Fentaw Abegaz, Edwin van den Heuvel, and Ernst C. Wit. Bayesian modeling of Dupuytren disease using Gaussian copula graphical models.Journal of the Royal Statistical Society: Series C (Applied Statistics), 66(3):629-645, 2017a. |

[26] | Reza Mohammadi and Ernst C Wit. BDgraph: An R package for Bayesian structure learning in graphical models.Journal of Statistical Software, 89(3):1-30, 2019. |

[27] | Reza Mohammadi, Helene Massam, and Gerard Letac. The ratio of normalizing constants for bayesian graphical gaussian model selection.arXiv preprint arXiv:1706.04416, 110: 116, 2017b. |

[28] | Anantha M Prasad, Louis R Iverson, and Andy Liaw. Newer classification and regression tree techniques: bagging and random forests for ecological prediction.Ecosystems, 9(2): 181-199, 2006. |

[29] | Matthew T Pratola. Efficient metropolis-hastings proposal mechanisms for bayesian regression tree models.Bayesian analysis, 11(3):885-911, 2016. · Zbl 1357.62178 |

[30] | Matthew T Pratola, Hugh A Chipman, James R Gattiker, David M Higdon, Robert McCulloch, and William N Rust. Parallel bayesian additive regression trees.Journal of Computational and Graphical Statistics, 23(3):830-852, 2014. |

[31] | Chris Preston. Spatial birth-and-death processes.Bulletin of the International Statistical Institute, 46:371-391, 1977. |

[32] | Philipp Probst and Anne-Laure Boulesteix. To tune or not to tune the number of trees in random forest.Journal of Machine Learning Research, 18:181-1, 2017. · Zbl 06982937 |

[33] | Herbert Robbins. Some aspects of the sequential design of experiments. InHerbert Robbins Selected Papers, pages 169-177. Springer, 1985. |

[34] | Christian Robert.The Bayesian choice: from decision-theoretic foundations to computational implementation. Springer Science & Business Media, 2007. · Zbl 1129.62003 |

[35] | Daniel D Sleator, Robert E Tarjan, and William P Thurston. Rotation distance, triangulations, and hyperbolic geometry.Journal of the American Mathematical Society, 1(3): 647-681, 1988. · Zbl 0653.51017 |

[36] | Matthew Stephens. Bayesian analysis of mixture models with an unknown number of components-an alternative to reversible jump methods.Annals of statistics, pages 40- 74, 2000. · Zbl 1106.62316 |

[37] | Matthew A Taddy, Robert B Gramacy, and Nicholas G Polson. Dynamic trees for learning and design.Journal of the American Statistical Association, 106(493):109-123, 2011. · Zbl 1396.62158 |

[38] | William R Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples.Biometrika, 25(3/4):285-294, 1933. · JFM 59.1159.03 |

[39] | Nanwei Wang, Laurent Briollais, and Helene Massam. The scalable birth-death mcmc algorithm for mixed graphical model learning with application to genomic data integration. arXiv preprint arXiv:2005.04139, 2020. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.