Classification and categorical inputs with treed Gaussian process models. (English) Zbl 1226.62057

Summary: Recognizing the success of the treed Gaussian process (TGP) model as an interpretable and thrifty tool for nonparametric regression, we seek to extend the model to classification. Both treed models and Gaussian processes (GPs) have, separately, enjoyed great success in application to classification problems. An example of the former is Bayesian CART. In the latter, real-valued GP output may be utilized for classification via latent variables, which provides classification rules by means of a softmax function. We formulate a Bayesian model averaging scheme to combine these two models and describe a Monte Carlo method for sampling from the full posterior distribution with joint proposals for the tree topology and the GP parameters corresponding to latent variables at the leaves. We concentrate on efficient sampling of the latent variables, which is important to obtain good mixing in the expanded parameter space. The tree structure is particularly helpful for this task and also for developing an efficient scheme for handling categorical predictors, which commonly arise in classification problems. Our proposed classification TGP (CTGP) methodology is illustrated on a collection of synthetic and real data sets. We assess performance relative to existing methods and thereby show how CTGP is highly flexible, offers tractable inference, produces rules that are easy to interpret, and performs well out of sample.
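
The softmax step mentioned in the summary can be illustrated with a minimal sketch. It assumes one real-valued latent variable per class at each input and is not the authors' implementation; the function name `softmax_class_probabilities` and the latent values are hypothetical, chosen only for illustration.

```python
import numpy as np

def softmax_class_probabilities(latent):
    """Map latent real values (n_points x n_classes) to class probabilities."""
    latent = np.asarray(latent, dtype=float)
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = latent - latent.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)

# Hypothetical latent values for 3 inputs and 3 classes (not from the paper).
latent_values = np.array([[2.0, 0.5, -1.0],
                          [0.0, 0.0, 0.0],
                          [-0.3, 1.2, 1.1]])

probs = softmax_class_probabilities(latent_values)
# A simple classification rule: pick the class with the largest probability.
predicted = probs.argmax(axis=1)
```

In a sampling-based fit, such probabilities would typically be averaged over posterior draws of the latent variables before a class is assigned; the sketch shows only the mapping for a single draw.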

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62F15 Bayesian inference
65C05 Monte Carlo methods
62G08 Nonparametric regression and quantile regression
62M99 Inference from stochastic processes
05C05 Trees

Software:

R; rpart; UCI-ml; e1071; tgp; nnet

References:

[1] ABRAHAMSEN, P. (1997), ”A Review of Gaussian Random Fields and Correlation Functions”, Technical Report 917, Norwegian Computing Center, Oslo, Norway.
[2] ASUNCION, A., and NEWMAN, D.J. (2007), ”UCI Machine Learning Repository”, School of Information and Computer Sciences, University of California, Irvine, http://www.ics.uci.edu/~mlearn/MLRepository.html .
[3] BREIMAN, L., FRIEDMAN, J.H., OLSHEN, R., and STONE, C. (1984), Classification and Regression Trees, Belmont, CA: Wadsworth. · Zbl 0541.62042
[4] CHIPMAN, H.A., GEORGE, E.I., and MCCULLOCH, R.E. (1998), ”Bayesian CART Model Search (With Discussion)”, Journal of the American Statistical Association, 93, 935–960.
[5] DIMITRIADOU, E., HORNIK, K., LEISCH, F., MEYER, D., and WEINGESSEL, A. (2010), ”e1071: Misc Functions of the Department of Statistics (e1071), TU Wien”, Version 1.5-24, CRAN repository, maintained by Friedrich Leisch, obtained from http://cran.r-project.org/web/packages/e1071/e1071.pdf .
[6] FRIEDMAN, J.H. (1991), ”Multivariate Adaptive Regression Splines”, Annals of Statistics, 19(1), 1–67. · Zbl 0765.62064
[7] GILKS, W.R., and WILD, P. (1992), ”Adaptive Rejection Sampling for Gibbs Sampling”, Applied Statistics, 41, 337–348. · Zbl 0825.62407
[8] GRAMACY, R.B. (2005), Bayesian Treed Gaussian Process Models, Santa Cruz: University of California.
[9] GRAMACY, R.B. (2007), ”tgp: An R Package for Bayesian Nonstationary, Semiparametric Nonlinear Regression and Design by Treed Gaussian Process Models”, Journal of Statistical Software, 19(9), 1548–7660.
[10] GRAMACY, R.B., and LEE, H.K.H. (2009), ”Adaptive Design and Analysis of Supercomputer Experiments”, Technometrics, 51(2), 130–145.
[11] GRAMACY, R.B., and TADDY, M.A. (2008), ”tgp: Bayesian Treed Gaussian Process Models”, R Package Version 2.1-2, http://www.ams.ucsc.edu/rbgramacy/tgp.html .
[12] GRAMACY, R.B., and TADDY, M.A. (2009), ”Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp”, Version 2, an R Package for Treed Gaussian Process Models, University of Cambridge, http://www.cran.r-project.org/web/packages/tgp/vignettes/tgp2.pdf , submitted to the Journal of Statistical Software.
[13] HASTIE, T., TIBSHIRANI, R., and FRIEDMAN, J. (2001), The Elements of Statistical Learning, New York, NY: Springer-Verlag. · Zbl 0973.62007
[14] MATÉRN, B. (1986), Spatial Variation (2nd ed.), New York: Springer-Verlag. · Zbl 0608.62122
[15] NEAL, R.M. (1997), ”Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification”, Technical Report 9702, Department of Statistics, University of Toronto.
[16] NEAL, R.M. (1998), ”Regression and Classification Using Gaussian Process Priors (with Discussion)”, in Bayesian Statistics 6, eds. J.M. Bernardo, et al., Oxford: Oxford University Press, pp. 476–501. · Zbl 0974.62072
[17] QIAN, Z.G., WU, H., and WU, C.F.J. (2008), ”Gaussian Process Models for Computer Experiments with Qualitative and Quantitative Factors”, Technometrics, 50, 383–396.
[18] RASMUSSEN, C.E., and WILLIAMS, C. (2006), Gaussian Processes for Machine Learning, Cambridge, MA: MIT Press. · Zbl 1177.68165
[19] RICHARDSON, S., and GREEN, P.J. (1997), ”On Bayesian Analysis of Mixtures with an Unknown Number of Components”, Journal of the Royal Statistical Society, Series B, Methodological, 59, 731–758. · Zbl 0891.62020
[20] RIPLEY, B. (2009), ”Feed-forward Neural Networks and Multinomial Log-Linear Models”, Version 7.3-1, CRAN repository, maintained by Brian Ripley, obtained from http://cran.r-project.org/web/packages/nnet/nnet.pdf .
[21] SALTELLI, A., RATTO, M., ANDRES, T., CAMPOLONGO, F., CARIBONI, J., GATELLI, D., SAISANA, M., and TARANTOLA, S. (2008), Global Sensitivity Analysis: The Primer, Chichester, West Sussex: John Wiley & Sons. · Zbl 1161.00304
[22] R DEVELOPMENT CORE TEAM (2008), ”R: A Language and Environment for Statistical Computing”, R Foundation for Statistical Computing, Vienna, ISBN 3-900051-00-3, http://www.R-project.org .
[23] STEIN, M.L. (1999), Interpolation of Spatial Data, New York, NY: Springer. · Zbl 0924.62100
[24] THERNEAU, T.M., and ATKINSON, B. (2010), ”rpart: Recursive Partitioning”, Version 3.1-46, CRAN repository, maintained by Brian Ripley, obtained from http://cran.rproject.org/web/packages/rpart/index.html .