Greedy function approximation: A gradient boosting machine. (English) Zbl 1043.62034
Summary: Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent “boosting” paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression [{\it P. Huber}, Ann. Math. Stat. 35, 73--101 (1964; Zbl 0136.39805)], and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such “TreeBoost” models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of {\it Y. Freund} and {\it R. E. Schapire} [see J. Comput. Syst. Sci. 55, 119--139 (1997; Zbl 0880.68103)] and {\it J. Friedman, T. Hastie} and {\it R. Tibshirani} [Ann. Stat. 28, 337--407 (2000; Zbl 1106.62323)] are discussed.
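
To make the function-space view concrete: at each stage the base learner $h_m$ is fit to the pseudo-residuals $\tilde y_i = -\partial L(y_i, F(x_i))/\partial F(x_i)$, evaluated at the current fit $F_{m-1}$, and the expansion is updated as $F_m = F_{m-1} + \nu h_m$. The sketch below illustrates this idea for the least-squares case (where the pseudo-residuals are simply the ordinary residuals); it is an illustration, not code from the paper. scikit-learn's DecisionTreeRegressor stands in for the regression-tree base learner, and the names and hyperparameters (fit_ls_boost, n_estimators, learning_rate, max_depth) are assumptions chosen for the example.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def fit_ls_boost(X, y, n_estimators=100, learning_rate=0.1, max_depth=3):
        # F_0: the constant minimizing squared-error loss is the mean of y.
        f0 = float(np.mean(y))
        residuals = y - f0
        trees = []
        for _ in range(n_estimators):
            # For squared-error loss, the negative functional gradient at the
            # data points is exactly the current residual vector.
            tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
            tree.fit(X, residuals)
            residuals = residuals - learning_rate * tree.predict(X)
            trees.append(tree)
        return f0, trees

    def predict_ls_boost(f0, trees, X, learning_rate=0.1):
        # Additive expansion: F(x) = F_0 + nu * sum_m h_m(x).
        return f0 + learning_rate * sum(tree.predict(X) for tree in trees)

    # Usage on synthetic data (illustrative only).
    rng = np.random.default_rng(0)
    X = rng.uniform(-3.0, 3.0, size=(500, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=500)
    f0, trees = fit_ls_boost(X, y)
    print("training MSE:", float(np.mean((predict_ls_boost(f0, trees, X) - y) ** 2)))

In the least-squares case the tree's terminal-node means already solve the per-region line search, so no separate step-size optimization is needed; the other losses discussed in the summary (least absolute deviation, Huber-M, multiclass logistic) change only the pseudo-residuals and the terminal-node values.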

MSC:
62G08 Nonparametric regression
62-07 Data analysis (statistics)
65C60 Computational problems in statistics
62K10 Statistical block designs
Full Text: DOI
References:
[1] Becker, R. A. and Cleveland, W. S. (1996). The design and control of Trellis display. J. Comput. Statist. Graphics 5 123-155.
[2] Breiman, L. (1997). Pasting bites together for prediction in large data sets and on-line. Technical report, Dept. Statistics, Univ. California, Berkeley.
[3] Breiman, L. (1999). Prediction games and arcing algorithms. Neural Comp. 11 1493-1517.
[4] Breiman, L., Friedman, J. H., Olshen, R. and Stone, C. (1983). Classification and Regression Trees. Wadsworth, Belmont, CA. · Zbl 0541.62042
[5] Copas, J. B. (1983). Regression, prediction, and shrinkage (with discussion). J. Roy. Statist. Soc. Ser. B 45 311-354. · Zbl 0532.62048 · JSTOR: http://links.jstor.org/sici?sici=0035-9246%281983%2945%3A3%3C311%3ARPAS%3E2.0.CO%3B2-T&origin=euclid
[6] Donoho, D. L. (1993). Nonlinear wavelet methods for recovery of signals, densities, and spectra from indirect and noisy data. In Different Perspectives on Wavelets. Proceedings of Symposia in Applied Mathematics (I. Daubechies, ed.) 47 173-205. Amer. Math. Soc., Providence, RI. · Zbl 0786.62094
[7] Drucker, H. (1997). Improving regressors using boosting techniques. In Proceedings of the Fourteenth International Conference on Machine Learning (D. Fisher, Jr., ed.) 107-115. Morgan Kaufmann, San Francisco.
[8] Duffy, N. and Helmbold, D. (1999). A geometric approach to leveraging weak learners. In Computational Learning Theory. Proceedings of the 4th European Conference, EuroCOLT '99 (P. Fischer and H. U. Simon, eds.) 18-33. Springer, New York. · Zbl 0997.68166 · doi:10.1016/S0304-3975(01)00083-4
[9] Freund, Y. and Schapire, R. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference 148-156. Morgan Kaufmann, San Francisco.
[10] Friedman, J. H. (1991). Multivariate adaptive regression splines (with discussion). Ann. Statist. 19 1-141. · Zbl 0765.62064 · doi:10.1214/aos/1176347963
[11] Friedman, J. H., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting (with discussion). Ann. Statist. 28 337-407. · Zbl 1106.62323 · doi:10.1214/aos/1016218223 · euclid:aos/1016218223
[12] Griffin, W. L., Fisher, N. I., Friedman, J. H., Ryan, C. G. and O’Reilly, S. (1999). Cr-Pyrope garnets in lithospheric mantle. J. Petrology 40 679-704.
[13] Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall, London. · Zbl 0747.62061
[14] Huber, P. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35 73-101. · Zbl 0136.39805 · doi:10.1214/aoms/1177703732
[15] Mallat, S. and Zhang, Z. (1993). Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Processing 41 3397-3415. · Zbl 0842.94004 · doi:10.1109/78.258082
[16] Powell, M. J. D. (1987). Radial basis functions for multivariate interpolation: a review. In Algorithms for Approximation (J. C. Mason and M. G. Cox, eds.) 143-167. Clarendon Press, Oxford. · Zbl 0638.41001
[17] Rätsch, G., Onoda, T. and Müller, K.-R. (1998). Soft margins for AdaBoost. NeuroCOLT Technical Report NC-TR-98-021.
[18] Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge Univ. Press. · Zbl 0853.62046
[19] Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature 323 533-536.
[20] Schapire, R. and Singer, Y. (1998). Improved boosting algorithms using confidence-rated predictions. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory. ACM, New York. · Zbl 0945.68194 · doi:10.1023/A:1007614523901
[21] Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer, New York. · Zbl 0833.62008
[22] Warner, J. R., Toronto, A. E., Veasey, L. R. and Stephenson, R. (1961). A mathematical model for medical diagnosis-application to congenital heart disease. J. Amer. Med. Assoc. 177 177-184.