The nature of statistical learning theory.

*(English)* Zbl 0833.62008
Berlin: Springer-Verlag. xv, 188 p. (1995).

The goal of this book is to describe the nature of statistical learning theory, which comprises (i) concepts describing the necessary and sufficient conditions for consistency of inference, (ii) bounds describing the generalization ability of learning machines, (iii) inductive inference for small sample sizes, and (iv) methods for implementing this new type of inference. The author's main aim is to show how abstract reasoning leads to new algorithms. He concentrates on understanding the nature of the problem and the new theory as a whole.

The book comprises an introduction and five chapters, each accompanied by informal reasoning and comments. The introduction describes the history of research on the learning problem, which can be divided into four periods. Ch. 1 is of fundamental importance for understanding the new philosophy and the concepts used to construct necessary and sufficient conditions for consistency of the learning process. Learning is treated as the problem of finding a desired dependence using a limited number of observations. Regression estimation, risk minimization, pattern recognition, and density estimation yield different formulations of the learning problem.

Ch. 2 concentrates on consistency as part of the learning theory. It describes a conceptual model for learning processes based on the Empirical Risk Minimization (ERM) inductive principle. For a better understanding of the ERM principle, theorems related to the idea of nonfalsifiability are given. Concepts that allow the construction of the consistency conditions mentioned above are discussed.
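The ERM principle mentioned here can be illustrated with a minimal sketch: among a given set of candidate functions, choose the one with the smallest average loss on the observed sample. The threshold classifiers, the 0-1 loss, the toy data, and the names `empirical_risk` and `erm_threshold` are illustrative assumptions and are not taken from the book.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (x, label) with label in {0, 1}


def empirical_risk(f: Callable[[float], int], data: Sequence[Sample]) -> float:
    """Average 0-1 loss of the classifier f on the sample."""
    return sum(f(x) != y for x, y in data) / len(data)


def erm_threshold(data: Sequence[Sample], thresholds: List[float]) -> Tuple[float, float]:
    """Pick the rule 'predict 1 if x >= t' with the smallest empirical risk.

    The finite list of thresholds plays the role of the set of functions
    over which the risk is minimized (a hypothetical choice for this sketch).
    """
    best_t, best_risk = thresholds[0], float("inf")
    for t in thresholds:
        risk = empirical_risk(lambda x, t=t: int(x >= t), data)
        if risk < best_risk:
            best_t, best_risk = t, risk
    return best_t, best_risk


# Toy sample with one noisy point at x = 0.8 (made-up data).
data = [(0.1, 0), (0.3, 0), (0.4, 0), (0.5, 1), (0.7, 1), (0.8, 0), (0.9, 1)]
t, risk = erm_threshold(data, [0.0, 0.2, 0.45, 0.6, 1.0])
print(t, risk)
```

Consistency, in this setting, asks whether the risk of the function chosen in this way approaches the smallest risk achievable in the set as the sample grows.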

Ch. 3 describes the nonasymptotic theory of bounds on the rate of convergence of learning processes. The bounds are based on two capacity concepts introduced in the conceptual model of Ch. 2: the annealed entropy and the growth function, which lead to distribution-dependent and distribution-independent bounds, respectively.
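To give a feeling for what a distribution-independent bound looks like, the sketch below evaluates one commonly quoted form for indicator (0-1 loss) functions: with probability at least 1 - eta, the risk is at most the empirical risk plus sqrt((h (ln(2l/h) + 1) - ln(eta/4)) / l). Here l is the sample size and h is the VC dimension, which controls the growth function. The review itself does not state this formula, so its exact form, the symbols, and the function names `vc_confidence` and `risk_bound` should be read as assumptions of this illustration.

```python
import math


def vc_confidence(h: int, l: int, eta: float) -> float:
    """Confidence term sqrt((h * (ln(2l/h) + 1) - ln(eta/4)) / l).

    h: VC dimension of the set of functions (assumed finite),
    l: number of observations, eta: allowed failure probability.
    """
    if not 0 < h <= l:
        raise ValueError("expected 0 < h <= l")
    if not 0.0 < eta < 1.0:
        raise ValueError("expected 0 < eta < 1")
    return math.sqrt((h * (math.log(2 * l / h) + 1) - math.log(eta / 4)) / l)


def risk_bound(emp_risk: float, h: int, l: int, eta: float = 0.05) -> float:
    """Upper bound on the risk that holds with probability at least 1 - eta."""
    return emp_risk + vc_confidence(h, l, eta)


# The bound tightens with more data and loosens with larger capacity.
print(risk_bound(0.10, h=10, l=1000))
print(risk_bound(0.10, h=10, l=10000))
print(risk_bound(0.10, h=50, l=1000))
```

Because the confidence term depends only on h, l, and eta, the bound holds whatever distribution generated the data, which is what makes it distribution-independent.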

Ch. 4 is devoted to small sample sizes. To construct inference for small sample sizes, the bounds on the generalization ability of learning machines, including those for sets of unbounded functions, are used. The MDL (Minimum Description Length) and SRM (Structural Risk Minimization) principles of inductive inference for small sample sizes are discussed from the viewpoint of algorithmic complexity.
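The idea behind SRM can be sketched as follows: given a nested structure of function sets with increasing capacity, select the element that minimizes a guaranteed risk, that is, the empirical risk plus a capacity-dependent confidence term, rather than the empirical risk alone. The sketch assumes the commonly quoted VC-type confidence term for indicator functions; the capacities, the empirical risks, and the names `guaranteed_risk` and `srm_select` are made up for illustration and do not come from the book.

```python
import math
from typing import List, Tuple


def guaranteed_risk(emp_risk: float, h: int, l: int, eta: float = 0.05) -> float:
    """Empirical risk plus the confidence term
    sqrt((h * (ln(2l/h) + 1) - ln(eta/4)) / l) (assumed form of the bound)."""
    confidence = math.sqrt((h * (math.log(2 * l / h) + 1) - math.log(eta / 4)) / l)
    return emp_risk + confidence


def srm_select(structure: List[Tuple[int, float]], l: int, eta: float = 0.05) -> int:
    """Return the index of the structure element with the smallest guaranteed risk.

    structure[k] = (h_k, emp_risk_k) for nested sets S_1, S_2, ... with
    nondecreasing capacity h_k; emp_risk_k is the minimum empirical risk in S_k.
    """
    return min(
        range(len(structure)),
        key=lambda k: guaranteed_risk(structure[k][1], structure[k][0], l, eta),
    )


# Hypothetical structure: richer sets fit the sample better but have larger capacity.
structure = [(2, 0.30), (5, 0.12), (20, 0.08), (100, 0.02)]
best = srm_select(structure, l=1000)
print(best, guaranteed_risk(structure[best][1], structure[best][0], 1000))
```

With these made-up numbers, minimizing the empirical risk alone would pick the richest set, whereas the guaranteed risk favors a set of moderate capacity; this trade-off matters most when the sample is small relative to the capacity.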

Ch. 5 describes learning algorithms for pattern recognition, including the classical concepts of neural networks. A new type of universal learning machine, the Support Vector Machine, which implements the SRM inductive principle, is introduced. Finally, the conclusion discusses some open problems of learning theory.

The book is written in a readable and concise style, with special emphasis on the practical power of abstract reasoning. It can be recommended to students, engineers, and scientists of different backgrounds, such as statisticians, mathematicians, physicists, and computer scientists.

Reviewer: R. Fahrion (Heidelberg)

##### MSC:

62B10 | Statistical aspects of information-theoretic topics |

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |

68T05 | Learning and adaptive systems in artificial intelligence |

62-01 | Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics |