An introduction to statistical learning. With applications in R.

*(English)* Zbl 1281.62147
Springer Texts in Statistics 103. New York, NY: Springer (ISBN 978-1-4614-7137-0/hbk; 978-1-4614-7138-7/ebook). xiv, 426 p. (2013).

The book is structured in ten chapters covering tools for modeling and mining complex real-life data sets. The first chapter gives an overview of basic mathematical notions used throughout the book. The second chapter serves as a general introduction to statistical learning, describing terms that are presented in full detail in subsequent chapters; it also introduces the basic R commands and concepts needed to follow the examples that accompany the theoretical material. The third chapter presents simple and multiple linear regression, including the estimation of coefficients and the assessment of model accuracy. Other aspects, such as qualitative predictors and extensions of linear models, are also discussed. The chapter concludes with a comparison of linear regression to the k-nearest-neighbors method and with exercises in R that emphasize the important theoretical concepts. The fourth chapter presents classification as a method able to improve (under certain conditions) on the results obtained with linear regression. The authors present logistic regression and linear discriminant analysis, using Bayes’ theorem for classification. The chapter ends with a comparison of classification methods and with exercises highlighting the strengths and weaknesses of logistic regression, LDA, QDA and KNN.

The fifth chapter overviews resampling methods, including cross-validation and the bootstrap. Leave-one-out and k-fold cross-validation, as well as the bias-variance trade-off, are presented in detail and accompanied by examples in R and exercises. The sixth chapter overviews linear model selection and regularization: subset selection, shrinkage and dimension-reduction methods are discussed in detail. Particular emphasis is placed on high-dimensional data and on regression and the interpretation of results in such settings. The exercises focus on forward and backward stepwise selection, on selecting the best model using cross-validation, on ridge regression and the lasso, and on principal components regression. The seventh chapter moves beyond linearity, introducing polynomial regression as well as step and basis functions. Regression splines are presented, together with smoothing splines, and the chapter concludes with local regression and generalized additive models. The eighth chapter focuses on tree-based models, discussing the basics of decision trees and introducing ensemble methods such as bagging, random forests and boosting. The ninth chapter presents support vector machines, starting with maximal margin classifiers and classification using a separating hyperplane; support vector classifiers and support vector machines are then discussed in detail, and SVMs with more than two classes are introduced through one-versus-one and one-versus-all classifiers. The exercises are based on real-life gene expression data and introduce additional concepts such as ROC curves. The tenth chapter presents unsupervised learning, focusing on principal component analysis and clustering methods (in particular, k-means and hierarchical clustering). Although its chapters are highly interconnected, the book is built so that each can be followed independently.
The style is suitable for undergraduates and researchers alike, and the understanding of concepts is facilitated by the practical and theoretical exercises which accompany every chapter.

Reviewer: Irina Ioana Mohorianu (Norwich)

##### MSC:

- 62H30 Classification and discrimination; cluster analysis (statistical aspects)
- 62-04 Software, source code, etc. for problems pertaining to statistics
- 62J12 Generalized linear models (logistic models)
- 62H25 Factor analysis and principal components; correspondence analysis
- 62-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics
- 68T05 Learning and adaptive systems in artificial intelligence