zbMATH — the first resource for mathematics

Deep learning with R. (English) Zbl 1411.68002
Singapore: Springer (ISBN 978-981-13-5849-4/hbk; 978-981-13-5850-0/ebook). xxiii, 245 p. (2019).
The book comprises eight chapters and an epilogue; it is structured as a thorough introduction to the field, with plenty of examples in R that make it accessible to a wide audience, from computer scientists and statisticians to biologists.
In the first chapter the author reviews fundamental concepts of machine learning (ML), underlining the differences between ML approaches and statistics, the bias-variance trade-off, and ML-specific concepts such as overfitting, underfitting, regularisation and hyperparameter tuning. Maximum likelihood estimation methods are also covered, together with entropy-based approaches for quantifying loss.
The second chapter introduces neural networks; following an overview of the types of neural networks (which are discussed in subsequent chapters), the author presents in detail the constituent elements of neural networks, such as the algorithm for feed-forward propagation, the role and characteristics of activation functions and their derivatives, and the role of the cost function. The back-propagation algorithm is also discussed. The third chapter presents deep neural networks (DNNs); it comprises an in-depth description of the DNN algorithm and a detailed description of the keras package in R. The following two chapters (Chapter 4, “Initialisation of network parameters” and Chapter 5, “Optimisation”) focus on the key steps that drive the accuracy of DNNs (and NNs in general). Different initialisation strategies and approaches for dealing with NaNs are discussed in Chapter 4; the gradient descent approach and the problem of vanishing gradients, together with several regularisation ideas, are presented in Chapter 5. The sixth chapter discusses several other parameters in keras (such as selecting the number of epochs and introducing batch normalisation); tensorflow is also introduced.
The next two chapters focus on two types of neural networks: convolutional neural networks (CNNs) in Chapter 7 and recurrent neural networks (RNNs) in Chapter 8. For the former, the description of the convolution operation is followed by an overview of frequently used architectures, including the single-layer convolutional network and specialised architectures such as LeNet-5, AlexNet and others. In the eighth chapter the basic concepts introduced earlier (e.g. back-propagation) are presented in the RNN context; in addition, the long short-term memory (LSTM) architecture is presented from a theoretical perspective and with examples from text generation tasks. Further examples from natural language processing conclude the chapter.
The book concludes with an epilogue offering further thoughts on the topic. It also has an extensive set of curated references that provide a good starting point for further reading. The examples in R and the interleaving of theory with worked examples recommend the book as a starting point for studying deep learning.
68-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to computer science
68T05 Learning and adaptive systems in artificial intelligence