An introduction to clustering with R. (English) Zbl 1459.62002

Behaviormetrics: Quantitative Approaches to Human Behavior 1. Singapore: Springer (ISBN 978-981-13-0552-8/hbk; 978-981-13-0553-5/ebook). xvii, 340 p. (2020).
Cluster analysis is a well-established statistical technique with applications across a wide spectrum of research fields. The present book belongs to the series “Behaviormetrics”, and thus special attention is given to applications in the social sciences. The book is primarily designed for applied scientists: theoretical derivations are omitted from the discussion, and the mathematical background needed to read it has been kept at a basic level. Moreover, the presentation is mainly based on extended examples and case studies. For every example discussed in the book, the relevant R code is presented in full detail. In addition, the R scripts are available on the authors' web page and the datasets are collected in a package. The book can be used by practitioners in the applied sciences or as a textbook for an introductory course in multivariate statistics.
The book is divided into three parts, covering the most relevant classes of clustering methods: standard methods, fuzzy methods, and model-based methods. The main techniques in each class are briefly introduced and then applied to several case studies. The following points distinguish this book from others devoted to cluster analysis.
The book places great emphasis on categorical data. This is particularly welcome, because many textbooks treat the quantitative continuous case as the default and relegate insights on how to analyze categorical data to a few lines in occasional remarks.
Considerable space is devoted to soft clustering methods, since in the behavioural sciences a soft clustering solution is often more flexible and more easily interpretable than a hard clustering.
There are some pointers to the analysis of big data.


62-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H86 Multivariate analysis and fuzziness
62P25 Applications of statistics to social sciences
62-04 Software, source code, etc. for problems pertaining to statistics