Data clustering. Theory, algorithms, and applications. (English) Zbl 1185.68274

ASA-SIAM Series on Statistics and Applied Probability 20. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM) (ISBN 978-0-89871-623-8/pbk; 978-0-89871-834-8/ebook). xxii, 466 p. (2007).
Publisher’s description: Cluster analysis is an unsupervised process that divides a set of objects into homogeneous groups. This book starts with basic information on cluster analysis, including the classification of data and the corresponding similarity measures, followed by the presentation of over 50 clustering algorithms in groups according to some specific baseline methodologies such as hierarchical, center-based, and search-based methods. As a result, readers and users can easily identify an appropriate algorithm for their applications and compare novel ideas with existing results.
The book also provides examples of clustering applications to illustrate the advantages and shortcomings of different clustering architectures and algorithms. Application areas include pattern recognition, artificial intelligence, information technology, image processing, biology, psychology, and marketing. Readers also learn how to perform cluster analysis with the C/C++ and MATLAB programming languages.
Audience: The following groups will find this book a valuable tool and reference: applied statisticians; engineers and scientists using data analysis; researchers in pattern recognition, artificial intelligence, machine learning, and data mining; and applied mathematicians. Instructors can also use it as a textbook for an introductory course in cluster analysis or as source material for a graduate-level introduction to data mining.


68-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to computer science
62-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics
62-07 Data analysis (statistics) (MSC2010)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62Pxx Applications of statistics
68P05 Data structures
68P15 Database theory
68T10 Pattern recognition, speech recognition
68W05 Nonnumerical algorithms


Software: Silhouettes; SOM; EDA; Matlab