Handbook of Cluster Analysis provides a comprehensive and unified account of the main research developments in cluster analysis. Written by active, distinguished researchers in this area, the book helps readers make informed choices of the most suitable clustering approach for their problem and make better use of existing cluster analysis tools.
The book is organized according to the traditional core approaches to cluster analysis, from the origins to recent developments. After an overview of approaches and a quick journey through the history of cluster analysis, the book focuses on the four major approaches to cluster analysis. These approaches include methods for optimizing an objective function that describes how well data is grouped around centroids, dissimilarity-based methods, mixture models and partitioning models, and clustering methods inspired by nonparametric density estimation. The book also describes additional approaches to cluster analysis, including constrained and semi-supervised clustering, and explores other relevant issues, such as evaluating the quality of a cluster.
This handbook is accessible to readers from various disciplines, reflecting the interdisciplinary nature of cluster analysis. For those already experienced with cluster analysis, the book offers a broad and structured overview. For newcomers to the field, it presents an introduction to key issues. For researchers who are temporarily or marginally involved with cluster analysis problems, the book gives enough algorithmic and practical details to facilitate working knowledge of specific clustering areas.
Table of Contents
Cluster Analysis: An Overview
Christian M. Hennig and Marina Meila
A Brief History of Cluster Analysis
Quadratic Error and k-Means
K-Medoids and Other Criteria for Crisp Clustering
Foundations for Center-Based Clustering: Worst-Case Approximations and Modern Developments
Pranjal Awasthi and Maria Florina Balcan
Pedro Contreras and Fionn Murtagh
Methods Based on Probability Models
Mixture Models for Standard p-Dimensional Euclidean Data
Geoffrey J. McLachlan and Suren I. Rathnayake
Latent Class Models for Categorical Data
G. Celeux and Gérard Govaert
Dirichlet Process Mixtures and Nonparametric Bayesian Approaches to Clustering
Finite Mixtures of Structured Models
Marco Alfò and Sara Viviani
Jorge Caiado, Elizabeth Ann Maharaj, and Pierpaolo D’Urso
Clustering Functional Data
David B. Hitchcock and Mark C. Greenwood
Methods Based on Spatial Processes
Lisa Handl, Christian Hirsch, and Volker Schmidt
Significance Testing in Clustering
Hanwen Huang, Yufeng Liu, David Neil Hayes, Andrew Nobel, J.S. Marron, and Christian M. Hennig
Model-Based Clustering for Network Data
Thomas Brendan Murphy
Methods Based on Density Modes and Level Sets
A Formulation in Modal Clustering Based on Upper Level Sets
Clustering Methods Based on Kernel Density Estimators: Mean-Shift Algorithms
Miguel Á. Carreira-Perpiñán
Julia Handl and Joshua Knowles
Specific Cluster and Data Formats
Anil Jain, Rong Jin, and Radha Chitta
Clustering of Symbolic Data
A Survey of Consensus Clustering
Joydeep Ghosh and Ayan Acharya
Two-Mode Partitioning and Multipartitioning
Rough Set Clustering
Ivo Düntsch and Günther Gediga
Cluster Validation and Further General Issues
Method-Independent Indices for Cluster Validation and Estimating the Number of Clusters
Maria Halkidi, Michalis Vazirgiannis, and Christian M. Hennig
Criteria for Comparing Clusterings
Resampling Methods for Exploring Cluster Stability
Robustness and Outliers
L.A. García-Escudero, A. Gordaliza, C. Matrán, A. Mayo-Iscar, and Christian M. Hennig
Visual Clustering for Data Analysis and Graphical User Interfaces
Sébastien Déjean and Josiane Mothe
Clustering Strategy and Method Selection
Christian M. Hennig
Christian Hennig is a senior lecturer in the Department of Statistical Science at University College London. Dr. Hennig is currently secretary of the International Federation of Classification Societies and associate editor of Statistics and Computing, Computational Statistics and Data Analysis, Advances in Data Analysis and Classification, and Statistical Methods and Applications. His main research interests are cluster analysis, philosophy of statistics, robust statistics, multivariate analysis, data visualization, and model selection.
Marina Meila is a professor of statistics at the University of Washington. She earned a PhD in computer science and electrical engineering from the Massachusetts Institute of Technology. Her long-term interest is in machine learning and reasoning under uncertainty and how these can be performed efficiently on large, complex data sets.
Fionn Murtagh is a professor of data science at the University of Derby and at Goldsmiths, University of London. Dr. Murtagh is a fellow of the International Association for Pattern Recognition, a fellow of the British Computer Society, an elected member of the Royal Irish Academy and Academia Europaea, a member of the editorial boards of many journals, and editor-in-chief of the Computer Journal. His research interests encompass data science and big data analytics.
Roberto Rocci is a professor of statistics in the Department of Economics and Finance at the University of Rome Tor Vergata. Dr. Rocci is associate editor of the Statistical Methods and Applications Journal and board member of the SIS-CLassification and Data Analysis Group (SIS-CLADAG). His research interests include cluster analysis, mixture models, and latent variable models.
"The Handbook of Cluster Analysis provides a readable and fairly thorough overview of the highly interdisciplinary and growing field of cluster analysis. The editors rose to the challenge of the Handbook of Modern Statistical Methods series to balance well-developed methods with state-of-the-art research. The book is a collection of papers about how to find groups within data, each written by prominent researchers from computer science, statistics, data science, and elsewhere. Some chapters are application driven while others are solely focused on theory. The editors bookend the text with a solid overview and history of the literature at the beginning, to help newcomers navigate the rest of the handbook, and practical strategies at the end, to help a practitioner choose amongst the competing methods. … Overall, the handbook is a thorough reference for past and present work. It gives the reader a general overview of the field, which is of great value since the work crosses many disciplinary boundaries. The numerous clustering methods are organized to help researchers find the relevant chapters and references therein. …"
— Brianna C. Heggeseth, Williams College, in Journal of the American Statistical Association, July 2017
"After an overview of approaches and a quick journey through the history of Cluster analysis, the book focuses on the four major approaches to Cluster analysis. … This handbook is accessible to readers from various disciplines. … All articles have a vast amount of hints to literature. So, the greatest benefit is that the interested reader can find the literature for her/his special clustering purpose."
—Rainer Schlittgen, University of Hamburg, Germany, in Statistical Papers, September 2016
"From the wide ranging ‘Handbooks of modern statistical methods’ series, this book seeks to be a non-exhaustive guide to the subject in a large and expanding field. The book is well laid out over 31 chapters each having its own introduction and conclusion, spanning the material in a logical manner aiding accessibility. Its main focus is on partitioning sets, and care is taken to explain the exploratory nature of the analysis in contrast with the predictive task of classification (i.e. it covers unsupervised rather than supervised classification) … This is a comprehensive reference guide, which is well organized and has an approachable style with many examples. Great care has been taken to provide an appropriate level of detail by using illustrative and topical examples. As an overview of an increasingly important field, it provides a vital first reference guide for a range of techniques and modelling considerations."
—Mark Pilling, University of Manchester, in Statistics in Society (Series A), October 2016
"This handbook is accessible to readers from various disciplines … All articles have a vast amount of hints to literature. So, the greatest benefit is that the interested reader can find the literature for her/his special clustering purpose."
—Statistical Papers, June 2016