1st Edition

Constrained Clustering: Advances in Algorithms, Theory, and Applications

Edited by Sugato Basu, Ian Davidson, Kiri Wagstaff
Copyright 2008
    470 Pages 110 B/W Illustrations
    by Chapman & Hall

    Since the initial work on constrained clustering, there have been numerous advances in methods, applications, and our understanding of the theoretical properties of constraints and constrained clustering algorithms. Bringing these developments together, Constrained Clustering: Advances in Algorithms, Theory, and Applications presents an extensive collection of the latest innovations in clustering data analysis methods that use background knowledge encoded as constraints.

    Algorithms

    The first five chapters of this volume investigate advances in the use of instance-level, pairwise constraints for partitional and hierarchical clustering. The book then explores other types of constraints for clustering, including cluster size balancing, minimum cluster size, and cluster-level relational constraints.

    Theory

    The book also describes variations of the traditional clustering-under-constraints problem, as well as approximation algorithms with helpful performance guarantees.

    Applications

    The book ends by applying clustering with constraints to relational data, privacy-preserving data publishing, and video surveillance data. It discusses an interactive visual clustering approach, a distance metric learning approach, existential constraints, and automatically generated constraints.

    With contributions from industrial researchers and leading academic experts who pioneered the field, this volume delivers thorough coverage of the capabilities and limitations of constrained clustering methods and introduces new types of constraints and clustering algorithms.

    Introduction
    Sugato Basu, Ian Davidson, and Kiri L. Wagstaff
    Semisupervised Clustering with User Feedback
    David Cohn, Rich Caruana, and Andrew Kachites McCallum
    Gaussian Mixture Models with Equivalence Constraints
    Noam Shental, Aharon Bar-Hillel, Tomer Hertz, and Daphna Weinshall
    Pairwise Constraints as Priors in Probabilistic Clustering
    Zhengdong Lu and Todd K. Leen
    Clustering with Constraints: A Mean-Field Approximation Perspective
    Tilman Lange, Martin H. Law, Anil K. Jain, and J.M. Buhmann
    Constraint-Driven Co-Clustering of 0/1 Data
    Ruggero G. Pensa, Céline Robardet, and Jean-François Boulicaut
    On Supervised Clustering for Creating Categorization Segmentations
    Charu Aggarwal, Stephen C. Gates, and Philip Yu
    Clustering with Balancing Constraints
    Arindam Banerjee and Joydeep Ghosh
    Using Assignment Constraints to Avoid Empty Clusters in k-Means Clustering
    A. Demiriz, K.P. Bennett, and P.S. Bradley
    Collective Relational Clustering
    Indrajit Bhattacharya and Lise Getoor
    Nonredundant Data Clustering
    David Gondek
    Joint Cluster Analysis of Attribute Data and Relationship Data
    Martin Ester, Rong Ge, Byron J. Gao, Zengjian Hu, and Boaz Ben-Moshe
    Correlation Clustering
    Nicole Immorlica and Anthony Wirth
    Interactive Visual Clustering for Relational Data
    Marie desJardins, James MacGlashan, and Julia Ferraioli
    Distance Metric Learning from Cannot-Be-Linked Example Pairs with Application to Name Disambiguation
    Satoshi Oyama and Katsumi Tanaka
    Privacy-Preserving Data Publishing: A Constraint-Based Clustering Approach
    Anthony K.H. Tung, Jiawei Han, Laks V.S. Lakshmanan, and Raymond T. Ng
    Learning with Pairwise Constraints for Video Object Classification
    Rong Yan, Jian Zhang, Jie Yang, and Alexander G. Hauptmann
    References
    Index

    Biography

    Sugato Basu, Ian Davidson, Kiri Wagstaff

    From the Foreword
    “… this book shows how constrained clustering can be used to tackle large problems involving textual, relational, and even video data. After reading this book, you will have the tools to be a better analyst [and] to gain more insight from your data, whether it be textual, audio, video, relational, genomic, or anything else.”
    —Dr. Peter Norvig, Director of Research, Google, Inc., Mountain View, California, USA