next up previous contents
Next: Graph and Hypergraph Partitioning Up: Clustering Algorithms Previous: Recent Database-driven Approaches   Contents

Graph-based Clustering

ROCK (Robust Clustering using linKs) [GRS99] is an agglomerative hierarchical clustering technique for categorical attributes. It uses the binary Jaccard coefficient and a thresholding criterion to establish links between samples. The links are unweighted edges in a graph with vertices corresponding to the objects to be clustered. Common neighbors are used to define inter-connectivity of clusters which is used to merge clusters. Another key idea in ROCK is to define a transitive neighbor relationship. Instead of only using the simple links (adjacency matrix $ \mathbf{A}$), all pairs with a common neighbor are linked as well (using $ \mathbf{A} \mathbf{A}$). However, we feel that this does not really add any new information and a good clustering algorithm would not be improved by this modification of the adjacency matrix. In adds redundancy by adding more links and hence the desired sparsity reduced.

CHAMELEON [KHK99] starts with partitioning the data into a large number of clusters by partitioning the $ v$-nearest neighbor graph. In the subsequent stage clusters are merged based on inter-connectivity and their relative closeness.


next up previous contents
Next: Graph and Hypergraph Partitioning Up: Clustering Algorithms Previous: Recent Database-driven Approaches   Contents
Alexander Strehl 2002-05-03