
## Evaluation Criterion

Evaluation of the quality of a clustering is a non-trivial and often
ill-posed task. In fact, many definitions of objective functions for
clusterings exist [JD88]. *Internal* criteria formulate
quality as a function of the given data and/or similarities. For
example, the mean squared error criterion (for k-means) and other
measures of compactness are popular evaluation criteria. Measures
can also be based on isolation, such as the min-cut criterion, which
uses the sum of edge weights across clusters (for graph partitioning).
When using internal criteria, clustering becomes an optimization
problem, and a clusterer can evaluate its own performance and tune its
results accordingly.
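To make the internal-criterion view concrete, the following is a minimal sketch of the within-cluster mean squared error that k-means minimizes, written in Python with NumPy (the function name and interface are illustrative, not taken from the text):

```python
import numpy as np

def mean_squared_error(X, labels):
    # Within-cluster mean squared error: the average squared distance of
    # each object to the centroid of its assigned cluster (lower is better).
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        centroid = members.mean(axis=0)
        total += ((members - centroid) ** 2).sum()
    return total / len(X)
```

A clusterer optimizing such an internal criterion can compare candidate partitionings of the same data directly, without any external labels.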

*External* criteria, on the other hand, assess
quality using additional, external information not given to the
clusterer, such as category labels. This is sometimes more appropriate
since groupings are ultimately evaluated externally by humans.
For example, when objects have already been
categorized by an external source, i.e., when class labels are
available, we can use information-theoretic measures to quantify
the match between the categorization and the clustering. Previously,
average purity and entropy-based measures to assess clustering quality
from 0 (worst) to 1 (best) have been used [BGG+99]. While
average purity is intuitive to understand, it is biased to favor small
clusters. Singletons, in fact, are scored as perfect. Also, a
monolithic clustering (one single cluster for all objects) receives a
score as high as the maximum category prior probability. In
unstratified data-sets, this prior might be close to 1, while the
desired value should be close to 0. An entropy-based measure is better
than purity but is still biased towards small clusters (e.g., a set of
singletons is always considered perfect).
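To illustrate both biases, here is a hedged sketch of a size-weighted average purity in Python (the exact measure used in [BGG+99] may differ in detail; the function name is hypothetical):

```python
from collections import Counter
import numpy as np

def average_purity(class_labels, cluster_labels):
    # Each cluster contributes the fraction of its members belonging to
    # its majority class, weighted by the cluster's share of all objects.
    # Scores range from 0 (worst) to 1 (best).
    class_labels = np.asarray(class_labels)
    cluster_labels = np.asarray(cluster_labels)
    n = len(class_labels)
    score = 0.0
    for c in np.unique(cluster_labels):
        members = class_labels[cluster_labels == c]
        majority = Counter(members.tolist()).most_common(1)[0][1]
        score += (majority / len(members)) * (len(members) / n)
    return score

# Both biases discussed above are easy to reproduce:
y = np.array([0, 0, 1, 1, 1])                     # class priors 0.4 and 0.6
print(average_purity(y, np.arange(5)))            # 1.0: singletons look perfect
print(average_purity(y, np.zeros(5, dtype=int)))  # 0.6: the maximum class prior
```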

Using a measure based on
normalized mutual information (equation
5.2) fixes both biases
(see chapter 4): Monolithic clusterings are evaluated at 0 and
singletons are severely discounted compared to the best
clustering. However, the mutual
information for the best clustering is smaller than 1 unless all
categories have equal prior probabilities. Thus, mutual information
provides an unbiased and conservative measure as compared to purity
and entropy. Given the categories (or classes), we use the
categorization labels to evaluate cluster quality using mutual
information, as defined in equation 5.2.
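The following sketch computes such a normalized mutual information in Python, normalizing by the geometric mean of the class and cluster entropies so that the score lies in [0, 1]; this is one common normalization, offered here as an assumption since the precise form is given by equation 5.2 in the text:

```python
import numpy as np

def normalized_mutual_info(class_labels, cluster_labels):
    # Mutual information between the categorization and the clustering,
    # normalized by the geometric mean of the two entropies (an assumed
    # normalization; see equation 5.2 for the form used in the text).
    a = np.asarray(class_labels)
    b = np.asarray(cluster_labels)
    n = len(a)
    eps = np.finfo(float).eps
    # Joint distribution over (class, cluster) pairs.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)
    joint /= n
    pa = joint.sum(axis=1)                       # class marginal
    pb = joint.sum(axis=0)                       # cluster marginal
    mi = np.sum(joint * np.log((joint + eps) / (np.outer(pa, pb) + eps)))
    ha = -np.sum(pa * np.log(pa + eps))          # class entropy
    hb = -np.sum(pb * np.log(pb + eps))          # cluster entropy
    return mi / max(np.sqrt(ha * hb), eps)
```

Consistent with the discussion above, a monolithic clustering yields a score of 0 (the mutual information vanishes), and an all-singleton clustering is discounted because its large cluster entropy inflates the denominator.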
