Next: Robust -medoids
Up: Clustering Algorithms
Previous: Clustering Algorithms
  Contents
The -means algorithm is a very simple and very powerful iterative
technique to partition a data-set into disjoint clusters, where
the value has to be pre-determined [DH73,Har75]. A
generalized, similarity-based description of the algorithm can be
given as follows:
- Start at with randomly selected objects as
the cluster centers
,
.
- Assign each object
to the cluster center with maximum
similarity:
|
(2.1) |
- Update all cluster means:
|
(2.2) |
- If any
differs from
go to step 2 with
unless termination criteria (such as exceeding the maximum number of
iterations) are met.
When similarity is based on a strictly monotonic decreasing mapping of
Euclidean distances, -means greedily minimizes the sum of squared
distances of the samples
to the closest cluster
centroid
as given by equation
2.3.
|
(2.3) |
Next: Robust -medoids
Up: Clustering Algorithms
Previous: Clustering Algorithms
  Contents
Alexander Strehl
2002-05-03