

Balancing

Typically, one segments transactional data into 7 to 14 groups, each of which should be of comparable importance. Balancing avoids trivial clusterings (e.g., $k-1$ singletons and one big cluster). More importantly, the desired balancing properties have many application-driven advantages. For example, when each cluster contains the same number of customers, discovered phenomena (e.g., frequent products, co-purchases) have equal significance / support and are thus easier to evaluate. When each customer cluster accounts for the same revenue share, marketing can devote an equal amount of attention and budget to each of the groups.

OPOSSUM strives to deliver `balanced' clusters using either of two criteria: sample balancing, in which each cluster contains roughly the same number of samples, or value balancing, in which each cluster contains roughly the same share of the total feature value (e.g., revenue). We formulate the desired balancing properties by assigning each object (customer, document, web-session) a weight and then softly constraining the sum of weights in each cluster. For sample balanced clustering, we assign each sample $\mathbf{x}_j$ the same weight $w_j = 1/n$, where $n$ is the number of samples. To obtain value balancing properties, a sample $\mathbf{x}_j$'s weight is set to $w_j = \frac{1}{v} \sum_{i=1}^d x_{i,j}$, where $v = \sum_{j=1}^n \sum_{i=1}^d x_{i,j}$ is the total value over all samples. In both cases, the weights of all samples sum to 1.
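The two weighting schemes can be illustrated with a minimal sketch. It assumes the data are given as a $d \times n$ matrix stored as a list of rows, with non-negative entries and a positive total. The function names (`sample_weights`, `value_weights`, `cluster_weight_sums`) and the toy data are hypothetical and are not part of OPOSSUM itself. The sketch only computes the weights and the per-cluster weight sums for a given labeling; it does not implement the soft constraint or the partitioning.

```python
def sample_weights(n):
    """Sample balancing: every sample gets the same weight w_j = 1/n."""
    return [1.0 / n] * n


def value_weights(x):
    """Value balancing: w_j = (1/v) * sum_i x[i][j].

    x is a d-by-n matrix given as a list of d rows (features),
    each holding n entries (samples). v is the total over all entries.
    """
    n = len(x[0])
    column_sums = [sum(row[j] for row in x) for j in range(n)]
    v = sum(column_sums)
    return [s / v for s in column_sums]


def cluster_weight_sums(weights, labels, k):
    """Sum of sample weights in each of the k clusters for a given labeling."""
    totals = [0.0] * k
    for w, c in zip(weights, labels):
        totals[c] += w
    return totals


# Toy data (assumed for illustration): 2 products (rows), 4 customers (columns).
x = [
    [2, 0, 1, 1],
    [0, 4, 1, 1],
]
labels = [0, 1, 0, 1]  # a hypothetical assignment of customers to k = 2 clusters

w_sample = sample_weights(len(x[0]))
w_value = value_weights(x)

print(cluster_weight_sums(w_sample, labels, 2))
print(cluster_weight_sums(w_value, labels, 2))
```

In this toy labeling, both clusters hold two customers, so the clustering is sample balanced. The second customer carries a larger share of the total value, so the same labeling is less evenly balanced by value.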
Alexander Strehl 2002-05-03