Contents
List of Tables
List of Figures
Introduction
Cluster Analysis
Notation
Relationship-based Clustering Approach
Current Challenges in Clustering
Contributions
Organization
Background and Related Work
Overview
Clustering Algorithms
The k-means Framework
Robust k-medoids
Agglomerative Nearest-neighbor Clustering
Artificial Neural Systems
Projective Techniques
Mixture Density Estimation
Recent Database-driven Approaches
Graph-based Clustering
Graph and Hypergraph Partitioning
Scalability
Visualization
Ensembles and Knowledge Reuse
Challenges
The Problem of Scale
Curse of Dimensionality
Clustering Objectives
Relationship-based Clustering and Visualization
Motivation
Domain Specific Features and Similarity Space
OPOSSUM
Balancing
Vertex Weighted Graph Partitioning
Determining the Number of Clusters
CLUSION: Cluster Visualization
Coarse Seriation
Visualization
Comparison
Experiments
Retail Market-basket Clusters
Web-document Clusters
Web-log Session Clusters
System Issues
Synergy between OPOSSUM and CLUSION
FASTOPOSSUM
Parallel Implementation
Summary
Impact of Similarity Measures
Motivation
Similarity Measures for Document Clustering
Conversion from a Distance Metric
Cosine Measure
Pearson Correlation
Extended Jaccard Similarity
Other (Dis-)Similarity Measures
Discussion
Algorithms
Random Baseline (RND)
Generalized k-means (KM)
Weighted Graph Partitioning (GP)
Hypergraph Partitioning (HGP)
Self-Organizing Map (SOM)
Other Clustering Methods
Evaluation Methodology
Internal (model-based, unsupervised) Quality
External (model-free, semi-supervised) Quality
Experiments on Text Documents
Data-sets and Preprocessing
Results
Summary
Cluster Ensembles
Motivation
The Cluster Ensemble Problem
Illustrative Example
Objective Function for Cluster Ensembles
Efficient Consensus Functions
Representing Sets of Clusterings as a Hypergraph
Cluster-based Similarity Partitioning Algorithm (CSPA)
HyperGraph Partitioning Algorithm (HGPA)
Meta-CLustering Algorithm (MCLA)
Discussion and Comparison
Cluster Ensemble Applications and Experiments
Data-sets
Evaluation Criterion
Robust Centralized Clustering (RCC)
Feature-Distributed Clustering (FDC)
Object-Distributed Clustering (ODC)
Summary
Concluding Remarks
Summary of Contributions
Future Work
Greedy Maximization for Cluster Ensembles
Soft Cluster Ensembles
Feature-augmented Cluster Ensembles
Distributed Clustering
Bioinformatics
Data-sets
Gaussian Data-sets
Iris Data-set
Pen Digits Data-set
Drugstore Data-set
Yahoo! News Data-set
20 Newsgroup Data-set
Derivations
Normalized Symmetric Mutual Information
Normalized Asymmetric Mutual Information
Bibliography
Author Vita
Alexander Strehl
2002-05-03