
Other (Dis-)Similarity Measures

Many other (dis-)similarity measures, such as mutual neighbor distance or edit distance, are possible [JMF99]. In fact, the ugly duckling theorem [Wat69] states the somewhat unintuitive fact that there is no way to distinguish between two different classes of objects when they are compared over all possible features. As a consequence, any two arbitrary objects are equally similar unless we use domain knowledge to favor some features over others. The similarity measures discussed above are the ones that previous studies have deemed pertinent to text documents [Sal89,FBY92].
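The counting argument behind the ugly duckling theorem can be illustrated with a small sketch. It rests on a modeling assumption that is not spelled out in the text above: a "feature" is taken to be any Boolean predicate over a finite set of objects, identified with the subset of objects for which it holds. The function name `shared_predicate_counts` is hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def shared_predicate_counts(n_objects):
    """Count, for every pair of distinct objects, the predicates true of both.

    Assumption: a predicate is identified with the subset of objects it
    holds for, so there are 2**n_objects predicates, encoded as bitmasks.
    """
    counts = {}
    for a, b in combinations(range(n_objects), 2):
        pair_mask = (1 << a) | (1 << b)
        # A predicate is shared by a and b if both of their bits are set.
        counts[(a, b)] = sum(
            1
            for predicate in range(1 << n_objects)
            if (predicate & pair_mask) == pair_mask
        )
    return counts


if __name__ == "__main__":
    for pair, count in shared_predicate_counts(4).items():
        print(pair, count)
```

Under this assumption every pair shares the same number of predicates, namely 2^(n-2) for n objects, so counting shared features cannot rank one pair as more similar than another. Restricting or weighting the predicates, which is where domain knowledge enters, is what breaks the tie.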

Alexander Strehl 2002-05-03