
Pearson Correlation

In collaborative filtering, correlation is often used to predict a feature from a highly similar mentor group of objects whose features are known. The [0,1]-normalized Pearson correlation is defined as

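$\displaystyle s^{(\mathrm{P})} (\mathbf{x}_a,\mathbf{x}_b) = \frac{1}{2} \left( \frac{ (\mathbf{x}_a - \bar{x}_a)^{\mathrm{T}} (\mathbf{x}_b - \bar{x}_b) }{ \Vert \mathbf{x}_a - \bar{x}_a \Vert _2 \cdot \Vert \mathbf{x}_b - \bar{x}_b \Vert _2 } + 1 \right) ,$ (4.3)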

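where $ \bar{x}$ denotes the average feature value of $ \mathbf{x}$ over all dimensions. Note that this definition of Pearson correlation tends to give a full (non-sparse) similarity matrix: subtracting the mean turns even sparse feature vectors into dense ones, so almost every pair of objects receives a similarity different from the neutral value. Other important correlations have been proposed, such as the Spearman correlation [Spe06], which works well on rank orders.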
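
As an illustration of the [0,1]-normalized Pearson correlation in (4.3), the following is a minimal Python sketch. The function name `pearson_similarity` and the choice to raise an error for constant vectors (where the denominator vanishes) are assumptions made for this example, not part of the original text.

```python
import math


def pearson_similarity(x_a, x_b):
    """[0,1]-normalized Pearson correlation of two equal-length vectors.

    Hypothetical helper for illustration only; name and error handling
    are assumptions, not taken from the original text.
    """
    if len(x_a) != len(x_b) or len(x_a) == 0:
        raise ValueError("vectors must be non-empty and of equal length")

    # Average feature value of each vector over all dimensions.
    mean_a = sum(x_a) / len(x_a)
    mean_b = sum(x_b) / len(x_b)

    # Mean-centered vectors.
    centered_a = [v - mean_a for v in x_a]
    centered_b = [v - mean_b for v in x_b]

    # Inner product and Euclidean norms of the centered vectors.
    dot = sum(p * q for p, q in zip(centered_a, centered_b))
    norm_a = math.sqrt(sum(p * p for p in centered_a))
    norm_b = math.sqrt(sum(q * q for q in centered_b))

    if norm_a == 0.0 or norm_b == 0.0:
        # A constant vector has zero variance; the correlation is undefined.
        raise ValueError("correlation is undefined for constant vectors")

    # Map the correlation from [-1, 1] to [0, 1].
    return 0.5 * (dot / (norm_a * norm_b) + 1.0)
```

Under these assumptions, two vectors related by a positive linear map, such as `[1, 2, 3]` and `[2, 4, 6]`, score approximately 1, a reversed pair such as `[1, 2, 3]` and `[3, 2, 1]` scores approximately 0, and uncorrelated vectors score approximately 0.5.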

Alexander Strehl 2002-05-03