Tuesday, 6 August 2013

Incremental Document Similarity Algorithm

I'm trying to calculate pairwise similarity across a large, dynamic set of
text documents. For a static set, cosine similarity over tf-idf vectors
would work well. However, I'm looking for a scheme that lets me add a new
document without recalculating the entire set of similarities. Does any
such algorithm exist?
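
One direction that might fit, offered as a rough sketch rather than a settled answer: store only raw term counts and per-term document frequencies, and apply the idf weights lazily when a similarity is actually requested. Adding a document then touches only its own terms, and nothing is precomputed that would need to be invalidated. The class name `IncrementalIndex`, its methods, the whitespace tokenizer, and the smoothed idf formula are all assumptions made for illustration; none of them come from the question.

```python
import math
from collections import Counter, defaultdict


class IncrementalIndex:
    """Hypothetical helper: keeps raw counts and document frequencies,
    and applies idf weights only at query time."""

    def __init__(self):
        self.term_counts = {}              # doc_id -> Counter of raw term counts
        self.doc_freq = Counter()          # term -> number of documents containing it
        self.postings = defaultdict(set)   # term -> ids of documents containing it

    def add(self, doc_id, text):
        """Add one document; cost depends only on that document's terms."""
        if doc_id in self.term_counts:
            raise ValueError(f"document {doc_id!r} already indexed")
        counts = Counter(text.lower().split())  # naive tokenizer (assumption)
        self.term_counts[doc_id] = counts
        for term in counts:
            self.doc_freq[term] += 1
            self.postings[term].add(doc_id)

    def _idf(self, term):
        # Smoothed idf variant, chosen for illustration only.
        n_docs = len(self.term_counts)
        return math.log((1 + n_docs) / (1 + self.doc_freq[term])) + 1.0

    def _weights(self, doc_id):
        return {t: c * self._idf(t) for t, c in self.term_counts[doc_id].items()}

    def similarity(self, a, b):
        """Cosine similarity of two documents under the current idf values."""
        wa, wb = self._weights(a), self._weights(b)
        dot = sum(w * wb[t] for t, w in wa.items() if t in wb)
        norm_a = math.sqrt(sum(w * w for w in wa.values()))
        norm_b = math.sqrt(sum(w * w for w in wb.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def neighbours(self, doc_id):
        """Score only documents that share at least one term with doc_id."""
        candidates = set()
        for term in self.term_counts[doc_id]:
            candidates |= self.postings[term]
        candidates.discard(doc_id)
        scored = ((self.similarity(doc_id, other), other) for other in candidates)
        return sorted(scored, reverse=True)


index = IncrementalIndex()
index.add("a", "the cat sat on the mat")
index.add("b", "the cat lay on the rug")
index.add("c", "stock prices fell sharply")
print(index.neighbours("a"))
```

The trade-off in this sketch is that each similarity is recomputed on demand, so queries cost more than a lookup in a precomputed matrix. If idf drift is a concern, a hashed or fixed-vocabulary representation without idf would be another option, at some cost in weighting quality.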
