Incremental Document Similarity Algorithm
I'm trying to compute pairwise similarity across a large, dynamic set of
text documents. For a static set, something like cosine similarity over
tf-idf vectors would work well. However, I'm looking for a scheme that lets
me add a new document without recomputing every existing similarity score.
Does any such algorithm exist?
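
To make the requirement concrete, here is a rough sketch of the kind of behaviour I have in mind. It is not an established algorithm, only one possible arrangement: keep raw term counts and document frequencies, update them when a document arrives, and compute tf-idf weights lazily at query time, so nothing has to be rebuilt on insert. The class name `IncrementalTfidfIndex`, its methods, the whitespace tokenizer, and the smoothed idf formula are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class IncrementalTfidfIndex:
    """Hypothetical sketch: tf-idf cosine similarity with cheap inserts."""

    def __init__(self):
        self.doc_tf = {}                   # doc_id -> Counter of raw term counts
        self.df = Counter()                # term -> number of docs containing it
        self.postings = defaultdict(set)   # term -> ids of docs containing it

    def add(self, doc_id, text):
        # Only the new document's terms are touched; no stored vector is rebuilt.
        tf = Counter(text.lower().split())  # naive whitespace tokenizer (assumption)
        self.doc_tf[doc_id] = tf
        for term in tf:
            self.df[term] += 1
            self.postings[term].add(doc_id)

    def _idf(self, term):
        # Smoothed idf (assumption); always >= 1, so weights are never zero.
        return math.log((1 + len(self.doc_tf)) / (1 + self.df[term])) + 1.0

    def _vector(self, doc_id):
        # Weights are derived on demand from the current corpus statistics.
        return {t: c * self._idf(t) for t, c in self.doc_tf[doc_id].items()}

    def similar(self, doc_id):
        """Cosine similarity of doc_id against every doc sharing a term with it."""
        q = self._vector(doc_id)
        q_norm = math.sqrt(sum(w * w for w in q.values()))
        candidates = set()
        for term in q:
            candidates |= self.postings[term]
        candidates.discard(doc_id)
        scores = {}
        for other in candidates:
            v = self._vector(other)
            dot = sum(w * v.get(t, 0.0) for t, w in q.items())
            v_norm = math.sqrt(sum(w * w for w in v.values()))
            scores[other] = dot / (q_norm * v_norm)
        return scores


idx = IncrementalTfidfIndex()
idx.add("a", "cats chase mice")
idx.add("b", "dogs chase cats")
idx.add("c", "stock markets fell")
print(idx.similar("a"))  # only "b" shares terms with "a", so only "b" is scored
```

The trade-off in this sketch is that inserts are cheap but each query recomputes weights for its candidate documents, and any scores cached earlier drift as the idf values change. Whether that drift is acceptable is part of what I'm asking.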