I am new to this, so it would be helpful if someone could point me in the right direction or to a tutorial. Given a sentence and a list of other sentences (English):
s = "This concept of distance is not restricted to two dimensions."
list_s = ["It is not difficult to imagine the figure above translated into three dimensions.", "We can persuade ourselves that the measure of distance extends to an arbitrary number of dimensions;"]
I want to compute pairwise cosine similarity between each sentence in the list and sentence s, then find the max value.
What I've got so far:
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf = TfidfVectorizer(norm='l2', min_df=0, use_idf=True, smooth_idf=False, sublinear_tf=True, tokenizer=tokenize)
bow_matrix = tfidf.fit_transform([s, ' '.join(list_s)])
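Note that `' '.join(list_s)` merges the whole list into a single document, so the matrix above has only two rows and can yield just one similarity value. One possible direction, as a minimal sketch: keep every sentence as its own row, then compare the row for `s` against the others with `cosine_similarity` from `sklearn.metrics.pairwise`. The vectorizer here uses its default tokenizer and settings, which is an assumption made so the snippet runs on its own without the custom `tokenize` function; the names `matrix`, `sims`, and `best` are only illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

s = "This concept of distance is not restricted to two dimensions."
list_s = [
    "It is not difficult to imagine the figure above translated into three dimensions.",
    "We can persuade ourselves that the measure of distance extends to an arbitrary number of dimensions;",
]

# Assumption: default tokenizer and parameters instead of the custom ones above.
tfidf = TfidfVectorizer()

# Each sentence is its own document: row 0 is s, rows 1.. follow list_s order.
matrix = tfidf.fit_transform([s] + list_s)

# Compare row 0 with all remaining rows; flatten the (1, len(list_s)) result.
sims = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

# Index and value of the most similar sentence in list_s.
best = int(sims.argmax())
print(list_s[best], sims[best])
```

Because the vectorizer is fitted on this tiny set of sentences, the IDF weights (and therefore the scores) depend on exactly which sentences are included; fitting on a larger corpus would probably give more meaningful weights.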