I want to get the count of a word in a given sentence using only the tf*idf matrix of a set of sentences. I use TfidfVectorizer from sklearn.feature_extraction.text.
Example:
from sklearn.feature_extraction.text import TfidfVectorizer
sentences = ("The sun is shiny i like the sun", "I have been exposed to sun")
vect = TfidfVectorizer(stop_words="english", lowercase=False)
tfidf_matrix = vect.fit_transform(sentences).toarray()
I want to calculate the number of times the term "sun" occurs in the first sentence (which is 2) using only tfidf_matrix[0] and probably vect.idf_. I know there are many other ways to get term frequencies and word counts, but in my case I only have the tfidf matrix. I already tried dividing the tfidf value of the word "sun" in the first sentence by its idf value to get tf, then multiplying tf by the total number of words in the sentence to get the word count. Unfortunately, I get wrong values.
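One likely cause of the wrong values, sketched below: with its default settings (norm="l2", sublinear_tf=False), TfidfVectorizer uses the raw count as tf and then scales each row to unit Euclidean length. Dividing by vect.idf_ therefore gives count / row_norm rather than count / sentence_length, so multiplying by the number of words does not undo the scaling. The sketch recovers counts under one assumption that is not in the original setup: the least frequent term present in the sentence occurs exactly once. The names scaled_tf, counts and sun_index are only illustrative.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = ("The sun is shiny i like the sun", "I have been exposed to sun")

vect = TfidfVectorizer(stop_words="english", lowercase=False)
tfidf_matrix = vect.fit_transform(sentences).toarray()

# Dividing by idf_ removes the idf weighting, but the row is still
# L2-normalized, so this is count / row_norm, not the count itself.
scaled_tf = tfidf_matrix[0] / vect.idf_

# ASSUMPTION: the rarest term present in this sentence occurs exactly once.
# Its scaled value then equals 1 / row_norm, which lets us undo the scaling.
nonzero = scaled_tf[scaled_tf > 0]
counts = np.rint(scaled_tf / nonzero.min()).astype(int)

sun_index = vect.vocabulary_["sun"]
print(counts[sun_index])  # 2
```

If the assumption does not hold (for example, every term in the sentence occurs twice), only the ratios between counts can be recovered from the normalized row. If the matrix can be rebuilt, fitting with norm=None makes tfidf / idf_ equal to the raw count directly.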