I need to save a TfidfVectorizer for later use. Following this post, I did the following:
tfidf_vect = TfidfVectorizer(analyzer='word', token_pattern=r'\w{1,}', max_features=5000)
pickle.dump(tfidf_vect, open("vectorizer.pickle", "wb"))
Then, in a separate Flask service, I do the following:
@app.route('/cuisine/api/json', methods=['POST'])
def getCuisine():
    content = jsonify(request.json)
    test = pd.io.json.json_normalize(request.json)
    tfidf_vect = pickle.load(open("vectorizer.pickle", "rb"))
    test['ingredients'] = [str(map(makeString, x)) for x in test['ingredients']]
    test_transform = tfidf_vect.transform(test['ingredients'].values)
    le = preprocessing.LabelEncoder()
    X_test = test_transform
    y_test = le.fit_transform(test['cuisine'].values)
But I get the following error:
sklearn.exceptions.NotFittedError: TfidfVectorizer - Vocabulary wasn't fitted.
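For reference, here is a minimal standalone sketch of the save/load round trip, with Flask and pandas left out. The toy corpus and the helper names make_vectorizer and round_trip are made up for illustration and are not from my real code. It compares a vectorizer pickled straight after construction, as in my snippet, with one that had fit called on some text before pickling. As far as I can tell, only the first case raises NotFittedError on transform.

```python
import pickle

from sklearn.exceptions import NotFittedError
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the real ingredient strings.
corpus = [
    "salt pepper olive oil",
    "soy sauce ginger garlic",
    "butter flour sugar",
]


def make_vectorizer():
    # Same constructor arguments as in the question.
    return TfidfVectorizer(analyzer='word', token_pattern=r'\w{1,}', max_features=5000)


def round_trip(vectorizer):
    # Pickle and unpickle in memory, mimicking the save in one process
    # and the load in the Flask service.
    return pickle.loads(pickle.dumps(vectorizer))


# Case 1: pickled without ever being fitted.
unfitted = round_trip(make_vectorizer())
try:
    unfitted.transform(["garlic salt"])
    unfitted_raised = False
except NotFittedError:
    unfitted_raised = True

# Case 2: fitted on the corpus before being pickled.
fitted = round_trip(make_vectorizer().fit(corpus))
features = fitted.transform(["garlic salt"])

print("unfitted vectorizer raised NotFittedError:", unfitted_raised)
print("fitted vectorizer output shape:", features.shape)
```

If this sketch matches my situation, the vocabulary is learned only when fit or fit_transform is called, so a vectorizer pickled before that point has nothing to transform with after loading.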
I'm not sure what I am missing. Can anyone suggest what is wrong?