I get a UserWarning every time I execute this function. Here, user_input is a list of words and article_sentences is a list of lists of words.
I've tried removing all stop words from the list beforehand, but that didn't change anything.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def generate_response(user_input):
    sidekick_response = ''
    article_sentences.append(user_input)
    word_vectorizer = TfidfVectorizer(tokenizer=get_processed_text, stop_words='english')
    # this is the problematic line
    all_word_vectors = word_vectorizer.fit_transform(article_sentences)
    # compare the user input (the last row) against every sentence
    similar_vector_values = cosine_similarity(all_word_vectors[-1], all_word_vectors)
    # index of the closest match (the very last entry is the input itself)
    similar_sentence_number = similar_vector_values.argsort()[0][-2]
This is part of a function for a simple chatbot I found here: https://stackabuse.com/python-for-nlp-creating-a-rule-based-chatbot/
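For completeness, get_processed_text is the custom tokenizer from that tutorial; it looks roughly like this (lowercase, strip punctuation, tokenize with NLTK, lemmatize each token with WordNetLemmatizer):

import string
import nltk

# requires the nltk 'punkt' and 'wordnet' data packages
wnlemmatizer = nltk.stem.WordNetLemmatizer()
punctuation_removal = dict((ord(p), None) for p in string.punctuation)

def get_processed_text(document):
    # lowercase, drop punctuation, tokenize, then lemmatize every token
    return [wnlemmatizer.lemmatize(token)
            for token in nltk.word_tokenize(document.lower().translate(punctuation_removal))]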
It should return a list of sentences sorted by how closely they match user_input, which it does, but it also throws this UserWarning:

UserWarning: Your stop_words may be inconsistent with your preprocessing. Tokenizing the stop words generated tokens ['ha', 'le', 'u', 'wa'] not in stop_words.
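As far as I can tell, scikit-learn runs the tokenizer over its built-in English stop word list and warns about any resulting tokens that are no longer in that list. The check below (using the get_processed_text above) is my guess at a minimal reproduction, and I'd expect it to produce the same tokens the warning lists, e.g. the lemmatizer turning 'has' into 'ha' and 'was' into 'wa':

from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

# tokenize every built-in stop word and keep the outputs that are
# not themselves stop words any more
leftovers = {token
             for stop_word in ENGLISH_STOP_WORDS
             for token in get_processed_text(stop_word)
             if token not in ENGLISH_STOP_WORDS}
print(leftovers)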