Hey, I was doing an exercise on Kaggle and, although I solved it correctly, I wanted to see the solution provided by Kaggle. Here it is:
def word_search(documents, keyword):
    # list to hold the indices of matching documents
    indices = []
    # Iterate through the indices (i) and elements (doc) of documents
    for i, doc in enumerate(documents):
        # Split the string doc into a list of words (according to whitespace)
        tokens = doc.split()
        # Make a transformed list where we 'normalize' each word to facilitate matching.
        # Periods and commas are removed from the end of each word, and it's set to all lowercase.
        normalized = [token.rstrip('.,').lower() for token in tokens]
        # Is there a match? If so, update the list of matching indices.
        if keyword.lower() in normalized:
            indices.append(i)
    return indices

doc_list = ["The Learn Python Challenge Casino.", "They bought a car", "Casinoville"]
word_search(doc_list, 'casino')
I took the solution and replaced 'in' in this line:
    if keyword.lower() in normalized:
with '==', so that it reads:
    if keyword.lower() == normalized:
With that change I no longer get the right answer. My question is: why? What's the difference between the two statements? If you follow the code, the idea is to find a certain keyword in a document, so I expected the check to amount to keyword == word in the document.
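To make this easier to reproduce, here is a minimal sketch of the two checks outside the function, using only the first document from the example. The standalone variables are just for illustration and are not part of the Kaggle solution:

```python
# Build the normalized word list for the first document,
# the same way word_search does inside its loop.
doc = "The Learn Python Challenge Casino."
normalized = [token.rstrip('.,').lower() for token in doc.split()]
keyword = 'casino'

print(normalized)                      # ['the', 'learn', 'python', 'challenge', 'casino']
print(keyword.lower() in normalized)   # True
print(keyword.lower() == normalized)   # False
```

So the 'in' version reports a match for this document, while the '==' version does not.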
(I can provide the exercise for context, but I didn't think it was important here, as my question is a general one.)
Thanks.