
I was doing an exercise on Kaggle and, although I solved it correctly, I wanted to see the solution provided by Kaggle. Here it is:

def word_search(documents, keyword):
    # list to hold the indices of matching documents
    indices = []
    # Iterate through the indices (i) and elements (doc) of documents
    for i, doc in enumerate(documents):
        # Split the string doc into a list of words (according to whitespace)
        tokens = doc.split()
        # Make a transformed list where we 'normalize' each word to facilitate matching.
        # Periods and commas are removed from the end of each word, and it's set to all lowercase.
        normalized = [token.rstrip('.,').lower() for token in tokens]
        # Is there a match? If so, update the list of matching indices.
        if keyword.lower() in normalized:
            indices.append(i)
    return indices

doc_list = ["The Learn Python Challenge Casino.", "They bought a car", "Casinoville"]
word_search(doc_list, 'casino')

I took the solution and changed the 'in' in this line:

if keyword.lower() in normalized:

to '==':

if keyword.lower() == normalized:

and didn't get the right answer. My question is: why? What is the difference between the two statements? If you follow the code, the idea is to find a certain keyword in a document, so I expected the check to amount to keyword == word in the document.

(I can provide the exercise for context, but I don't think it's important here, as my question is a general one.)

Thanks.

  • `normalized` is a `list`, `keyword` is a `str`. That should say it all…!? In case it doesn't: *string equals list* and *string in list* are clearly two different things…!? – deceze Dec 05 '18 at 07:38
  • "==" means "is equal to", while "in" means "is contained in" – Tobias Wilfert Dec 05 '18 at 07:39
  • `in` checks for membership. `==` checks for value equality. The more common confusion is the one between `is` and `==`, which is explained in several places, for example [here](https://stackoverflow.com/questions/15008380/double-equals-vs-is-in-python) – Ma0 Dec 05 '18 at 07:39

4 Answers

1

The first statement, `if keyword.lower() in normalized:`, checks whether the string `keyword.lower()` is one of the elements of the list `normalized`. For the first document in your example, this is True.

The other statement, `if keyword.lower() == normalized:`, checks whether the string `keyword.lower()` has the same value as the whole list `normalized`. This is always False, because a string never compares equal to a list.
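As a minimal sketch of the two checks side by side, the snippet below applies the question's normalization step to the first document only. The variable names `in_result` and `eq_result` are introduced here purely for illustration.

```python
# Same normalization step as in the question's word_search
doc = "The Learn Python Challenge Casino."
normalized = [token.rstrip('.,').lower() for token in doc.split()]
keyword = 'casino'

# Membership test: is the string equal to any element of the list?
in_result = keyword.lower() in normalized
# Equality test: is the string equal to the list as a whole?
eq_result = keyword.lower() == normalized

print(normalized)  # ['the', 'learn', 'python', 'challenge', 'casino']
print(in_result)   # True
print(eq_result)   # False
```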

Kamal
  • Right! I guess the reason I didn't get it right away is that when I solved it myself, I put the if condition inside a for loop over that list, like this: for word in normalized: if keyword.lower() == word: indices.append(i) – Motlaq AlMutairi Dec 05 '18 at 08:07
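The loop described in the comment above and the `in` test answer the same question for a list. Here is a rough sketch of that equivalence; the names `found_by_loop` and `found_by_in` are made up for this example and do not come from the exercise.

```python
normalized = ['the', 'learn', 'python', 'challenge', 'casino']
keyword = 'Casino'

# Explicit loop: compare the keyword with each word, one at a time
found_by_loop = False
for word in normalized:
    if keyword.lower() == word:
        found_by_loop = True
        break  # stop at the first match

# Membership test: the same element-by-element comparison, done by `in`
found_by_in = keyword.lower() in normalized

print(found_by_loop, found_by_in)  # True True
```

One difference worth noting: a loop that appends the index on every match, without a `break`, would add the same document index more than once if the keyword occurs several times in that document.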
0

The "in" keyword tests for membership. I don't quite understand your variables, but I assume that what you want to find out is whether the "keyword" variable is "in" the normalized list. Using "==" here asks instead whether the "keyword" variable equals the normalized list itself, which it cannot if your keyword is a string and normalized is a list.

albert
0

normalized is a list, whereas keyword.lower() is a string, and a string can never be equal to a list. The == operator checks whether one thing equals another, whereas the in operator checks whether something is contained in another thing. Demo:

>>> a=4
>>> b=4
>>> a==b
True
>>> a in b
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    a in b
TypeError: argument of type 'int' is not iterable
>>> a=4
>>> b=[1,4]
>>> a==b
False
>>> a in b
True
Eric Aya
U13-Forward
0

Using == gives you True only if the two operands compare equal as whole values

if keyword.lower() == normalized:

Here, keyword.lower() (a string) is not equal to normalized (a list)

Using in performs a membership test instead: it is True if the left operand equals any element of the right operand

if keyword.lower() in normalized:

Here, if keyword.lower() equals any element of normalized, the condition is True.
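One caveat, sketched below with names invented for this illustration (`in_list`, `in_string`): what "found anywhere" means depends on the type of the right operand. Against a list, `in` compares the keyword with each whole element; against a string, it searches for a substring. That is why 'Casinoville' in the question's example does not match when the words are held in a list.

```python
keyword = 'casino'

# List on the right: the keyword must equal a whole element
in_list = keyword in ['casinoville']
# String on the right: the keyword only has to appear as a substring
in_string = keyword in 'casinoville'

print(in_list)    # False
print(in_string)  # True
```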

ParvBanks