0

I was coding a discord bot and realized I had difficulty parsing messages. I ended up using a double for loop (yuck). What can I do to optimize this code? (this is a far more straightforward version of the code)

string = "His name is food"
list = ["food", "numbers"]
parsed_string = string.split(" ")
print(parced_string)

for i in parsed_string:
  for x in list:
    if i == x:
      print("stop")

How do I optimize this bit of code?

  • If the string is `food is food` this will print `stop` twice. Do you want that? If not, use the `any()` function. – Barmar Nov 08 '22 at 21:55
  • I think storing list of words in a hashmap would be beneficial so that your look up time is O(1). https://stackoverflow.com/questions/37350450/why-is-a-list-access-o1-in-python Your current complexity is O(nm) where n is the length of the string and m is the number of words in the list. If you remove the lookup of the list O(m) with the lookup of the hashmap O(1), you would have O(n) which is an improvement. – kthnd Nov 08 '22 at 21:55

3 Answers3

1
string = "His name is food"
mylist = ["food", "numbers"]

if set(string.split()).intersection(mylist):
    print("stop")

or

if not set(string.split()).isdisjoint(mylist):
    print("stop")
9769953
  • 10,344
  • 3
  • 26
  • 37
  • You don't need the second `set()`, since `intersection()` will automatically convert its argument to a set. – Barmar Nov 08 '22 at 21:56
  • also, this can just be: `if not set(string.split()).isdisjoint(mylist): ...` which to me is a bit more readable. – juanpa.arrivillaga Nov 08 '22 at 21:57
  • @juanpa.arrivillaga See the note I just added. Both solutions are valid, and I think both require scanning both sets completely. – 9769953 Nov 08 '22 at 21:58
  • Well, theoretically, `isdisjoint` could exit early if it finds any item in `mylist` that is in the set. Not sure if it is implemented taht way, but I suspect it is – juanpa.arrivillaga Nov 08 '22 at 21:59
  • Ah, true, of course. I thought the other way: testing for isdisjoint to be true requires a full scan. Breaking early when false is faster. – 9769953 Nov 08 '22 at 22:00
1

For a long list of keywords, prefer a set.

text = "His name is food"
words = {"food", "numbers"}

for word in text.split(' '):
  if word in words:
     print("stop")
0
string = "His name is food"
list = ["food", "numbers"]
parsed_string = string.split(" ")
print(parsed_string)

for i in parsed_string:
  if i in list:
     print("stop")
ddastrodd
  • 710
  • 1
  • 10
  • 22