3

I have a list of dishes and a sentence.

I want to check if a dish is present in the sentence. However, if I do a normal

if dish in sentence:

I will basically get substring matching. For example, say I have

dish='puri'
sentence='the mutton shikampuri was good'

The above code still matches puri with shikampuri, which I don't want.

If I tokenize the sentence instead, I can no longer match multi-word dishes such as dish='puri bhaji'.

Is there any way to ignore matches where my dish string does not start at the beginning of a word? Basically, I want to ignore patterns like 'shikampuri' when my dish is 'puri'.

pd176

2 Answers

5

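What you need is re.search with \b, which matches only at a word boundary.

import re
# re.escape keeps any regex metacharacters in dish literal
if re.search(r"\b" + re.escape(dish) + r"\b", sentence):
    print("dish found")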
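Since the question mentions a list of dishes, here is a minimal sketch of how the word-boundary check might be applied to each one. The function name find_dishes and the sample list are hypothetical, chosen only for illustration, and re.escape is an added safeguard that assumes dish names should be matched literally.

```python
import re

def find_dishes(dishes, sentence):
    """Return the dishes that occur in sentence as whole words or phrases."""
    found = []
    for dish in dishes:
        # \b on both sides requires the match to start and end at a word boundary;
        # re.escape treats characters such as '.' or '+' in a dish name literally.
        pattern = r"\b" + re.escape(dish) + r"\b"
        if re.search(pattern, sentence):
            found.append(dish)
    return found

# Hypothetical sample data based on the question
dishes = ["puri", "puri bhaji", "mutton shikampuri"]
print(find_dishes(dishes, "the mutton shikampuri was good"))  # ['mutton shikampuri']
print(find_dishes(dishes, "we ordered puri bhaji and tea"))   # ['puri', 'puri bhaji']
```

Note that 'puri' is not reported for the first sentence, because inside 'shikampuri' it is preceded by a letter and so does not start at a word boundary.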
vks
  • This was what I needed. Thanks :) On a side note, what does the 'r' in r"\b" represent? – pd176 Dec 03 '15 at 17:33
  • @pd176: [What does the “r” in pythons re.compile(r' pattern flags') mean?](http://stackoverflow.com/questions/21104476/what-does-the-r-in-pythons-re-compiler-pattern-flags-mean) – GingerPlusPlus Dec 03 '15 at 18:52
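As a short illustration of the 'r' prefix asked about in the comment above: in an ordinary Python string literal "\b" is the backspace character, while the raw string r"\b" keeps the backslash so that the re module sees its word-boundary token. The variable names below are only for illustration.

```python
import re

plain = "\b"   # ordinary literal: a single backspace character (ASCII 8)
raw = r"\b"    # raw literal: a backslash followed by the letter b

print(len(plain), len(raw))  # 1 2

sentence = "the mutton shikampuri was good"

# With raw strings the pattern contains real word-boundary tokens.
raw_match = bool(re.search(r"\bshikampuri\b", sentence))

# Without the prefix the pattern looks for literal backspace characters,
# which the sentence does not contain.
plain_match = bool(re.search("\bshikampuri\b", sentence))

print(raw_match, plain_match)  # True False
```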
2

You could write:

if dish in sentence.split():

This splits the sentence on whitespace into a list of words and checks whether dish is one of them.
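A small sketch of what the membership test compares against, using the sample values from the question. The variable names are illustrative only.

```python
sentence = "the mutton shikampuri was good"
words = sentence.split()
print(words)  # ['the', 'mutton', 'shikampuri', 'was', 'good']

# Each token is compared as a whole, so 'puri' no longer matches 'shikampuri'.
single_word_hit = "puri" in words
print(single_word_hit)  # False

# A token never contains a space, so a two-word dish cannot equal any token.
phrase_hit = "puri bhaji" in "the puri bhaji was good".split()
print(phrase_hit)  # False
```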

itsafire
  • It's a nice, clean one-liner. I like it. – McGlothlin Dec 03 '15 at 17:30
  • Yes. But this solution cannot find strings containing spaces, e.g. dish='puri bhaji'. I missed that requirement when answering, so the regex solution is the better choice here. – itsafire Dec 04 '15 at 07:47