-3

How might I specify a regular expression in Python 3 such that it matches occurrences of the word "verb" only when it stands alone, not when it appears inside a longer word that contains it?

For example, in the sentence: "The man ADVERB VERB and VERB until he was ready."

The regular expression would match the two occurrences of the standalone word "VERB", but not the occurrence of "VERB" inside the word "ADVERB".

user366818

1 Answer

3

You can use the meta escape sequence \b for a word boundary:

re.findall(r'\bVERB\b', "The man ADVERB VERB and VERB until he was ready.")

There's no need to use a lookaround: \b is a zero-width assertion. It does not match any actual character, only the position between a word character and a non-word character (or the start or end of the string).
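To make the difference concrete, here is a minimal sketch that runs the pattern with and without \b on the sentence from the question. The variable names are only illustrative, and the re.IGNORECASE variant is an assumption about what the asker might want, since the question writes "verb" in lowercase while the example uses uppercase.

```python
import re

sentence = "The man ADVERB VERB and VERB until he was ready."

# Without \b the pattern also matches the "VERB" inside "ADVERB".
unbounded = re.findall(r'VERB', sentence)
print(unbounded)  # ['VERB', 'VERB', 'VERB']

# With \b on both sides, only the standalone words match.
bounded = re.findall(r'\bVERB\b', sentence)
print(bounded)  # ['VERB', 'VERB']

# finditer exposes the positions; ADVERB occupies indices 8-13 and is skipped.
spans = [m.span() for m in re.finditer(r'\bVERB\b', sentence)]
print(spans)  # [(15, 19), (24, 28)]

# Assumed variant: match regardless of case.
any_case = re.findall(r'\bverb\b', sentence, flags=re.IGNORECASE)
print(any_case)  # ['VERB', 'VERB']
```

Note the raw string prefix (r'...'): in a normal Python string literal, '\b' is a backspace character, so the regex engine would never see the word-boundary escape.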

You can read more about the \b metacharacter here: https://www.regular-expressions.info/wordboundaries.html

iBug
  • Thank you, I had no idea this existed. I had only learnt about \s, \d, \w, \S, \D, \W, so I was trying to come up with a workaround using these. Sorry for the silly question! – user366818 Aug 26 '18 at 10:09