-1

I am unable to filter out a specific word in a line using Python's re module.

Suppose I want to match every word except "cat" in a line. The following code does not work:

re.search("(?!cat)", "a black cat is scary")

Please help.

Ismael Padilla

3 Answers

0

You need to set what to actually search for. Remember, computers will do what we tell them to do and nothing else.

If you're looking to buy all socks in a store except black coloured ones, you go up to them and say "I want all your socks except black coloured ones."

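What you did was essentially say "I don't want black coloured socks", without saying what you do want. The pattern needs something to match, for example \w+ for a word: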

re.search(r"(?!cat\b)\b\w+", "a black cat is scary")
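To make the difference concrete, here is a minimal sketch comparing the bare lookahead with a pattern that also says what to match. The variable names are only illustrative, and note that re.search returns the first match only.

```python
import re

text = "a black cat is scary"

# A bare negative lookahead consumes no characters, so the first match
# is an empty string at position 0.
empty = re.search(r"(?!cat)", text)
print(repr(empty.group()), empty.span())

# Adding \w+ tells the engine what to actually match: a whole word
# that is not "cat". re.search stops at the first such word.
first = re.search(r"(?!cat\b)\b\w+", text)
print(first.group())
```

The first call succeeds but returns nothing useful, which is why the original code seemed not to work.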

0

The problem is in the regular expression. On its own, (?!cat) tells the engine to find any position that is not followed by cat, so it matches an empty string at almost every position: |a| |b|l|a|c|k| c|a|t| |i|s| |s|c|a|r|y| (the pipes show where the regex engine can stop). You need to change the regular expression to \b(?!cat\b)\w+ where:

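  • \b asserts that the position is at a word boundary.
  • \w+ matches one or more word characters (\w is equivalent to [a-zA-Z0-9_] for ASCII text; in Python 3 it also matches Unicode letters and digits unless re.ASCII is set).
  • (?!cat\b) is a negative lookahead that succeeds only when the next characters are not cat followed by the end of the word.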

This regular expression skips the word cat but still matches catastrophe. Running it on "a black cat is a catastrophe" gives matches starting at the pipes: |a |black cat |is |a |catastrophe
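The match positions described above can be checked with a short sketch. It uses re.finditer so that each match's start offset is visible; the names pattern and matches are just for illustration.

```python
import re

text = "a black cat is a catastrophe"
pattern = re.compile(r"\b(?!cat\b)\w+")

# Collect (start offset, matched word) for every match in the line.
matches = [(m.start(), m.group()) for m in pattern.finditer(text)]
print(matches)
```

The word at offset 8 ("cat") is absent from the result, while "catastrophe" at offset 17 is kept because the lookahead only rejects cat when it is followed by a word boundary.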

EDIT: the original call failed because, in a normal (non-raw) string literal, Python treats \b as a backspace character, just as it treats \n, \t and \r as escape sequences.

The call needs to use a raw string: re.search(r"\b(?!cat\b)\w+", "a black cat is a catastrophe"). Note that re.search returns only the first match; if you want to get all the matches as a list, use the re.findall function.

you can find the results in here
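As a rough illustration of the raw-string point, the sketch below compares the two kinds of literal and then collects every match with re.findall. The variable names are made up for this example.

```python
import re

text = "a black cat is a catastrophe"

# In a normal string literal "\b" is one backspace character (0x08).
# In a raw string it stays as backslash + "b", which re reads as a
# word boundary.
backspace_is_one_char = ("\b" == "\x08")
raw_length = len(r"\b")

# Without the r prefix the pattern looks for backspace + "cat" + backspace.
no_match = re.search("\bcat\b", text)
# With the r prefix it looks for the whole word "cat".
match = re.search(r"\bcat\b", text)

# findall returns every non-"cat" word as a list of strings.
all_words = re.findall(r"\b(?!cat\b)\w+", text)
print(no_match, match.group(), all_words)
```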

Abdessabour Mtk
0

You need to use the re.sub function instead, which removes the word rather than matching around it:

re.sub(r"cat ", "",  "a black cat is scary") # a black is scary
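One caveat: the pattern "cat " also matches inside longer words, so "the bobcat sat" would become "the bobsat". A possible variant, assuming whole-word removal is what is wanted, adds word boundaries. The helper name remove_word is hypothetical, not part of the re module.

```python
import re

def remove_word(text, word):
    """Remove whole-word occurrences of `word` and the whitespace after them."""
    # re.escape guards against regex metacharacters in `word`;
    # \b on both sides keeps longer words like "bobcat" intact.
    pattern = r"\b" + re.escape(word) + r"\b\s*"
    # rstrip tidies the space left behind when the word ends the line.
    return re.sub(pattern, "", text).rstrip()

print(remove_word("a black cat is scary", "cat"))
```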
tbhaxor