I want to use the re module in Python to search for some text in a large data set. Maybe I've got something wrong, but if I use
(\w+,\s?)+
I would expect a match for something like:
This, is, a, test,
Why isn't this the case in Python?
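To make the expectation concrete, here is a minimal sketch (the names pattern, sample and whole are only illustrative, not from my real code). As far as I can tell, the pattern itself does match the whole string, and only the value returned by findall looks different:

```python
import re

pattern = re.compile(r'(\w+,\s?)+')
sample = 'This, is, a, test,'

# fullmatch() succeeds only if the entire string matches the pattern.
whole = pattern.fullmatch(sample)
print(whole.group(0))            # the complete matched text

# With exactly one capturing group, findall() returns that group's text,
# and a repeated group keeps only its last repetition.
print(pattern.findall(sample))   # ['test,']
```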
The following example only works when I use [] instead of ():
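import re

# Sample input; the name "text" avoids shadowing the built-in str.
text = 'St. aureus°, unimportant_stuff, Strep. haemol.°'
will_fail = re.compile(r'(\w+\.?\s?)+°')  # repeated capturing group
success = re.compile(r'[\w+\.?\s?]+°')    # character class
print(will_fail.findall(text))
print(success.findall(text))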
This produces the output:
['aureus', 'haemol.']  # not what I expected
['St. aureus°', ' Strep. haemol.°']  # this is what I want
What am I doing wrong here?
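One likely explanation, offered as a sketch rather than a confirmed diagnosis of the real data set: when a pattern contains a capturing group, re.findall returns the group's text instead of the whole match, and a repeated group retains only its final repetition. Making the group non-capturing with (?:...), or reading group(0) from re.finditer, should give the full matches. The names non_capturing and capturing below are illustrative.

```python
import re

text = 'St. aureus°, unimportant_stuff, Strep. haemol.°'

# (?:...) groups without capturing, so findall() returns whole matches.
non_capturing = re.compile(r'(?:\w+\.?\s?)+°')
print(non_capturing.findall(text))

# Alternatively, keep the capturing group and read group(0),
# the entire match, from each match object.
capturing = re.compile(r'(\w+\.?\s?)+°')
print([m.group(0) for m in capturing.finditer(text)])
```

Note that [\w+\.?\s?] is a character class: inside the brackets, +, . and ? are literal characters, so it matches any run of word characters, whitespace and those symbols rather than the structured sequence the parenthesized version describes.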