I have a pattern like this (for finding three-word abbreviations):
s='([A-Z][a-z]+ ){2,4}\([A-Z]{2,4}\)'
and I want to match it against this string:
line='National Health Service (NHS)'
p=re.findall(s,line)
but p is only ['Service '] and not the whole string. Why?
You are not grouping the match correctly. Use this instead:
s='(?:[A-Z][a-z]+ ){2,4}\([A-Z]{2,4}\)'
.findall() returns the whole match unless you define capturing groups, written (...), at which point it returns the contents of the group instead. Because your group is repeated, only its last repetition, 'Service ', is kept. The pattern above uses a non-capturing group, written (?:...), instead. Since that leaves your expression without any capturing groups, .findall() returns full matches again.
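As a quick check of the behaviour described above, here is a minimal sketch that runs both patterns against the line from the question. The variable names are only illustrative, and the re.finditer() variant at the end is an alternative that is not part of the original answer: it keeps the capturing group but reads the full match from each match object.

```python
import re

line = 'National Health Service (NHS)'

# Capturing group: findall() returns the group's content, and a repeated
# group keeps only its last repetition.
capturing = re.findall(r'([A-Z][a-z]+ ){2,4}\([A-Z]{2,4}\)', line)

# Non-capturing group: no capturing groups remain, so findall() returns
# the full match.
non_capturing = re.findall(r'(?:[A-Z][a-z]+ ){2,4}\([A-Z]{2,4}\)', line)

# Alternative (an addition, not from the answer): finditer() yields match
# objects, and group(0) is always the full match.
via_finditer = [
    m.group(0)
    for m in re.finditer(r'([A-Z][a-z]+ ){2,4}\([A-Z]{2,4}\)', line)
]

print(capturing)      # ['Service ']
print(non_capturing)  # ['National Health Service (NHS)']
print(via_finditer)   # ['National Health Service (NHS)']
```

The patterns are written as raw strings (r'...') so that the backslashes in \( and \) reach the regex engine unchanged.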