I am using Python's re module to capture all forms of the word color in American English (AmE) and British English (BrE). I successfully captured almost all of them, with the exception of words that end with an apostrophe, e.g. colors’.
This problem is from Watt's book Beginning Regular Expressions.
Here's sample text:
Red is a color.
His collar is too tight or too colouuuurful.
These are bright colours.
These are bright colors.
Calorific is a scientific term.
“Your life is very colorful,” she said.
color (U.S. English, singular noun)
colour (British English, singular noun)
colors (U.S. English, plural noun)
colours (British English, plural noun)
color’s (U.S. English, possessive singular)
colour’s (British English, possessive singular)
colors’ (U.S. English, possessive plural)
colours’ (British English, possessive plural)
Here's my regex: \bcolou?r(?:[a-zA-Z’s]+)?\b
Explanation:
\b             # start at a word boundary
colou?r        # the u is optional, so both AmE and BrE spellings match
(?:            # start of non-capturing group
[a-zA-Z’s]+    # color may be followed by a suffix (e.g. ful) or an apostrophe
)?             # end of non-capturing group; the suffix is optional
\b             # end at a word boundary
The issue is that colors’ and colours’ are matched only up to the s; the apostrophe is left out. Can someone please explain what is wrong with my code? I looked at the SO question "Regex Apostrophe how to match?", but the problems there are about escaping ' and ".
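To make the problem easy to reproduce, here is a minimal sketch that runs my regex over the four possessive lines from the sample text. The variable names (pattern, text, matches) are only for illustration.

```python
import re

pattern = re.compile(r"\bcolou?r(?:[a-zA-Z’s]+)?\b")

# The four possessive lines from the sample text.
text = (
    "color’s (U.S. English, possessive singular)\n"
    "colour’s (British English, possessive singular)\n"
    "colors’ (U.S. English, possessive plural)\n"
    "colours’ (British English, possessive plural)\n"
)

# No capturing groups, so findall returns the full text of each match.
matches = pattern.findall(text)
print(matches)
# → ['color’s', 'colour’s', 'colors', 'colours']
```

The possessive singular forms come back complete, while the possessive plural forms lose the trailing apostrophe. The only difference I can see is that the plural forms end in the apostrophe itself, so my guess is that the final \b is involved, but I do not understand why.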
Here's Regex101
Thanks in advance.