As an alternative to Python's re module, you can do this explicitly with the regex library, which supports set operations for character classes:
The operators, in order of increasing precedence, are:
|| for union (“x||y” means “x or y”)
~~ (double tilde) for symmetric difference (“x~~y” means “x or y, but not both”)
&& for intersection (“x&&y” means “x and y”)
-- (double dash) for difference (“x--y” means “x but not y”)
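To make the four operations concrete, here is a rough analogy using Python's built-in set type. It is only an illustration of the semantics, not how the regex library implements them; the variable names are made up for this sketch.

```python
import string

lowercase = set(string.ascii_lowercase)  # plays the role of [a-z]
vowels = set("aeiou")                    # plays the role of [aeiou]
a_to_e = set("abcde")                    # plays the role of [a-e]

union = vowels | a_to_e          # analogous to [[aeiou]||[a-e]]
sym_diff = vowels ^ a_to_e       # analogous to [[aeiou]~~[a-e]]
intersection = vowels & a_to_e   # analogous to [[aeiou]&&[a-e]]
consonants = lowercase - vowels  # analogous to [[a-z]--[aeiou]]

print(sorted(union))
print(sorted(sym_diff))
print(sorted(intersection))
print(len(consonants))
```

The character class then matches exactly the characters that end up in the resulting set.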
So to match only consonants, your regular expression could be:
>>> import regex
>>> regex.findall('[[a-z]&&[^aeiou]]+', 'abcde', regex.VERSION1)
['bcd']
Or equivalently using set difference:
>>> regex.findall('[[a-z]--[aeiou]]+', 'abcde', regex.VERSION1)
['bcd']
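If you can't install regex, one way to get the same result with the standard re module is to emulate the intersection with a negative lookahead. This is a minimal sketch for the consonant case only; the name CONSONANTS is just for illustration.

```python
import re

# The negative lookahead (?![aeiou]) rejects a vowel at the current position
# before [a-z] consumes a character, so each repetition matches one
# lowercase letter that is not a vowel.
CONSONANTS = re.compile(r'(?:(?![aeiou])[a-z])+')

result = CONSONANTS.findall('abcde')
print(result)  # ['bcd']
```

This is more verbose than the set syntax, and it checks the lookahead once per character, but it needs no third-party dependency.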