I was experimenting with regex in Python 2.7.3 and came across behavior I did not expect. The code below returns False when checked against the "ß" character, as well as other accented characters like "Å", "Í", etc. In addition to returning False for the "ø" character, it also returns False for other accented characters such as "å", "Å", "ç", "Ç", "Â", etc.
Case in point: I'm not sure why accented characters behave differently from other characters like "¥", which it has no problem with. They all have different Unicode/UTF-8 values (UTF-8 is what my encoding is set to), so I'm not sure where the difference lies.
    import re

    def regex_check(name):
        # Match any single character other than "ß" at the start of the string
        pattern = '[^ß]'
        if re.match(pattern, str(name), re.IGNORECASE):
            return True
        else:
            return False

    print regex_check("ø")
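To compare what the regex actually sees, here is a minimal sketch that works on the UTF-8 bytes explicitly. It assumes the string literals above are plain byte strings encoded as UTF-8, which is how Python 2 treats unprefixed literals. The helper name `first_byte_check` is made up for illustration; it is a diagnostic, not a fix.

```python
# -*- coding: utf-8 -*-
import re

def first_byte_check(pattern_char, candidate):
    """Repeat the check on explicit UTF-8 bytes (hypothetical helper)."""
    # Assumption: both literals reach `re` as UTF-8 encoded byte strings,
    # so the character class holds individual bytes, not one character.
    pattern = b'[^' + pattern_char.encode('utf-8') + b']'
    data = candidate.encode('utf-8')
    return re.match(pattern, data, re.IGNORECASE) is not None

for ch in [u'ø', u'å', u'ç', u'¥']:
    # Show the encoded bytes next to the match result
    print('%r %s' % (ch.encode('utf-8'), first_byte_check(u'ß', ch)))
```

In UTF-8, "ß", "ø", "å" and "ç" all begin with the byte 0xC3, while "¥" begins with 0xC2. If the pattern is being read byte by byte, that shared lead byte could be what sets the accented characters apart.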
Am I missing something obvious? Thanks for the help.