I'm getting strange behaviour when running this code:
regex.search(ur'([^\p{IsAlnum}\s\.\'\`\,\-])', u'\U0001f618')
This should match \U0001f618, which is the Unicode escape for a kissing emoji. The result, however, is the following:
<regex.Match object; span=(0, 1), match=u'\ud83d'>
This doesn't make sense at all, because u'\ud83d' is not even a valid Unicode character.
I expected this instead:
<regex.Match object; span=(0, 1), match=u'\U0001f618'>
What is happening here?
I'm running Python 2.7.13 on macOS Sierra 10.12.6; regex.__version__ is 2.4.130.
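In case it helps with diagnosis, here is a small check of how the emoji is encoded in UTF-16. It is only a sketch: the helper name utf16_code_units is my own, and the idea that this relates to how my Python build stores Unicode strings is a guess I have not confirmed. The first code unit it prints is the same value that appears in the match above.

```python
import sys

def utf16_code_units(text):
    """Return the UTF-16 code units (as integers) that encode `text`.

    Hypothetical helper for illustration; not part of the regex module.
    """
    data = bytearray(text.encode('utf-16-be'))
    # Each UTF-16 code unit is two bytes, big-endian here.
    return [(data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)]

emoji = u'\U0001f618'
units = utf16_code_units(emoji)

# U+1F618 lies outside the Basic Multilingual Plane, so UTF-16 needs two units.
print([hex(u) for u in units])  # ['0xd83d', '0xde18']

# 65535 would indicate a "narrow" build (strings stored as UTF-16 code units);
# 1114111 would indicate a "wide" build (one item per code point).
print(sys.maxunicode)
```

If sys.maxunicode turns out to be 65535 here, then len(u'\U0001f618') would be 2 rather than 1, which might explain why the match covers only the first half of the pair.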