Looping over characters of unicode string showing unexpected code point

Asked Oct 14 '17 at 18:30

Active Oct 14 '17 at 18:35

Viewed 43 times

For example:

    x = u'\U0001f622'
    [c for c in x]

returns [u'\ud83d', u'\ude22'] instead of the expected [u'\U0001f622'].

\U0001f622 is an emoji. I have an emoji dict that I'm using to detect emojis in text, so I need the loop to yield \U0001f622 rather than the pair u'\ud83d', u'\ude22'.

I'm using Python 2.7. According to this, u'\ud83d', u'\ude22' is the C/C++/Java source code of that emoji.

How do I do that? Thanks.
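
For context, the two values in the output look like the UTF-16 surrogate pair for U+1F622, which is how a narrow Python 2 build stores any code point above U+FFFF. A rough sketch of that arithmetic follows; the function name `to_surrogate_pair` is made up for illustration and is not part of any library:

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    offset = code_point - 0x10000          # 20-bit value left after removing the BMP range
    high = 0xD800 + (offset >> 10)         # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogate_pair(0x1F622)
print("%s %s" % (hex(high), hex(low)))     # 0xd83d 0xde22
```

So the loop is iterating over UTF-16 code units rather than whole code points, which would explain the two-element list.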

idoda

In Python 3.5, it's returning the expected result `['\U0001f622']`. What version are you running? – Stack Oct 14 '17 at 18:32
2.7.. I'm editing the question – idoda Oct 14 '17 at 18:34
Yes, it's a duplicate indeed, but that question was not answered. – idoda Oct 14 '17 at 18:38
@idoda: [The answer](https://stackoverflow.com/a/46714847/190597) is to use a wide Python build or Python 3.3 or newer. Is there some other aspect to the question that was not answered? – unutbu Oct 14 '17 at 18:40
Yes, you are right. I'll do that, thanks – idoda Oct 14 '17 at 18:41
@unutbu How is it done, installing a wide build of Python 2.7? I can't find it. – idoda Oct 14 '17 at 21:02
Are you using Windows? – unutbu Oct 14 '17 at 21:07
@unutbu No, I'm using a Mac. I saw that rebuilding Python on a Mac might be unsafe. I'll just change the code or switch to Python 3 (it's about time anyway). – idoda Oct 14 '17 at 21:13
While it is possible to [compile a wide build of Python 2.7 on OS X](https://stackoverflow.com/a/25112348/190597), you would probably benefit more (in the long run) by switching to Python 3. (Note, though, that it may take effort to [port your code base](https://docs.python.org/3/howto/pyporting.html).) – unutbu Oct 14 '17 at 21:23
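
If changing the code is the route taken instead of a wide build or Python 3, one possible workaround is to rejoin surrogate pairs while iterating. This is only a sketch, not something taken from the linked answers: `iter_code_points` and `emoji_names` are hypothetical names, and `emoji_names` is just a stand-in for the emoji dict mentioned in the question. `sys.maxunicode` is the standard way to tell a narrow build (0xFFFF) from a wide one (0x10FFFF).

```python
import sys

# True on a narrow Python 2 build, False on a wide build and on Python 3.3+.
is_narrow_build = sys.maxunicode == 0xFFFF


def iter_code_points(text):
    """Yield whole characters, rejoining UTF-16 surrogate pairs when present."""
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        # A high surrogate followed by a low surrogate forms one character.
        if (u'\ud800' <= ch <= u'\udbff' and i + 1 < n
                and u'\udc00' <= text[i + 1] <= u'\udfff'):
            yield text[i:i + 2]
            i += 2
        else:
            yield ch
            i += 1


# Hypothetical stand-in for the emoji dict from the question.
emoji_names = {u'\U0001f622': 'crying face'}
text = u'so sad \U0001f622'
found = [c for c in iter_code_points(text) if c in emoji_names]
print(found == [u'\U0001f622'])  # True
```

On a narrow build the literal `u'\U0001f622'` is itself stored as the two-unit pair, so the rejoined slice compares equal to the dict key. On a wide build or Python 3 the surrogate branch never triggers for this input and the function behaves like plain iteration.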

0 Answers