Print nested iterable containing unicode emoji in python 3

Question

I'm currently learning to use the Facebook Graph API to retrieve a list of posts from groups. I've figured out how to make a valid request and I know the response is correct, but it's hard to inspect because one of the posts contains emoji, which exist only as Unicode characters, and I do not know how to deal with them.

Here's the more detailed setup:

response = graph.get_object(id=GROUP_ID, fields="feed")

This is supposed to return a dictionary with all the elements, which ends up being a multilayered iterable: a dictionary containing a dictionary containing a list of posts, each of which is itself a dictionary. One of the posts has a message body that contains emoji.

Attempting to print the dictionary for inspection gives

UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f605' in position 4323: character maps to <undefined>

I've read another SO post that says I can pickle the dictionary; make it a string, then encode, then decode, then unpickle like so:

64_group = pickle.dumps(group).encode('base64', 'strict')
group = pickle.loads(utf8_group.decode('base64', 'strict'))

But that instead results in a new error (I tried encoding with both utf-8 and base64, with the same result):

File "main.py", line 19, in <module>  
     utf8_group = pickle.dumps(group).encode('base64', 'strict')  
AttributeError: 'bytes' object has no attribute 'encode'

How can I safely inspect and later work with these request results when they could have any level of nesting, and the unicode characters could be in any, all, or none of the levels? Is there a way to 'sanitize' my dictionary after it is returned so I can work with its levels like normal?

If it helps, my eventual goal is to pull these into a sqlite or mysql database and serve them through php (I don't know javascript).

@stovfl I think the response is already decoded, since the OP is experiencing a Unicode**En**codeError, ie. the opposite conversion (which probably happens in a subsequent step, when the response is being printed). — lenz, Jul 22 '17 at 20:40
@Austin It looks like your environment (command line, [power] shell, IDE...) is unable to print the emoticon (you work on Windows, right?). Try to reconfigure to an encoding like UTF-8 or UTF-16, otherwise switch to another tool to run your Python script. Or write to a file (opened with `encoding='utf8'`) instead of using `print()`. — lenz, Jul 22 '17 at 20:46
@lenz yes on all accounts. Writing to a file seemed to work just fine - I actually outputted it to html and then viewed in the browser. I guess the windows command line just doesn't support it, and that's a can of worms I'll open later when I need to get serious about picking a development environment. How do I mark your comment as an answer? — Austin, Jul 22 '17 at 22:00
`64_group = pickle.dumps(group).encode('base64', 'strict')` There's no way this is working, right? — Cory Madden, Jul 22 '17 at 23:58
@cory Yeah, turns out I didn't need pickle at all. Or any conversions of any kind now that I'm writing it out to a file. It's very readable as is. — Austin, Jul 23 '17 at 00:08
I'm glad you got it working, but I just meant the variable starting with a number. I just looked at the post you referenced before that and I'm guessing you just left off the `b` when you copy/pasted. I was just confused by the fact that an obvious syntax error wasn't breaking your script. — Cory Madden, Jul 23 '17 at 00:10
@Austin, you can't accept a comment – never mind. If my comment was enough to solve your problem, then there's no point in elaborating it into a nice answer. I think it would be unlikely to help future readers, because the problem statement is too specific. — lenz, Jul 23 '17 at 18:53
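
For future readers, here is a minimal sketch of the approach that came out of the comments: write the nested response to a file opened with `encoding='utf8'` instead of printing it to a console that cannot encode emoji. The `response` dictionary, the file name `feed.json`, and the helper names `dump_to_file` and `console_safe` are made up for illustration; the real Graph API response will have a different shape. It also assumes the response contains only JSON-compatible types (dicts, lists, strings, numbers), which should hold for data decoded from a JSON API.

```python
import json

# Hypothetical stand-in for the nested Graph API response described in the question.
response = {"feed": {"data": [{"id": "1", "message": "Made it \U0001f605"}]}}


def dump_to_file(obj, path):
    """Write obj as indented JSON to a UTF-8 file, keeping emoji as real characters."""
    with open(path, "w", encoding="utf8") as f:
        json.dump(obj, f, ensure_ascii=False, indent=2)


def console_safe(obj):
    """Return an ASCII-only rendering of obj; non-ASCII characters become \\uXXXX escapes."""
    # ensure_ascii defaults to True, so the result can be printed on any console encoding.
    return json.dumps(obj, indent=2)


dump_to_file(response, "feed.json")  # open this file in an editor or browser to inspect it
print(console_safe(response))        # safe to print even on a 'charmap' console
```

Neither helper changes the dictionary itself, so no 'sanitizing' is needed: the data is fine at every nesting level, and only the step that encodes it for output has to be able to represent the emoji.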
