
How can I convert the string below to JSON in Python 3?

This is my code:

import ast
mystr = b'[{\'1459161763632\': \'this_is_a_test\'}, {\'1459505002853\': "{\'hello\': 12345}"}, {\'1459505708472\': "{\'world\': 98765}"}]'
chunk = str(mystr)
chunk = ast.literal_eval(chunk)
print(chunk)

Running it under Python 2, I get:

[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]

Running it under Python 3, I get:

b'[{\'1459161763632\': \'this_is_a_test\'}, {\'1459505002853\': "{\'hello\': 12345}"}, {\'1459505708472\': "{\'world\': 98765}"}]'

How can I run this under Python 3 and get the same result as under Python 2?

Iron Fist
waas1919

2 Answers


mystr is a bytes object. Instead of calling str() on it, decode it to a str (ASCII is enough here) and then evaluate the result:

>>> ast.literal_eval(mystr.decode('ascii'))
[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]

Or, more generally, to handle non-ASCII characters as well, decode it as UTF-8:

>>> ast.literal_eval(mystr.decode('utf-8'))
[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]

The default encoding for decode is UTF-8, as its help text shows:

>>> help(mystr.decode)
Help on built-in function decode:

decode(...) method of builtins.bytes instance
    B.decode(encoding='utf-8', errors='strict') -> str
...

So you can leave out the encoding argument:

>>> ast.literal_eval(mystr.decode())
[{'1459161763632': 'this_is_a_test'}, {'1459505002853': "{'hello': 12345}"}, {'1459505708472': "{'world': 98765}"}]
Iron Fist
    Good answer, I've added some information about the source of the confusion, namely the 'b' prefix, to an answer below. Feel free to edit that into your own answer if you think it will help other readers. – Tom Rees Apr 01 '16 at 10:48
    @TomRees ... Let's keep it for your answer, you did effort for that, so you must be rewarded... ;) – Iron Fist Apr 01 '16 at 10:52

Iron Fist beat me to the fix. To extend his answer: the b prefix tells Python 3 (but not Python 2) to interpret the literal as a byte sequence, not a string.

As a result, the .decode method is needed to convert the bytes back into a string. Python 2 makes no such distinction (there, bytes is simply an alias for str and the b prefix is ignored), hence the difference.

See "What does the 'b' character do in front of a string literal?" for more information.
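
To make the difference concrete, here is a small Python 3 sketch using a shortened, made-up input (raw is not the data from the question). It shows why wrapping bytes in str() does not help, while decoding does:

```python
import ast

raw = b"[1, 2, 3]"

# str() on a bytes object returns its repr, including the b prefix and quotes.
wrapped = str(raw)

# Evaluating that repr simply rebuilds the original bytes object.
still_bytes = ast.literal_eval(wrapped)

# Decoding yields the text itself, which evaluates to the list.
as_list = ast.literal_eval(raw.decode("utf-8"))

print(type(still_bytes).__name__, type(as_list).__name__)
```

This is the same round trip the question's code performs: str(mystr) produces a string that is a bytes literal, so ast.literal_eval hands the bytes straight back.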

Tom Rees