I have a JSON file that contains strings encoded like this:
"sender_name": "Horn\u00c3\u00adkov\u00c3\u00a1",
I am trying to parse this file using the json module, but I am not able to decode this string correctly.
What I get after decoding the JSON with the .load() method is 'HornÃ\xadkovÃ¡'. The string should be decoded as 'Horníková' instead.
I read the JSON specification and I understand that \u should be followed by 4 hexadecimal digits specifying the Unicode code point of the character. But it seems that in this JSON file, the individual bytes of the UTF-8 encoding are stored as \u-sequences.
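To illustrate what I mean, here is a minimal sketch that reproduces the behaviour on the string above. It rests on my assumption that each \u00XX escape holds one byte of UTF-8 rather than a real code point; the variable names are only for illustration. Re-encoding the decoded text as Latin-1 (which maps code points 0 to 255 onto the same byte values) and then decoding those bytes as UTF-8 seems to give back the expected name, which is why I suspect this is what happened.

```python
import json

# The problematic value, embedded in a tiny JSON document.
# A raw string keeps the \u escapes intact for the JSON parser.
raw = r'{"sender_name": "Horn\u00c3\u00adkov\u00c3\u00a1"}'

# json decodes every \u00XX escape to a single character in U+0000..U+00FF.
decoded = json.loads(raw)["sender_name"]

# Assumption: those characters are really UTF-8 bytes in disguise.
# Latin-1 turns each character back into the byte with the same value,
# and the resulting bytes are then interpreted as UTF-8.
repaired = decoded.encode("latin-1").decode("utf-8")

print(repr(decoded))
print(repaired)
```

This only supports the guess for this one string. It would raise an error for any value containing characters above U+00FF or byte sequences that are not valid UTF-8, so I am not sure it is the right general approach.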
What type of encoding is this, and how do I parse it correctly in Python 3?
Is a file like this even valid JSON according to the specification?