I'm trying to convert an xml with Hebrew characters in it, to json.
here's a piece of the xml:
<Root><City>ירושלים</City></Root>
I've tried several methods and encodings, but I always get this unconvertable ascii code instead of Hebrew characters.:
\u05e7\u05e0\u05d9\u05d5\u05df
Here's my conversion script:
import xmltodict, json
import sys
infile = open('text.xml','r')
outfile = open('text.json','w')
xmltxt = infile.read().decode('utf_16_le')
decodedXML = xmltodict.parse(xmltxt)
jsontext = json.dumps(decodedXML, encoding='utf_8')
outfile.write(jsontext.encode('utf_8'))
infile.close()
outfile.close()
Thank you
EDIT - Solution: Solved by adding 'ensure_ascii=False' to json.dumps()