
I'm trying to store translations of Spanish words as JSON, but converting back and forth between Python dictionaries and JSON strings seems to be messing up my data.

Here's the code:

import json 

text={"hablar":"reden"}
print(text)        # {'hablar': 'reden'}

data=json.dumps(text)
text=json.loads(data)

print(text)        # {u'hablar': u'reden'}

Why has the letter "u" been added?

lhk

3 Answers


Strings in JSON are loaded into unicode strings in Python.

When you print a dictionary, Python prints the repr() form of its keys and values, and that is where the u prefix comes from. The string itself still contains only reden.

>>> print(repr(text["hablar"]))
u'reden'
>>> print(text["hablar"])
reden
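The same point can be checked for the whole dictionary. The following is a minimal sketch (the variable names round_tripped and value are introduced here only for illustration) showing that the round trip through json.dumps and json.loads leaves the data equal to the original, and that only the repr() output differs between Python versions:

```python
import json

text = {"hablar": "reden"}

# Round trip: dict -> JSON string -> dict
round_tripped = json.loads(json.dumps(text))

# The data itself is unchanged: keys and values compare equal to the originals.
assert round_tripped == text

value = round_tripped["hablar"]
print(value)        # reden
print(repr(value))  # u'reden' on Python 2, 'reden' on Python 3
```

On Python 3 every str is already Unicode, so the u prefix no longer appears at all.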

Everything should be OK. Unicode is the preferred way to work with "human-readable" strings. JSON does not natively support binary data, so parsing JSON strings into Python unicode makes sense.
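This matters as soon as a word contains a non-ASCII character, which is common in Spanish. As a rough sketch (the entry "niño" / "Kind" is a made-up sample, not taken from the question), json.dumps escapes such characters by default and json.loads restores them unchanged:

```python
import json

# Hypothetical sample entry; "\u00f1" is the letter n with a tilde.
word = u"ni\u00f1o"

data = json.dumps({word: u"Kind"})

# With the default ensure_ascii=True, the JSON text holds an ASCII escape sequence.
assert "\\u00f1" in data

# Loading the JSON gives back the original Unicode key and value.
restored = json.loads(data)
assert restored[word] == u"Kind"
```

Passing ensure_ascii=False to json.dumps keeps the character as-is in the output instead of escaping it; either form loads back to the same string.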

You can read more about Unicode in Python here: http://docs.python.org/2/howto/unicode.html

In Python source code, Unicode literals are written as strings prefixed with the ‘u’ or ‘U’ character: u'abcdefghijk'.

Messa

json.loads reads strings as unicode [1]. In general, this shouldn't hurt anything.

[1] The u appears only in the unicode object's representation; try print(text['hablar']) and no u will be present.

mgilson
  • Great, thanks for the quick answer. I'll accept Messa's answer since he has been a little more elaborate. – lhk Feb 13 '14 at 16:09

The u prefix means that the string is unicode. If you want to avoid it, you can use YAML instead of JSON, so the code becomes:

import yaml 

text={"hablar":"reden"}
print(text)        # {'hablar': 'reden'}

data=yaml.dump(text)
text=yaml.safe_load(data)   # plain yaml.load() needs an explicit Loader in current PyYAML

print(text)        # {'hablar': 'reden'}

This question has more details about YAML: What is the difference between YAML and JSON? When to prefer one over the other

Community