Why are 'u' prefixes still printing even though I used str()?

Question

New to Python. Python version: 2.7.10. Machine: macOS Sierra.

Susi Sushanti Don $ python -c "import sys, json; print(json.load(open('/tmp/2.json'))['pages'])"
{u'giga-10': [u'overview']}

Susi Sushanti Don $ python -c "import sys, json; print(str(json.load(open('/tmp/2.json'))['pages']))"
{u'giga-10': [u'overview']}

Why is Python still printing the u character even though I used str()? I read in another post that using string will not print it in the std output. Is there a similar str()-like function that works on any Python data object, so that I don't have to write a reusable function myself?

I'm expecting the output to be just {'giga-10': ['overview']}

The `str` here does absolutely *nothing*. `print` implicitly uses `str` — juanpa.arrivillaga, Aug 17 '17 at 22:12
Also, "using string will not print it in the std output" is not true. I'm not sure what you read, but it was either mistaken or you misunderstood it. — juanpa.arrivillaga, Aug 17 '17 at 22:15
1. Are you sure you have python linked to python2.7 and not python3? 2. Do you have `__init__.py` file in your current folder? — Dekel, Aug 17 '17 at 22:16
Possible duplicate of [Python: json.loads returns items prefixing with 'u'](https://stackoverflow.com/questions/13940272/python-json-loads-returns-items-prefixing-with-u) — Dekel, Aug 17 '17 at 22:19
specifically check this answer: https://stackoverflow.com/a/13940357/5037551 (which I'm not sure why isn't marked as the right one) — Dekel, Aug 17 '17 at 22:20
It's because the `repr()` of the _contents_ of the result, each item in the dictionary, are being `print`ed—rather than the string of each one. — martineau, Aug 17 '17 at 22:31

score 2 · Answer 1 · answered Aug 17 '17 at 22:31

It is still printing u because your object contains unicode objects. When Python prints a container such as a dict or list, it shows the repr() of each element, and the repr() of a unicode object includes the u prefix.

>>> x = {u'giga-10': [u'overview']}
>>> print x
{u'giga-10': [u'overview']}

It makes sense that you have unicode objects, since you are deserializing JSON, and the appropriate data structure that corresponds to a JSON string is a Python 2 unicode object.

Note, if you print a unicode object, it doesn't print the u, since the u isn't actually a part of the unicode string:

>>> print u"hello"
hello
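
As a rough illustration of the point above, the sketch below shows that calling str() on a container gives the same text as repr(), while printing a single string drops the quotes and any prefix. The inline JSON text is a hypothetical stand-in for /tmp/2.json, and the variable names are made up for this example.

```python
import json

# Hypothetical inline data standing in for /tmp/2.json
data = json.loads('{"pages": {"giga-10": ["overview"]}}')
pages = data["pages"]

# str() of a dict or list is built from the repr() of its contents,
# so wrapping the container in str() changes nothing.
same = str(pages) == repr(pages)
print(same)  # True

# Printing a single string uses str() on that string itself,
# so neither quotes nor a prefix appear.
first_key = list(pages)[0]
print(first_key)  # giga-10
```

This is why the second command in the question prints exactly what the first one does.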

This really doesn't matter. You should just not let it bother you. But if you insist, for some crazy reason, on getting rid of those u prefixes, then you have to convert any unicode objects inside an arbitrary object deserialized from JSON to str types. That requires encoding the unicode object. As long as you aren't providing any hooks, the following should work for any result of json.load:

>>> def stringify(obj):
...     if isinstance(obj, unicode):
...         return obj.encode('utf8')
...     elif isinstance(obj, list):
...         return [stringify(x) for x in obj]
...     elif isinstance(obj, dict):
...         return {stringify(k):stringify(v) for k,v in obj.iteritems()}
...     else:
...         return obj
...
>>> print stringify(x)
{'giga-10': ['overview']}

But there is no good reason to do this, unless you really, truly need Python 2 str, i.e. "byte-strings". You almost certainly do not, or at least you haven't indicated any reason why you would.
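
To illustrate why the prefix is harmless, here is a minimal sketch, assuming ASCII-only content like the data in the question. It shows that a plain string literal still finds the key and compares equal to the deserialized value. The inline JSON text and variable names are hypothetical.

```python
import json

# Hypothetical inline data mirroring the "pages" value from the question
pages = json.loads('{"giga-10": ["overview"]}')

# A plain string literal finds the key: no u prefix is needed for the lookup.
values = pages['giga-10']

# The deserialized value compares equal to a plain literal,
# and the u is not counted as part of the string.
first = values[0]
is_equal = first == 'overview'
print(is_equal)    # True
print(len(first))  # 8
```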

score 1 · Answer 2 · AKS · Accepted Answer · 2017-08-18T18:16:28.430

For this, why not use the jq utility for a one-liner?

You can achieve this with:

$ echo `jq ".pages" /tmp/2.json`
{ "giga-10": [ "overview" ] }

Don't forget to check out this URL: https://jqplay.org/ it really helped me learn / watch how jq will play with the input data.

edited Aug 18 '17 at 18:16

answered Aug 17 '17 at 22:42


I like this idea of using jq – shanti Aug 17 '17 at 22:44
Aside from having nothing to do with the question, using `echo` with a command substitution is completely pointless. Just use `jq ".pages" /tmp/2.json`. There is no reason to capture the standard output of `jq` just to immediately write it back to standard output. – chepner Aug 20 '17 at 17:14

score 1 · Answer 3 · answered Aug 18 '17 at 18:31

If you are going to use Python, you probably want json.dumps(), e.g.:

$ cat data.json
{"pages": {"giga-10": ["overview"]}}

$ python -c 'import sys, json; x = json.load(open(sys.argv[1])); print json.dumps(x["pages"])' data.json
{"giga-10": ["overview"]}
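
The same idea can be sketched as a short script rather than a one-liner. The inline JSON text is a hypothetical stand-in for data.json. Note that json.dumps produces JSON text, so the strings come out in double quotes rather than the single-quoted form the question asked for.

```python
import json

# Hypothetical inline data standing in for data.json
data = json.loads('{"pages": {"giga-10": ["overview"]}}')

# json.dumps returns JSON text, so strings are rendered with double quotes
# and no u prefix, regardless of the Python string type underneath.
text = json.dumps(data["pages"])
print(text)  # {"giga-10": ["overview"]}

# indent and sort_keys are optional and only affect layout and key order.
pretty = json.dumps(data["pages"], indent=2, sort_keys=True)
print(pretty)
```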
