I need to parse some simple JSON in Bash without external dependencies. The JSON contains non-ASCII characters, so I used a Python solution from this answer:

cat $JSON_FILE  | python -c "import sys, json; print json.load(sys.stdin)['$KEY']"

This works for ASCII values, but values containing other characters throw this error:

'ascii' codec can't encode character u'\u2019' in position 1212: ordinal not in range(128)

Looking at this answer I think I need to cast to the unicode type, but I don't know how.

Robert

1 Answer

You already have a unicode value; what fails is encoding it when printing.

That's either because you don't have a locale set, because your locale is set to ASCII, or because you are piping the Python result to something else (but did not include that in your question). In the last case Python refuses to guess what codec to use when connected to a pipe (otherwise it can use your terminal locale).
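
To see that the problem is the encoding step and not the JSON parsing, here is a minimal sketch that reproduces the error without any pipe. The sample string is made up for illustration; only the character U+2019 comes from the error message in the question.

```python
# Hypothetical sample value; U+2019 (right single quotation mark) is the
# character named in the error message.
text = u"it\u2019s"

try:
    # This is roughly what print has to do when the output codec is ASCII.
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    error_reason = exc.reason

# The same value encodes without trouble once the codec can represent it.
utf8_bytes = text.encode("utf-8")
```

The value itself is fine; only the choice of output codec decides whether printing succeeds.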

Set the PYTHONIOENCODING environment variable to a suitable codec; if your terminal uses UTF-8 for example:

cat $JSON_FILE  | PYTHONIOENCODING=UTF-8 python -c "import sys, json; print json.load(sys.stdin)['$KEY']"
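
If you would rather not depend on an environment variable, another option is to encode explicitly and write bytes. The following is only a sketch under some assumptions: the input file is UTF-8, the value under the key is a string, and the names `extract_utf8` and `extract_key.py` are hypothetical, not something from the question.

```python
import json
import sys


def extract_utf8(raw_bytes, key):
    """Return the value stored under `key`, encoded as UTF-8 bytes."""
    # Assumption: the JSON input is UTF-8 encoded.
    value = json.loads(raw_bytes.decode("utf-8"))[key]
    # Encode explicitly so that no locale or pipe detection is involved.
    return (u"%s\n" % value).encode("utf-8")


def main(argv):
    if len(argv) != 2:
        sys.stderr.write("usage: extract_key.py KEY < file.json\n")
        return 2
    # Python 3 exposes the byte streams as .buffer; Python 2 streams
    # already read and write bytes.
    stdin = getattr(sys.stdin, "buffer", sys.stdin)
    stdout = getattr(sys.stdout, "buffer", sys.stdout)
    stdout.write(extract_utf8(stdin.read(), argv[1]))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Saved under the hypothetical name extract_key.py, it could be called as `python extract_key.py "$KEY" < "$JSON_FILE"`. Passing the key as an argument also avoids interpolating `$KEY` into the Python source.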
Martijn Pieters