I am sure that the problem I am experiencing isn't directly related to the OS or version, but to some kind of setup issue. In a Python Django app I am writing JSON to a file, and the JSON can contain characters from other languages such as 我, ל, and は. As far as I am aware, I am not changing the encoding at any point in the flow of data.

During local development, this was not a problem.

    with open(self._json_path, 'w') as f:
        json.dump(test_dict, f, indent=2, ensure_ascii=False)
    answer, wordid, question = self._unpack_dict(test_dict)

Once I deployed to the live web server, I began getting:

'ascii' codec can't encode character '\u90fd' in position 1: ordinal not in range(128)

I know for a fact that the data in test_dict is encoded properly. As soon as the json.dump occurs, it errors. If I open the file that was created, it fails at the very first non-latin character I put into it.

I've been through this post, but couldn't sort out the problem. Adding `, encoding='utf8'` to the `open()` call makes the code above create a file but put nothing in it. Again, I know for a fact that test_dict has data, as the data is displayed on the web page properly.

**Answer to blank file:** while troubleshooting I switched from `dump` to `dumps`, which caused the files to be generated but not filled.

**Answer to main problem:** `encoding='utf8'` is not correct, it is `encoding='utf-8'`.
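The blank-file explanation above comes down to the difference between the two functions: `json.dump` writes into a file object, while `json.dumps` only returns a string. The following is a minimal sketch of that difference, using an in-memory `io.StringIO` buffer instead of a real file and a made-up `test_dict` (both are assumptions for illustration, not the data from the app).

```python
import io
import json

# Hypothetical sample data standing in for the real test_dict.
test_dict = {"question": "我", "answer": "は"}

# json.dump serializes the object AND writes it into the file-like object.
buf = io.StringIO()
json.dump(test_dict, buf, indent=2, ensure_ascii=False)
written_by_dump = buf.getvalue()

# json.dumps only returns the serialized string. Nothing reaches a file
# unless the caller writes the return value explicitly.
empty_buf = io.StringIO()
text = json.dumps(test_dict, indent=2, ensure_ascii=False)
untouched = empty_buf.getvalue()  # still the empty string at this point
empty_buf.write(text)             # the explicit write that dumps needs
```

So a file opened for writing and then passed over by `dumps` is created but stays empty, which matches the symptom described.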

I've also tried rebuilding the virtual environment.

On the server here are some results:

echo $LANG
en_US.UTF-8
python -c "import sys; print(sys.stdout.encoding)"
UTF-8
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Local environment:

echo $LANG
en_US.UTF-8
python -c "import sys; print(sys.stdout.encoding)"
UTF-8
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

The most notable difference is the Python version:

Server: Python 3.5

Local environment: Python 3.7.2

apache2/error.log

[Fri Feb 15 09:49:15.899753 2019] [wsgi:error] [pid 2489] GETTING NEW TEST
[Fri Feb 15 09:49:15.957897 2019] [wsgi:error] [pid 2489] Saving the test dictionary
[Fri Feb 15 09:49:16.006470 2019] [wsgi:error] [pid 2489] Internal Server Error: /test/addohm/
[Fri Feb 15 09:49:16.006506 2019] [wsgi:error] [pid 2489] Traceback (most recent call last):
[Fri Feb 15 09:49:16.006509 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in in$
[Fri Feb 15 09:49:16.006511 2019] [wsgi:error] [pid 2489]     response = get_response(request)
[Fri Feb 15 09:49:16.006514 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_r$
[Fri Feb 15 09:49:16.006517 2019] [wsgi:error] [pid 2489]     response = self.process_exception_by_middleware(e, request)
[Fri Feb 15 09:49:16.006520 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 124, in _get_r$
[Fri Feb 15 09:49:16.006522 2019] [wsgi:error] [pid 2489]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Feb 15 09:49:16.006525 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/views.py", line 125, in test
[Fri Feb 15 09:49:16.006531 2019] [wsgi:error] [pid 2489]     form = TestForm(wordsdict)
[Fri Feb 15 09:49:16.006533 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/forms.py", line 21, in __init__
[Fri Feb 15 09:49:16.006536 2019] [wsgi:error] [pid 2489]     json.dumps(test_dict, f, indent=2, ensure_ascii=False)
[Fri Feb 15 09:49:16.006538 2019] [wsgi:error] [pid 2489]   File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
[Fri Feb 15 09:49:16.006541 2019] [wsgi:error] [pid 2489]     fp.write(chunk)
[Fri Feb 15 09:49:16.006545 2019] [wsgi:error] [pid 2489] UnicodeEncodeError: 'ascii' codec can't encode character '\\u7ea6' in position 1: ordinal not in range(128)
[Fri Feb 15 09:49:16.006550 2019] [wsgi:error] [pid 2489]
[Fri Feb 15 09:51:36.924025 2019] [wsgi:error] [pid 2627] [client 124.9.54.252:55825] Timeout when reading response headers from daemon process 'duotool.addohm.net': /var/www/django/du$
[Fri Feb 15 09:51:43.283083 2019] [wsgi:error] [pid 2631] [client 124.9.54.252:55971] Truncated or oversized response headers received from daemon process 'duotool.addohm.net': /var/ww$
[Fri Feb 15 09:51:43.283324 2019] [wsgi:error] [pid 2630] [client 124.9.54.252:55888] Truncated or oversized response headers received from daemon process 'duotool.addohm.net': /var/ww$
[Fri Feb 15 09:53:18.572055 2019] [wsgi:error] [pid 2489] GETTING NEW TEST
[Fri Feb 15 09:53:18.631474 2019] [wsgi:error] [pid 2489] Saving the test dictionary
[Fri Feb 15 09:53:18.675315 2019] [wsgi:error] [pid 2489] Internal Server Error: /test/addohm/
[Fri Feb 15 09:53:18.675335 2019] [wsgi:error] [pid 2489] Traceback (most recent call last):
[Fri Feb 15 09:53:18.675338 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in in$
[Fri Feb 15 09:53:18.675341 2019] [wsgi:error] [pid 2489]     response = get_response(request)
[Fri Feb 15 09:53:18.675344 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_r$
[Fri Feb 15 09:53:18.675347 2019] [wsgi:error] [pid 2489]     response = self.process_exception_by_middleware(e, request)
[Fri Feb 15 09:53:18.675349 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 124, in _get_r$
[Fri Feb 15 09:53:18.675352 2019] [wsgi:error] [pid 2489]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Feb 15 09:53:18.675354 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/views.py", line 125, in test
[Fri Feb 15 09:53:18.675357 2019] [wsgi:error] [pid 2489]     form = TestForm(wordsdict)
[Fri Feb 15 09:53:18.675359 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/forms.py", line 21, in __init__
[Fri Feb 15 09:53:18.675362 2019] [wsgi:error] [pid 2489]     json.dumps(test_dict, f, indent=2, ensure_ascii=False)
[Fri Feb 15 09:53:18.675365 2019] [wsgi:error] [pid 2489]   File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
[Fri Feb 15 09:53:18.675367 2019] [wsgi:error] [pid 2489]     fp.write(chunk)
[Fri Feb 15 09:53:18.675371 2019] [wsgi:error] [pid 2489] UnicodeEncodeError: 'ascii' codec can't encode character '\\u90fd' in position 1: ordinal not in range(128)
[Fri Feb 15 09:53:18.675376 2019] [wsgi:error] [pid 2489]
[Fri Feb 15 09:53:19.498167 2019] [wsgi:error] [pid 2652] /path/to/website/duotool

I can't see any differences between the two versions that would cause this. Where do I go from here?

addohm

1 Answer

I think you should specify the encoding for the file you are writing to when you open it. This should work:

    import json

    # The context manager closes the file even if the write raises
    with open(self._json_path, 'w', encoding='utf-8') as file:
        file.write(json.dumps(your_json))
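For a self-contained check of this approach, here is a small sketch that writes non-ASCII JSON with an explicit encoding and reads it back. The sample dictionary and the temporary file path are assumptions for illustration; `ensure_ascii=False` is taken from the question's code so the characters are stored as-is rather than as `\uXXXX` escapes.

```python
import json
import os
import tempfile

# Hypothetical sample data with the kinds of characters from the question.
test_dict = {"question": "我", "answer": "ל", "hint": "は"}

# Temporary path used only for this demonstration.
fd, json_path = tempfile.mkstemp(suffix=".json")
os.close(fd)

# Passing encoding explicitly means the result no longer depends on the
# locale of the process that runs the code.
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(test_dict, f, indent=2, ensure_ascii=False)

# Read it back as text to confirm the round trip.
with open(json_path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

# Read the raw bytes to confirm the characters were stored as UTF-8.
with open(json_path, "rb") as f:
    raw = f.read()

os.remove(json_path)
```

If `loaded` equals `test_dict` and the raw bytes contain the UTF-8 form of the characters, the write path is independent of the server's default encoding.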
  • I didn't do exactly this, but close enough. The post mentioned above showed `utf8` as the string to put into the `encoding=` argument. Changed it to `utf-8` and it worked. P.S. The reason that the file didn't have any information was because I changed from `dump` to `dumps` in troubleshooting. – addohm Feb 15 '19 at 11:05
  • In Python, `utf8` is an alias for `utf-8`. I think the problem was elsewhere. – Alastair McCormack Feb 15 '19 at 12:11
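The last comment can be checked directly, and the same snippet can help find where the 'ascii' codec comes from. When `open()` is called without `encoding`, Python falls back to `locale.getpreferredencoding(False)`, and that value is decided by the environment of the running process, which for an Apache/mod_wsgi daemon is not necessarily the same as the login shell where `echo $LANG` was run. That this is the cause here is an assumption, not something the question confirms. The helper name `encoding_report` is hypothetical; a sketch:

```python
import codecs
import locale
import sys


def encoding_report():
    """Collect the settings that decide what open() uses by default."""
    return {
        # Default for open() when no encoding argument is given.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used for file names.
        "filesystem_encoding": sys.getfilesystemencoding(),
        "python_version": sys.version.split()[0],
    }


# Both spellings resolve to the same codec, as the comment states.
same_codec = codecs.lookup("utf8").name == codecs.lookup("utf-8").name

report = encoding_report()
print(report)
```

Running `encoding_report()` from inside a Django view on the server, rather than from a shell, shows the values the web process actually sees. If the preferred encoding there is an ASCII variant, passing `encoding='utf-8'` to `open()` sidesteps it regardless of how the daemon was started.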