37

When uploading files with non-ASCII characters I get UnicodeEncodeError:

Exception Type: UnicodeEncodeError at /admin/studio/newsitem/add/
Exception Value: 'ascii' codec can't encode character u'\xf8' in position 78: ordinal not in range(128)

See full stack trace.

I run Django 1.2 with MySQL and nginx and FastCGI.

This is a problem that is fixed according to the Django Trac database, but I still have the problem. Any suggestions on how to fix are welcome.

EDIT: This is my image field:

image = models.ImageField(_('image'), upload_to='uploads/images', max_length=100)
vorpyg
  • Can you give the model/field definition as well? In particular I'm interested in seeing the `upload_to` definition. – Mark Lavin Sep 15 '10 at 14:21
  • Updated with upload_to definition. – vorpyg Sep 16 '10 at 12:13
  • For anyone still landing here, check the Django ticket's last comment by akaihola. He says: "Debian runs Apache with the LANG=C locale by default, which breaks uploading files with special characters in their names at least when running with mod_wsgi. Activating a UTF-8 locale in /etc/apache2/envvars should resolve the issue" The ticket: http://code.djangoproject.com/ticket/6009 – Tuukka Mustonen Jun 21 '11 at 13:34
  • This applies to nginx as well. Check my answer here: http://stackoverflow.com/a/7602446/108763 – vorpyg Dec 30 '11 at 16:16

12 Answers

41

For anyone encountering this problem when running Django with Supervisor, the solution is to add e.g. the following to the supervisord section of Supervisor's configuration:

environment=LANG="en_US.utf8", LC_ALL="en_US.UTF-8", LC_LANG="en_US.UTF-8"

This solved the problem for me in Supervisor 3.0a8 running on Debian Squeeze.

Also make sure Supervisor re-reads the configuration by running:

supervisorctl reread
supervisorctl restart myservice

(thanks @Udi)


For upstart, add in your /etc/init/myservice.conf:

env LANG="en_US.utf8"
env LC_ALL="en_US.UTF-8"
env LC_LANG="en_US.UTF-8"

(thanks @Andrii Zarubin; see Environment Variables in Upstart documentation for more information)
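
To confirm that the new environment actually reached the Python process, a small diagnostic can be run inside it, for example from a temporary view or a shell started the same way as the service. This is only a sketch: the helper name `describe_encoding_environment` is made up for illustration, and it relies solely on standard library calls.

```python
import locale
import os
import sys


def describe_encoding_environment(environ=None):
    """Collect the settings that decide how file names are encoded."""
    if environ is None:
        environ = os.environ
    return {
        # Encoding Python uses to turn text file names into bytes.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding derived from the process locale (without changing it).
        "preferred_encoding": locale.getpreferredencoding(False),
        # Raw variables as the process sees them (None if unset).
        "LANG": environ.get("LANG"),
        "LC_ALL": environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_encoding_environment().items()):
        print("%s = %s" % (key, value))
```

On Python 2, a `filesystem_encoding` such as `ANSI_X3.4-1968` or `ascii` suggests the process is still running under the C/POSIX locale, so the configuration change has not taken effect yet.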

akaihola
  • Make sure you run /etc/init.d/supervisor stop and /etc/init.d/supervisor start for the change to take effect. Just restarting won't work. – amjoconn Jul 16 '12 at 13:32
  • If you get this error _Unexpected end of key/value pairs_, you will need to quote the values. e.g. environment=LANG='en_US.utf8'. https://lists.supervisord.org/pipermail/supervisor-users/2010-March/000539.html – amos Nov 15 '13 at 09:43
  • You can force reading of configuration files with `supervisorctl reread` and `supervisorctl restart myservice` instead of stopping and starting the whole daemon. – Udi May 13 '14 at 08:39
  • Why doesn't it use the default system locale settings? – Denis Nikanorov Aug 10 '15 at 15:30
  • Sadly I can't upvote this answer enough, you really saved my Sunday, thanks! – lithiium Aug 23 '15 at 17:19
  • It works with this solution after reloading Supervisor. – runforever Mar 30 '16 at 14:47
  • If you are using upstart, you must add `env LANG="en_US.utf8" env LC_ALL="en_US.UTF-8" env LC_LANG="en_US.UTF-8"` – Andrii Zarubin May 30 '16 at 06:43
  • Fixed the issue when supervisor was running another python program as well (not Django) – taari Jan 10 '17 at 04:39
  • You saved my day mister @akaihola. Thank you very much. – Alex Sep 29 '17 at 18:04
24

In situations where you must display a unicode string in a place that only accepts ASCII (like the console or a path), you must tell Python how to handle the non-ASCII characters, for example by dropping them:

>>> problem_str = u'This is not all ascii\xf8 man'
>>> safe_str = problem_str.encode('ascii', 'ignore')
>>> safe_str
'This is not all ascii man'

Encoding issues are prevented in the admin by the careful handling in Django's templating, but if you have ever added custom columns and forgotten to convert the values to ASCII, or you override the __str__ method for a model and forget to do this, you will get the same error, which prevents the template from rendering.

If this string were saved into your (hopefully UTF-8) database there would be no problem. It looks like you are trying to upload a file that uses the title of an entity that has a non-ASCII character.
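
As a rough illustration of the trade-off, the sketch below compares three standard error handlers accepted by `encode`. The helper `to_ascii` is a made-up name; `'ignore'`, `'replace'` and `'backslashreplace'` are built-in handlers.

```python
problem_str = u"This is not all ascii\xf8 man"


def to_ascii(text, errors):
    """Encode to ASCII with the given error handler, then return text again."""
    return text.encode("ascii", errors).decode("ascii")


# The default ('strict') handler is the one that raises the error in the question.
try:
    problem_str.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

dropped = to_ascii(problem_str, "ignore")             # character silently removed
replaced = to_ascii(problem_str, "replace")           # character becomes '?'
escaped = to_ascii(problem_str, "backslashreplace")   # character becomes \xf8 as text
```

All three lose or alter information, so they suit display or logging better than stored file names.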

Lincoln B
  • Thanks! I stumbled upon this answer after fruitless search for a simple question: how do I send an email with non-Latin characters in Python? Your solution works! – skanatek Mar 26 '13 at 20:30
14

Hope this helps. In my case, I'm running Django through daemontools.

Setting

export LANG='en_US.UTF-8'
export LC_ALL='en_US.UTF-8'

in the run script before executing manage.py resolved the issue with uploaded file names.
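
Which variable wins matters when several are set. The sketch below mirrors the POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) so a run script's environment can be sanity-checked. Both function names are hypothetical helpers, not part of any library.

```python
import os


def effective_locale(environ=None):
    """Return the locale string that applies to character handling.

    Precedence follows POSIX: LC_ALL, then LC_CTYPE, then LANG.
    Empty values count as unset.
    """
    if environ is None:
        environ = os.environ
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name)
        if value:
            return value
    return "C"  # POSIX default when nothing is set


def is_utf8_locale(value):
    """True for values such as en_US.UTF-8 or en_US.utf8."""
    codeset = value.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"
```

For example, `is_utf8_locale(effective_locale())` returning False inside the service would point at the environment rather than at Django.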

12

After investigating this some more I found out that I hadn't set the charset in my main Nginx config file:

http {
  charset  utf-8;
}

By adding the above, the problem disappeared and I think that this is the correct way of handling this issue.

vorpyg
  • This could only work if nginx is directly running the backend code. Assuming it is a proxy for something like gunicorn or uwsgi, you will have to configure the WSGI server's environment to use UTF-8. Adding this to your Nginx config doesn't hurt, but it likely won't solve your problem. – amjoconn Jul 16 '12 at 18:04
  • As already mentioned by @amjoconn, in my case the problem was solved by adding "env = LC_ALL=ru_RU.UTF-8" to my uwsgi config file – Vasiliy Toporov Aug 13 '14 at 12:20
11

akaihola's answer was helpful. For those who run a Django app with uWSGI managed via an upstart script, just add these lines to your /etc/init/yourapp.conf:

env LANG="en_US.utf8"
env LC_ALL="en_US.UTF-8"
env LC_LANG="en_US.UTF-8"

It solved the problem for me.

PawelRoman
  • Thanks! This is the way that solved my problem! `env LANG="en_US.UTF-8" env LC_ALL="en_US.UTF-8" env LC_LANG="en_US.UTF-8"`. Note that it is `env`, not `export`. This is the syntax to use in an Upstart job file (/etc/init/xxx.conf). This error has cost me hours. – moonkey Jun 17 '15 at 05:02
4

Another useful option that avoids rewriting code is to change the default encoding for python.

If you're using virtualenv, you can change env/lib/python2.7/sitecustomize.py (or create it if it doesn't exist) and add:

import sys
sys.setdefaultencoding('utf-8')

Or, on a production system, you can do the same in /usr/lib/python2.7/sitecustomize.py
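
Note that `sys.setdefaultencoding` exists only on Python 2, and only while the `site` module is still initialising, which is why the call has to live in sitecustomize.py. A guarded variant, sketched here with a hypothetical helper name, avoids an AttributeError if the same file is ever loaded by an interpreter that lacks the function:

```python
import sys


def apply_default_encoding(encoding="utf-8"):
    """Set the default encoding if this interpreter still allows it."""
    setter = getattr(sys, "setdefaultencoding", None)
    if setter is None:
        # Python 3, or Python 2 after site.py has removed the function.
        return False
    setter(encoding)
    return True


changed = apply_default_encoding()
current = sys.getdefaultencoding()
```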

Enric Mieza
4

It's hard to say without seeing a little more code but it looks to be related to this question: UnicodeDecodeError on attempt to save file through django default filebased backend.

Looking through the Django ticket mentioned it would seem you should follow something similar to the deployment docs on "If you get a UnicodeEncodeError":
https://docs.djangoproject.com/en/1.4/howto/deployment/modpython/#if-you-get-a-unicodeencodeerror

(I know this is for Apache/mod_python but my guess would be it's the same root issue of file system encoding not being UTF-8 and there is a similar fix when using nginx)

EDIT: From what I can tell this nginx module would be the equivalent fix: http://wiki.nginx.org/NginxHttpCharsetModule

Community
Mark Lavin
  • I suspect it could have something to do with this. I tried adding an u in front of the string, as described here: http://stackoverflow.com/questions/2457087/unicodedecodeerror-on-attempt-to-save-file-through-django-default-filebased-backe/2458200#2458200 without luck. Have you got a link to the nginx fix? – vorpyg Sep 16 '10 at 12:18
  • Thanks, still not working, though. I've tried setting the locale, as indicated in the Django docs, and also tried adding charset utf8 to my nginx config. Maybe I'll just have to rewrite the save method to rename the file first… – vorpyg Sep 16 '10 at 19:32
  • The link to doc is dead. – Igor Medeiros Oct 31 '13 at 23:27
4

As said before, it is related to the locale. For example, if you use gunicorn to serve your Django application, you may have an init.d script (or, like me, a runit script) where you can set the locale.

To solve the UnicodeEncodeError on file upload, put something like export LC_ALL=en_US.UTF8 in the script that runs your app.

For example, this is mine (using gunicorn and runit):

#!/bin/bash
export LC_ALL=en_US.UTF8
cd /path/to/app/projectname
exec gunicorn_django -b localhost:8000 --workers=2

Also, you can check your locale from a template by adding this to your view:

import locale
data_to_tpl = {'loc': locale.getlocale(), 'loc_def': locale.getdefaultlocale()}

Then just display {{loc}} - {{loc_def}} in your template.

This gives you more information about your locale settings. That was very useful for me.

ManuPK
Exirel
3

If you're using Django and Python 2.7, this fixed it for me:

from django.utils.encoding import python_2_unicode_compatible

@python_2_unicode_compatible
class Utente(models.Model):

see https://docs.djangoproject.com/en/dev/ref/utils/#django.utils.encoding.python_2_unicode_compatible

max4ever
3

Using python 2.7.8 and Django 1.7, I solved my problem by importing:

from __future__ import unicode_literals

and using force_text():

from django.utils.encoding import force_text
daveoncode
3

Just building on answers from this thread and others...

I had the same issue with genericpath.py giving a UnicodeEncodeError when attempting to upload a file name with non ASCII characters.

I was using nginx, uwsgi and django with python 2.7.

Everything was working OK locally but not on the server

Here are the steps I took:

  1. Added this to /etc/nginx/nginx.conf (did not fix the problem)

    http {
        charset utf-8;
    }

  2. Added this line to /etc/default/locale (did not fix the problem)

    LANGUAGE="en_US.UTF-8"

  3. Followed the instructions listed under the heading 'Success' at https://code.djangoproject.com/wiki/ExpectedTestFailures (did not fix the problem)

    aptitude install language-pack-en-base

  4. Came across this ticket, https://code.djangoproject.com/ticket/17816, which suggested testing a view on the server to see what was happening with the locale information

In your view

import locale
locales = "Current locale: %s %s -- Default locale: %s %s" % (locale.getlocale() + locale.getdefaultlocale())

In your template

{{ locales }}

For me, the issue was that I had no locale and no default locale on my Ubuntu server (though I did have them on my local OS X dev machine). In that situation, files with non-ASCII names or paths do not upload correctly and Python raises a UnicodeEncodeError, but only on the production server.
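
The mechanism can be imitated without touching the disk. The sketch below is a simplification (the real conversion happens inside Python's file-system calls, and `encode_filename` is a made-up stand-in for it), but it shows why the same file name works under a UTF-8 locale and fails under an ASCII one:

```python
def encode_filename(name, filesystem_encoding):
    """Mimic the text-to-bytes step performed before a file-system call."""
    return name.encode(filesystem_encoding)


upload_name = u"bl\xe5b\xe6r.jpg"

# With a UTF-8 locale the name converts cleanly.
utf8_bytes = encode_filename(upload_name, "utf-8")

# With no locale configured the encoding falls back to ASCII and fails.
try:
    encode_filename(upload_name, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```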

Solution

I added this to both my site and my site admin uwsgi config files e.g. /etc/uwsgi-emperor/vassals/my-site-config-ini file

env = LANG=en_US.utf8
lukeaus
0

None of the answers worked for me (using Apache on Ubuntu with Django 1.10); I chose to remove accents from the file name (normalize) as below:

import unicodedata

def remove_accents(value):
    # Decompose accented characters, then drop the combining marks.
    nfkd_form = unicodedata.normalize('NFKD', str(value))
    return "".join([c for c in nfkd_form if not unicodedata.combining(c)])

uploaded_file = self.cleaned_data['data']

# We need to remove accents to get rid of "UnicodeEncodeError: 'ascii' codec can't encode character" on Ubuntu
uploaded_file.name = remove_accents(uploaded_file.name)
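
One caveat: NFKD normalization only helps for characters that decompose into a base letter plus combining marks. Letters such as ø (the `\xf8` from the question) have no such decomposition and pass through unchanged. The sketch below repeats the function in a self-contained form (assuming Python 3 text strings) and adds a hypothetical `ascii_only` fallback that drops whatever is left:

```python
import unicodedata


def remove_accents(value):
    """Strip combining marks after NFKD decomposition."""
    nfkd_form = unicodedata.normalize("NFKD", value)
    return "".join(c for c in nfkd_form if not unicodedata.combining(c))


def ascii_only(value):
    """Fallback: also drop characters that have no ASCII base letter."""
    return remove_accents(value).encode("ascii", "ignore").decode("ascii")


accent_removed = remove_accents(u"caf\xe9.png")    # e-acute becomes plain e
stroke_kept = remove_accents(u"bj\xf8rn.png")      # o-with-stroke is unchanged
stroke_dropped = ascii_only(u"bj\xf8rn.png")       # o-with-stroke is removed
```

Dropping characters can make two different uploads collide on the same name, so this is a last resort compared with fixing the locale.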
SaeX