14

I have a project written in Django. All fields that store strings are supposed to be in UTF-8. However, when I run

manage.py syncdb

all the corresponding columns are created with the cp1252 character set (I have no idea where it gets that from), and I have to update every column manually.

Is there a way to tell Django to create all those columns with UTF-8 encoding in the first place?

BTW, I use MySQL.

Maxim Sloyko

3 Answers

21

Django does not specify a charset or collation in its CREATE TABLE statements; both are inherited from the database's defaults. Running ALTER DATABASE ... CHARACTER SET utf8 COLLATE utf8_general_ci before syncdb should help.

For the connection, Django issues SET NAMES utf8 automatically, so you don't need to worry about the default connection charset settings.
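As a rough illustration of the ALTER DATABASE step, here is a minimal Python sketch that builds the statement for a given database. The function name `alter_database_charset_sql` and the database name `mydb` are hypothetical, chosen only for this example; the statement itself is the one quoted above.

```python
import re

# Charset and collation names are plain identifiers; anything else is rejected.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def alter_database_charset_sql(db_name, charset="utf8", collation="utf8_general_ci"):
    """Build an ALTER DATABASE statement that sets the default charset and collation.

    Hypothetical helper for illustration; it only builds a string and does not
    connect to MySQL.
    """
    for name in (charset, collation):
        if not _IDENT.match(name):
            raise ValueError("unexpected charset/collation name: %r" % name)
    # Backtick-quote the database name, doubling any embedded backticks.
    quoted = "`" + db_name.replace("`", "``") + "`"
    return "ALTER DATABASE {} CHARACTER SET {} COLLATE {};".format(
        quoted, charset, collation
    )


if __name__ == "__main__":
    # "mydb" is a placeholder database name.
    print(alter_database_charset_sql("mydb"))
```

The printed statement can be pasted into the mysql client before running syncdb, so that newly created tables pick up the database default.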

drdaeman
  • If you are curious, you can see the `CREATE TABLE` SQL statements by running `./manage.py sqlall appname`. I don't have a MySQL server nearby, but I'm sure no charset or collation will be specified, so the collations will be determined by the database settings. – drdaeman Jul 29 '09 at 08:03
  • 1
    Yeah, that seems to do the trick, thank you. That cp1252 encoding seems to be mysqladmin's issue. – Maxim Sloyko Jul 29 '09 at 08:11
  • 3
    This is an old answer, but note that MySQL's `utf8` doesn't fully support Unicode; you should use the `utf8mb4` charset for proper Unicode support: https://mathiasbynens.be/notes/mysql-utf8mb4 – John Carter Oct 01 '18 at 22:26
4

Django's database backends automatically convert Unicode strings into the appropriate encoding when talking to the database. You don't need to tell Django what encoding your database uses; it simply uses your database's encoding.

I don't see any way to tell Django to create a column with a specific encoding. It looks to me like some existing MySQL configuration is affecting you. Instead of fixing every column manually, set the default at the database level with one of these statements:

CREATE DATABASE db_name
    [[DEFAULT] CHARACTER SET charset_name]
    [[DEFAULT] COLLATE collation_name]

ALTER DATABASE db_name
    [[DEFAULT] CHARACTER SET charset_name]
    [[DEFAULT] COLLATE collation_name]
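
Changing the database default affects tables created afterwards; tables that already exist keep their current charset. For those, MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` can be used per table rather than per column. The following is a minimal Python sketch that only generates such statements; the function name `convert_table_statements` and the table names are assumptions made for this example, not part of Django.

```python
import re

# Charset and collation names are plain identifiers; anything else is rejected.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def convert_table_statements(table_names, charset="utf8", collation="utf8_general_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name.

    Hypothetical helper for illustration; it builds strings only and never
    touches a database.
    """
    for name in (charset, collation):
        if not _IDENT.match(name):
            raise ValueError("unexpected charset/collation name: %r" % name)
    statements = []
    for table in table_names:
        # Backtick-quote the table name, doubling any embedded backticks.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            "ALTER TABLE {} CONVERT TO CHARACTER SET {} COLLATE {};".format(
                quoted, charset, collation
            )
        )
    return statements


if __name__ == "__main__":
    # Placeholder table names; substitute the tables of your own apps.
    for statement in convert_table_statements(["auth_user", "myapp_article"]):
        print(statement)
```

Review the generated statements and back up the data before running them, since converting a table rewrites its text columns.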
simplyharsh
2

What is your MySQL encoding set to?

For example, try the following from the command line:

 mysqld --verbose --help | grep character-set

If it doesn't output utf8, then you'll need to set the character set in my.cnf:

[mysqld]
character-set-server=utf8
# older MySQL versions spelled this option default-collation
collation-server=utf8_unicode_ci

[client]
default-character-set=utf8
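
To check what a given my.cnf actually sets, a small Python sketch along these lines may help. It assumes the file is simple enough for the standard `configparser` module to read (directives such as `!include` are not handled), and the names `SAMPLE_MY_CNF` and `server_charset` are made up for this example.

```python
import configparser

# A small sample in my.cnf syntax, used here instead of reading a real file.
SAMPLE_MY_CNF = """
[mysqld]
character-set-server=utf8

[client]
default-character-set=utf8
"""


def server_charset(cnf_text):
    """Return the character-set-server value from the [mysqld] section, or None.

    Hypothetical helper for illustration; it parses text only and does not
    query a running server.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare options such as skip-networking
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # treat % characters literally
    )
    parser.read_string(cnf_text)
    return parser.get("mysqld", "character-set-server", fallback=None)


if __name__ == "__main__":
    print(server_charset(SAMPLE_MY_CNF))
```

A result of `None` means the file does not set the option, in which case the server falls back to its compiled-in default.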

This page has some more information:

ars
  • If I'm not mistaken, Django already does `SET NAMES utf8` on connection. – drdaeman Jul 29 '09 at 08:00
  • It says latin1, so cp1252 was mysqladmin's issue. The settings you suggest seem to affect only the command-line client. – Maxim Sloyko Jul 29 '09 at 08:15
  • No, the [mysqld] section is specifically for the server. I have default-collation in my file, but perhaps that's changed between versions. See the section "Specify character settings at server startup" on this page: http://dev.mysql.com/doc/refman/5.0/en/charset-applications.html – ars Jul 29 '09 at 08:51