2

I put this in settings.py:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'HOST': 'localhost',
        'NAME': 'db',
        'USER': 'root',
        'PASSWORD': '',
        'OPTIONS': {
            'charset': 'utf8mb4',
            'init_command': 'set collation_connection=utf8mb4_unicode_ci',
        },
    },
}

Then I used the shell to check that it worked:

$ ./manage.py shell
>>> from django.db import connection
>>> cursor = connection.cursor()
>>> cursor.execute("show variables like 'collation_connection'")
>>> print cursor.fetchall()
((u'collation_connection', u'utf8mb4_general_ci'),)

Unfortunately, what I've learned from inspecting my query log is that MySQLdb does this when it connects:

set collation_connection='utf8mb4_unicode_ci'
SET NAMES utf8mb4

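That's right: it executes SET NAMES after my init command. SET NAMES without a COLLATE clause resets collation_connection to the character set's default, which is the utf8mb4_general_ci value the shell reported above.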

Omitting the 'charset' option doesn't help: MySQLdb then calls set names utf8 instead, which is even worse. I also tried making the set names command part of my 'init_command', hoping the later call wouldn't clobber my collation if there was nothing to change, but no, it still clobbers it.
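To make the ordering problem concrete, here is a toy model of the session, not real MySQLdb code. It tracks only collation_connection and replays statements in the order they appear in the query log. The FakeSession class, the replay helper, and the DEFAULT_COLLATION table are hypothetical names made up for this sketch, and the defaults assume MySQL 5.x behaviour, which matches the utf8mb4_general_ci result above.

```python
# Toy model of a MySQL session; no database is involved.
# Assumption: default collations as in MySQL 5.x (matches the shell output above).
DEFAULT_COLLATION = {
    "utf8": "utf8_general_ci",
    "utf8mb4": "utf8mb4_general_ci",
}


class FakeSession:
    """Tracks only the collation_connection session variable."""

    def __init__(self):
        self.collation_connection = None

    def execute(self, statement):
        # Crude tokenizer: good enough for the two statement shapes modelled here.
        words = statement.replace("=", " ").replace("'", "").lower().split()
        if words[:2] == ["set", "names"]:
            if "collate" in words:
                # SET NAMES x COLLATE y keeps the requested collation.
                self.collation_connection = words[words.index("collate") + 1]
            else:
                # Plain SET NAMES x falls back to the charset's default collation.
                self.collation_connection = DEFAULT_COLLATION[words[2]]
        elif words[:2] == ["set", "collation_connection"]:
            self.collation_connection = words[2]
        else:
            raise ValueError("statement not modelled: %r" % (statement,))


def replay(statements):
    """Run the statements in order and return the final collation."""
    session = FakeSession()
    for statement in statements:
        session.execute(statement)
    return session.collation_connection


# Order seen in the query log: init_command first, then the driver's SET NAMES.
logged_order = [
    "set collation_connection='utf8mb4_unicode_ci'",
    "SET NAMES utf8mb4",
]
print(replay(logged_order))
```

In this model the last statement always wins, so any collation set from init_command is lost as long as the driver issues its own SET NAMES afterwards.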

I can't fork the Python library MySQLdb because I'm running my app on Google App Engine and MySQLdb is part of App Engine.

Nick Retallack
  • Oh, the list of how "3rd party software gets in the way" grows! – Rick James Apr 22 '17 at 19:01
  • Please provide a situation where the wrong value for `collation_connection` causes you trouble. – Rick James Apr 22 '17 at 19:05
  • 1
    `collation_connection` only matters in operations that don't touch database columns. However, it's annoying that there would be an inconsistency there. I like to remove weird edge cases if possible. The collation affects a wide variety of things, such as the behavior of any string comparison including equality (=) and the sorting order. – Nick Retallack Apr 29 '17 at 00:44

2 Answers

3

I was able to accomplish this by monkey-patching Django:

from django.db.backends.mysql import base

# Keep a reference to the original method so the wrapper can delegate to it.
old_get_new_connection = base.DatabaseWrapper.get_new_connection


def get_new_connection(self, conn_params):
    conn = old_get_new_connection(self, conn_params)
    # Runs after MySQLdb's own SET NAMES, so this collation is the one that sticks.
    conn.query("set names 'utf8mb4' collate 'utf8mb4_unicode_520_ci'")
    return conn


base.DatabaseWrapper.get_new_connection = get_new_connection

Btw, this patch was necessary to even use the utf8mb4 character set on App Engine, because App Engine uses an old version of MySQL Connector/C that doesn't support that character set. Without utf8mb4, your app will crash if you attempt to insert any four-byte UTF-8 characters, and any existing ones in the database will show up as question marks.
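If you want the character set and collation to be configurable rather than hard-coded in the patch, something like the following might work. It is only a sketch: build_set_names and connection_collation are hypothetical helpers, not part of Django or MySQLdb, and the name check is a deliberately strict assumption (letters, digits, and underscores only) so that a value from settings cannot smuggle extra SQL into the statement.

```python
import re

# Assumption: charset and collation names contain only these characters.
_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def build_set_names(charset, collation):
    """Return a SET NAMES ... COLLATE ... statement after validating both names."""
    for name in (charset, collation):
        if not _NAME.match(name):
            raise ValueError("unexpected characters in %r" % (name,))
    return "SET NAMES '%s' COLLATE '%s'" % (charset, collation)


def connection_collation(cursor):
    """Read collation_connection through any DB-API style cursor."""
    cursor.execute("SHOW VARIABLES LIKE 'collation_connection'")
    row = cursor.fetchone()
    # The row is a (variable_name, value) pair, as in the shell session above.
    return row[1] if row else None
```

Inside the patched get_new_connection you could then call conn.query(build_set_names('utf8mb4', 'utf8mb4_unicode_520_ci')), and from ./manage.py shell you could pass connection.cursor() to connection_collation to confirm the result.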

Nick Retallack
-1

I need to set the default collation for MySQL tables with Django 3.*. I'm using mysqlclient, and my settings are:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': get_env_val('MYSQL_DB_NAME'),
        'USER': get_env_val('MYSQL_DB_USER'),
        'PASSWORD': get_env_val('MYSQL_DB_PASSWORD'),
        'HOST': get_env_val('MYSQL_DB_HOST'),
        'PORT': get_env_val('MYSQL_DB_PORT'),
        'OPTIONS': {
            'charset': 'utf8mb4',
        },
    },
}