7

Is it possible to make query with a collation different from database table have?

iskorum
  • 1,137
  • 1
  • 16
  • 27
  • Do you want use your django model on an existing database with a different collation or do you want a query beyond django's model framework? – tjati Apr 03 '14 at 14:46
  • Hmmm, using with a different collation seems work fine but maybe in future i would need to make query with collation database have. i prefer to use different collation just with that query if it is possible. with 'beyond django's model framework' do you mean raw sql query? – iskorum Apr 03 '14 at 14:58
  • yeah, if you want to work with a database which is not affiliated to django's databases. – tjati Apr 03 '14 at 15:01
  • No no, it is the database django created if i you mean that? – iskorum Apr 03 '14 at 15:05
  • Okay, when Django's is the creator of the database, why do you need a different collation? – tjati Apr 03 '14 at 15:06
  • 1
    i have a column called `name` which is unique. in my language character `ö` differ to `o` so it changed to a different collation from `utf8_general_ci`. But know i want to make search query with `ö` and `o` as same. – iskorum Apr 03 '14 at 15:10
  • This is a different question, isn't it? :) – tjati Apr 03 '14 at 15:11
  • 1
    why? i want make a query with a collation `utf8_general_ci` which database column collation is different. You mean i need to solve this problem in a different way from a custom collation query? – iskorum Apr 03 '14 at 15:13

4 Answers4

13

Using extra() is a little messy. Something similar can now be achieved with Func() expression (since Django 1.8):

username_ci = Func(
    'username',
    function='utf8_general_ci',
    template='(%(expressions)s) COLLATE "%(function)s"')

This can be used in annotate():

User.objects.annotate(uname_ci=username_ci).filter(uname_ci='joeblow').exists()

Or in order_by() to override default collation rules when sorting:

User.objects.order_by(username_ci)

Now, it still may seem messy, but if you look at the docs and code of Func(), you will discover that it is very easy to subclass it and make a reusable collation setter.

I used this trick with Postgres database.

Piotr Ćwiek
  • 1,580
  • 13
  • 19
3

Here is how you can use a specific collation instead of the default collation for a given table/column. I'm assuming you always want that to be the case insensitive utf8_general_ci, but you can easily change that in the code or add it as a variable.

Note the use of the params kwarg instead of the db literal function. Params exists for the exact same purpose.

def iexact(**kw):
    fields = [['%s=%%s collate utf8_general_ci'%field,value] for (field,value) in kw.items()]
    return dict(where=[f[0] for f in fields], params=[f[1] for f in fields])

if User.objects.extra(**iexact(username='joeblow')).exists():
    status = "Found a user with this username!"
  • 1
    Ohh, When you focus on something, you don't think of other options. I read django `where` usage documentation nearly 3 times but I couldn't realize this, that is embarrassing :) This is the answer. Thank you :) – iskorum Apr 24 '14 at 12:16
1

I solve this using bit of a hack;

Django's extra method is just like raw method, they both using the query statetment directly;

MyModel.objects.extra(where=["name LIKE '%%" + name + "%%' COLLATE utf8_general_ci"])

But like this sql injection is possible. We need to escape name variable. I searched a lot for a function which just escapes a string for db. Found one in MySQL-python package but it can't escape unicode strings. Also package has literal method in connection but to use it we need an instance (maybe it is for db characteristic).

At last I used Django's db.connection.cursor.

from django.db import connection
cursor = connection.cursor()
name = cursor.db.connection.literal(name)[1:-1]  # [1:-1] excluding quotes

With this way we also need an instance but I suppose this not require a db connection. And I suppose this method db independent. If I am wrong please correct me.

iskorum
  • 1,137
  • 1
  • 16
  • 27
  • **FYI**: using `.extra()` with parameters this way is _unsafe_! more on that topic: https://blog.r2c.dev/2020/preventing-sql-injection-a-django-authors-perspective/ – Florian May 20 '20 at 10:00
1

This above solution works. In case of getting the reverse order the following snippet

sort_value = sort.strip()
if sort_value in ['name', '-name']:
    sort = Func('name', function='C', template='(%(expressions)s) COLLATE "%(function)s"')
if sort_value in ['-name']:
    f_res = queryset.order_by(sort).reverse()
else:
    f_res = queryset.order_by(sort)
return f_res
Vijay Anand Pandian
  • 1,027
  • 11
  • 23