12

Hi I find that on postgres database, we can't configure default accent sensivity (on old mail exchanges).

Is there a way to have a _icontains also insensitive to special caracters (é, è, à, ç, ï) or I must use postgres regex to replace both side with _iregex (ç->c, é->e ...)?

edit: this question is old, and is kept for users of django before 1.8. For those using latest django versions, here the new way: https://docs.djangoproject.com/en/dev/ref/contrib/postgres/lookups/#std:fieldlookup-unaccent

christophe31
  • 6,359
  • 4
  • 34
  • 46

6 Answers6

12

EDIT: Django 1.8 makes accent unsensitive lookup for postgresql builtin. https://docs.djangoproject.com/en/dev/ref/contrib/postgres/lookups/#std:fieldlookup-unaccent

In fact in postgres contrib (8.4+) there is an unaccent function to search easily:

for postgres 9/8.5:

for postgres 8.4:

here an example of usage from django:

vals = MyObject.objects.raw(
        "SELECT * \
         FROM myapp_myobject \
         WHERE unaccent(name) LIKE \'%"+search_text+"%'")

You may apply apply unaccent on text-search before comparison.

Option I made is:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# parts of credits comes to clarisys.fr
from django.db.backends.postgresql_psycopg2.base import *

class DatabaseOperations(DatabaseOperations):
    def lookup_cast(self, lookup_type):
        if lookup_type in('icontains', 'istartswith'):
            return "UPPER(unaccent(%s::text))"
        else:
            return super(DatabaseOperations, self).lookup_cast(lookup_type)

class DatabaseWrapper(DatabaseWrapper):
    def __init__(self, *args, **kwargs):
        super(DatabaseWrapper, self).__init__(*args, **kwargs)
        self.operators['icontains'] = 'LIKE UPPER(unaccent(%s))'
        self.operators['istartswith'] = 'LIKE UPPER(unaccent(%s))'
        self.ops = DatabaseOperations(self)

Use this file base.py in a folder and use this folder as db backend. icontains and istartswith are now case and accent insensitive.

christophe31
  • 6,359
  • 4
  • 34
  • 46
  • 1
    How to use the folder as a db backend? Should I just put it on usr/local/lib/python2.7/dist-packages/django/db/backends directory or I must declare it in some way?? – Falcoa Feb 19 '15 at 14:36
  • 1
    if you don't use django 1.8, you can simply use python dotted path in your database backend DATABASES = {"default":{ "ENGINE":"my_app.my_backend_module" [...]}} – christophe31 Mar 04 '15 at 13:44
  • i've add the python dotted path in database backend but got an ImproperlyConfigured error: raise ImproperlyConfigured(error_msg) django.core.exceptions.ImproperlyConfigured: 'myAPP.backends.base' isn't an available database backend. Try using 'django.db.backends.XXX', where XXX is one of: u'mysql', u'oracle', u'postgresql_psycopg2', u'sqlite3' Error was: No module named base Do you know what should I do to solve this error? – Falcoa Mar 07 '15 at 22:46
7

I managed to install unaccent from postgresql contrib, but this answer that patches django didn't work. load_backend on django.db.utils enforces that the backend name starts with django.db.backends.

The solution that worked for me was inserting this code in one of my modules:

from django.db.backends.postgresql_psycopg2.base import DatabaseOperations, DatabaseWrapper

def lookup_cast(self, lookup_type, internal_type=None):
    if lookup_type in('icontains', 'istartswith'):
        return "UPPER(unaccent(%s::text))"
    else:
        return super(DatabaseOperations, self).lookup_cast(lookup_type, internal_type)

def patch_unaccent():
    DatabaseOperations.lookup_cast = lookup_cast
    DatabaseWrapper.operators['icontains'] = 'LIKE UPPER(unaccent(%s))'
    DatabaseWrapper.operators['istartswith'] = 'LIKE UPPER(unaccent(%s))'
    print 'Unaccent patch'

patch_unaccent()

Now unaccent searches are working fine, even inside django admin! Thanks for your answer above!

bbrik
  • 2,936
  • 1
  • 19
  • 7
  • 1
    weird, my solution have some specificity. The file have to be named base.py, the settings.py must use his parent folder as backend, and you must not replace the wildcard by used class in this file. (it's kind of module inheritance.) – christophe31 Sep 15 '11 at 13:19
  • But your solution is great. Mine just allow to set the behaviour at the settings.py level. (Yours may too if you put this patch in an isolated app models.py.) – christophe31 Sep 15 '11 at 13:25
  • 1
    That's really wierd, cause I did named the file base.py and used it's parent folder as the backend, but got an ImproperlyConfigured error. Anyway, thanks for pointing me in the right direction. Unnacented searches are something my clients really need. – bbrik Sep 16 '11 at 18:05
  • 1
    Thanks! this solution worked for me adding it at the end of settings.py – Bruno Rocha - rochacbruno Sep 11 '13 at 18:59
  • I've also inserted this code at the end of settings.py and it worked for me. However, later I add a validator into my models.py file and this validator just work when I remove this piece of code from settings.py. Besides that I've noticed that the python print command doesn't work when I have this code in settings.py. Any suggestion about this? – Falcoa Mar 07 '15 at 22:21
  • I used the solution below and my problems were solved. – Falcoa Mar 08 '15 at 00:37
2

I don't believe you'll be able to use the standard Django field-lookups for this unless you store a non-accented version of your text in another column and do the lookup there. You could add a duplicate column with editable=False and override the model's save() method to update that field from the original accented text.

Python: Remove accents from unicode

PostgreSQL Wiki: Strip accents from strings, and output in lowercase

Community
  • 1
  • 1
shadfc
  • 6,104
  • 3
  • 25
  • 19
  • Sad answer, I hope to find a better way to do than add a column for each searchable text field in my db. I don't see exactly how bu I'm almost sure regex can do the job. – christophe31 Apr 11 '11 at 13:24
2

Answer from @SaeX in another thread: How can I activate the unaccent extension on an already existing model


A migration file needs to be manually made and applied.

  1. First, create an empty migration:

./manage.py makemigrations myapp --empty

  1. Then open the file and add UnaccentExtension to operations:
from django.contrib.postgres.operations import UnaccentExtension


class Migration(migrations.Migration):

    dependencies = [
        (<snip>)
    ]

    operations = [
        UnaccentExtension()
    ]
  1. Now apply the migration using ./manage.py migrate.

If you'd get following error during that last step:

django.db.utils.ProgrammingError: permission denied to create extension "unaccent"
HINT:  Must be superuser to create this extension.

... then temporarily allow superuser rights to your user by performing postgres# ALTER ROLE <user_name> SUPERUSER; and its NOSUPERUSER counterpart. pgAdminIII can do this, too.

Now enjoy the unaccent functionality using Django:

>>> Person.objects.filter(first_name__unaccent=u"Helène")
[<Person: Michels Hélène>]

Again, this answer belongs to @SaeX

But for me his answer still didn't work, so don't forget to add the line django.contrib.postgresin INSTALLED_APPS (settings.py)

1

I just released (a few days ago) the django-unaccent library that add operators to the django ORM for unaccent search. It monkeypatch the django ORM and uses the unaccent() function of postgres to do so.

Please, check this out => https://github.com/djcoin/django-unaccent

Coding Mash
  • 3,338
  • 5
  • 24
  • 45
djcoin
  • 86
  • 4
0

I'm working on an unaccent lookup field for django and postgreSQL. It's on github: https://github.com/marianobianchi/django-accent-free-lookup

It's working fine for now, but it still need a lot of work. I'm using it and it doesn't show any problems for now.

The way of using it is to make a new Manager for the model you want to have unaccents searches (look at the example stored at the end of managers.py file at the project).

The lookups I have already implement are:

"__aexact"

"__aiexact"

"__acontains"

"__aicontains"

They are equivalent to the common field lookups that come with django:

"__exact"

"__iexact"

"__contains"

"__icontains"

with the difference that they are "accent insensitive" for the most common accented characters.

Community
  • 1
  • 1
marianobianchi
  • 8,238
  • 1
  • 20
  • 24
  • I find my way more interesting because if anybody use admin search it will be unaccented, even more, if you want to switch to mysql, you'll loose accent insensitivity but evrything else will work... – christophe31 May 22 '12 at 12:15
  • You are right, this is just another posibility, probably not the best for everyone. I didn't test it with mysql yet, but if mysql support django regex lookup, this app should work as fine as with postgreSQL. – marianobianchi May 22 '12 at 14:14
  • regexlookup use sql regex language or python regex language for sqlite so, if regex syntax you use is synonim for postgres and mysql it should work but it will probably not work with sqlite. – christophe31 May 23 '12 at 07:18
  • Dead link. No longer relevant anyway. – SaeX Jul 13 '15 at 19:04