When I query the table using
trigram_similar
orcontains
the index is used, as expected.When I query the same table using
icontains
, Django performs a sequential scan usingUPPER
.
The difference is 3ms vs 240ms.
Should I create a lowercase index and match with
contains
? (How could this be done?)Should I create a field where all the contents will be lower cased and index that field?
Something else?
The model:
class Name(models.Model):
name_en = models.CharField(max_length=127)
...
class Meta:
indexes = [
GinIndex(
name="name_en_gin_trigram",
fields=["name_en"],
opclasses=["gin_trgm_ops"],
)
]
The query that uses the index:
>>> Name.objects.filter( Q(name_en__contains='eeth') | Q(name_en__trigram_similar='eeth') ) SELECT * FROM "shop_name" WHERE ("shop_name"."name_en"::text LIKE '%eeth%' OR "shop_name"."name_en" % 'eeth') LIMIT 21;
The resulting query plan:
Limit (cost=64.06..90.08 rows=7 width=121) (actual time=0.447..2.456 rows=14 loops=1) -> Bitmap Heap Scan on shop_name (cost=64.06..90.08 rows=7 width=121) (actual time=0.443..2.411 rows=14 loops=1) Recheck Cond: (((name_en)::text ~~ '%eeth%'::text) OR ((name_en)::text % 'eeth'::text)) Rows Removed by Index Recheck: 236 Heap Blocks: exact=206 -> BitmapOr (cost=64.06..64.06 rows=7 width=0) (actual time=0.371..0.378 rows=0 loops=1) -> Bitmap Index Scan on name_en_gin_trigram (cost=0.00..20.03 rows=4 width=0) (actual time=0.048..0.049 rows=15 loops=1) Index Cond: ((name_en)::text ~~ '%eeth%'::text) -> Bitmap Index Scan on name_en_gin_trigram (cost=0.00..44.03 rows=4 width=0) (actual time=0.318..0.320 rows=250 loops=1) Index Cond: ((name_en)::text % 'eeth'::text) Planning Time: 0.793 ms Execution Time: 2.531 ms (12 rows)
If I use
icontains
the index is not used:>>> Name.objects.filter( Q(name_en__icontains='eeth') | Q(name_en__trigram_similar='eeth') ) SELECT * FROM "shop_name" WHERE (UPPER("shop_name"."name_en"::text) LIKE UPPER('%eeth%') OR "shop_name"."name_en" % 'eeth') LIMIT 21;
The resulting query plan:
Limit (cost=0.00..95.61 rows=21 width=121) (actual time=10.513..244.244 rows=14 loops=1) -> Seq Scan on shop_name (cost=0.00..1356.79 rows=298 width=121) (actual time=10.509..244.195 rows=14 loops=1) Filter: ((upper((name_en)::text) ~~ '%EETH%'::text) OR ((name_en)::text % 'eeth'::text)) Rows Removed by Filter: 36774 Planning Time: 0.740 ms Execution Time: 244.299 ms (6 rows)