I have a keyword
table containing a few millions entries. It is linked by a many to many relationship to an element
table. I would like to get all the element ids matching a keyword. I tried this query and there is no problem, it returns rows in few milliseconds.
SELECT element_id FROM element_keyword
JOIN keyword ON keyword.id = element_keyword.keyword_id
WHERE keyword.value like 'LOREM';
Execution plan
"Nested Loop (cost=278.50..53665.56 rows=65 width=4)"
" -> Index Scan using keyword_value_index on keyword (cost=0.43..8.45 rows=1 width=4)"
" Index Cond: ((value)::text = 'LOREM'::text)"
" Filter: ((value)::text ~~ 'LOREM'::text)"
" -> Bitmap Heap Scan on element_keyword (cost=278.07..53510.66 rows=14645 width=8)"
" Recheck Cond: (keyword_id = keyword.id)"
" -> Bitmap Index Scan on element_keyword_keyword_index (cost=0.00..274.41 rows=14645 width=0)"
" Index Cond: (keyword_id = keyword.id)"
However when i put a wild card at the end of my search string the request becomes really slow. (~60000ms)
SELECT element_id FROM element_keyword
JOIN keyword ON keyword.id = element_keyword.keyword_id
WHERE keyword.value like 'LOREM%';
Execution plan:
"Hash Join (cost=12.20..3733738.08 rows=19502 width=4)"
" Hash Cond: (element_keyword.keyword_id = keyword.id)"
" -> Seq Scan on element_keyword (cost=0.00..3002628.08 rows=194907408 width=8)"
" -> Hash (cost=8.45..8.45 rows=300 width=4)"
" -> Index Scan using keyword_value_index on keyword (cost=0.43..8.45 rows=300 width=4)"
" Index Cond: (((value)::text ~>=~ 'LOREM'::text) AND ((value)::text ~<~ 'LOREN'::text))"
" Filter: ((value)::text ~~ 'LOREM%'::text)"
Even when the wildcard does not give more result, the query is slow.
I created an index on keyword(value) and element_keyword(keyword_id)
CREATE INDEX "keyword_value_index" ON "keyword" (value text_pattern_ops);
CREATE INDEX "element_keyword_keyword_index" ON "element_keyword" (keyword_id);
What's really happening behind ? How could i solve it ?
UPDATE
I'm not sure if this could help but here's a few more tests :
select id from keyword where value like 'LOREM%';
-> 6 ids retrieved in 17ms
select * from element_keyword where keyword_id in (1961746,1961710,2724258,2121442,1633163,1026116);
-> 40 rows retrieved in 17ms
select * from element_keyword where keyword_id in (select id from keyword where value like 'LOREM%');
-> 40 rows in 63221 ms