For searching on specific keys within a jsonb
column, I wanted to create an index on the column.
Using: Postgres 10.2
Ignoring some unrelated columns, I have the table animals
with these columns (omitting some unrelated columns):
animalid PK number
location (text)
type (text)
name (text)
data (jsonb) for eg: {"age": 2, "tagid": 11 }
I would need to search based on: location
, type
and tagId
. Like:
where location = ? and type = 'cat' and (data ->> 'tagid') = ?
Other important points:
- Only animals of type cats will be having a tag id, this is a new animal type being added now.
- The number of "cats" in the whole table would be less compared to the other types of animals.
- The table is huge, with millions of rows - and partitioned.
How do I make sure that the search is fast? Options I have considered:
- Make a separate table cats to store:
animal_id
,location
,tagId
(although FK to the partitioned parent table is not possible) - Create an index on
location
,type
and jsonb key. - Create a new (indexed) column
tagId
- which would be null for all animals but cat.
I do have an index on other columns on the table - but a bit confused on how to create an index to make the search for cats based on the tagid
fast. Any suggestions?
UPDATE (Ignoring partitions):
(Testing on a partitioned table)
So I decided to go with the option as suggested by Erwin and tried creating an index
CREATE INDEX ON animals_211 (location, ((data->>'tagid')::uuid)) WHERE type = 'cat';
And tried an EXPLAIN on the query (using the partitioned table to keep it simple):
explain select * from animals_211 a
where a.location = 32341
and a.type = 'cat'
and (data->>'tagid')::uuid = '5e54c1d9-3ea0-4bca-81d6-1000d90cc42c'
And from the results, it looks like it doesn't use the index created and does a sequential scan:
Seq Scan on animals_211 e (cost=0.00..121.70 rows=1 width=327) |
Filter: ((location = 32341) AND ((type)::text = 'cat'::text) AND (((data ->> 'tagid'::text))::uuid = '5e54c1d9-3ea0-4bca-81d6-1000d90cc42c'::uuid
UPDATE 2 (not using partial index)
It somehow seems to be the partial index, as without it - it seems to work:
CREATE INDEX tag_id_index ON animals_211 (location, type, ((data->>'tagid')::uuid))
When I do an explain plan:
Index Scan using tag_id_index on animals_211 e (cost=0.28..8.30 rows=1 width=327)
Index Cond: ((location = 32341) AND ((type)::text = 'cat'::text) AND (((data ->> 'tagid'::text))::uuid = '5e54c1d9-3ea0-4bca-81d6-1000d90cc42c'::uuid))