
In PostgreSQL 11.11, I have a jsonb column that holds objects like this:

{
   "dynamicFields":[
      {
         "name":"200",
         "hidden":false,
         "subfields":[
            {
               "name":"a",
               "value":"Subfield a"
            },
            {
               "name":"b",
               "value":"Subfield b"
            }           
         ]
      }
   ]
}

Both dynamicFields and subfields are arrays, and I am having performance issues with selects like this:

select *
from my_table a
cross join lateral jsonb_array_elements(marc -> 'dynamicFields') df
cross join lateral jsonb_array_elements(df -> 'subfields') sf
where df ->> 'name' = '200' and sf ->> 'name' = 'a'

The performance issue lies mostly in the subfields filter. I have already added an index like this:

CREATE INDEX idx_my_index ON my_table USING gin ((marc->'dynamicFields') jsonb_path_ops);

How can I add an index for the subfields inside the dynamicFields?
The query above is just one example, I use it a lot in joins with other tables in the database. And I also know the @> operator.

André Luiz

2 Answers


You already have a very good index to support your query.
Make use of it with the jsonb "contains" operator @>:

SELECT *
FROM   my_table
WHERE  marc->'dynamicFields' @> '[{"name": "200", "subfields":[{"name": "a"}]}]';

db<>fiddle here

Carefully match the structure of the JSON object in the table. Then rows are selected cheaply using the index.
You can then extract whatever parts you need from qualifying rows.
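For instance, the index-friendly @> filter can be combined with the original LATERAL joins: the index narrows the row set cheaply first, and the lateral unnesting then runs only over qualifying rows. A sketch, assuming the column is named marc as in the question's index definition:

```sql
-- Cheap row selection via the GIN index (@>), then extraction with LATERAL.
-- The df/sf conditions repeat the filter to keep only the matching elements.
SELECT a.*, sf ->> 'value' AS subfield_value
FROM   my_table a
CROSS  JOIN LATERAL jsonb_array_elements(a.marc -> 'dynamicFields') df
CROSS  JOIN LATERAL jsonb_array_elements(df -> 'subfields') sf
WHERE  a.marc -> 'dynamicFields' @> '[{"name": "200", "subfields": [{"name": "a"}]}]'
AND    df ->> 'name' = '200'
AND    sf ->> 'name' = 'a';
```

This is the pattern the comments below report going from 55 seconds to 60 ms: the WHERE clause on marc is what the jsonb_path_ops GIN index can serve.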

Detailed instructions:

If one of the filters is very selective on its own, it might be faster to split the two conditions like in your original. Either way, both variants should be fast:

SELECT *
FROM   my_table
WHERE  marc->'dynamicFields' @> '[{"name": "200"}]'
AND    marc->'dynamicFields' @> '[{"subfields":[{"name": "a"}]}]';
Erwin Brandstetter
  • I would be interested how both queries compare in your setting. Can you report times for both, and your original? – Erwin Brandstetter Sep 23 '21 at 22:56
  • Thanks for the answer! I'll try to use these in the queries as much as possible and let you know. – André Luiz Sep 24 '21 at 13:04
  • 1
    I did the same query but adding the @> for some filtering in the where clause and it went 55 seconds to 60ms. – André Luiz Sep 24 '21 at 15:02
  • This is exactly what I've been trying to do, but `EXPLAIN ANALYZE` does not tell me that the index is being used. I modified your fiddle to demo this: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=a49d8f77d271b019af39aad95c354487 – what am I doing wrong here? – Manuel Kaufmann Aug 18 '22 at 13:51
  • @ManuelKaufmann: Postgres won't use indexes for such a tiny test table. Indexes only pay for bigger tables. You can force the issue for the tiny test with `SET enable_seqscan = off;` See: https://stackoverflow.com/a/14555618/939860. Modified fiddle: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=6a37e5ff913c77c6068564f454ad174f (Or test with a much bigger table.) – Erwin Brandstetter Aug 19 '22 at 00:05

An index enhances query performance on a table. Indexes are defined on table columns (or expressions), and indexing the columns used in joins and WHERE clauses is what makes them effective. For a jsonb column you can use CREATE INDEX ON table_name USING gin (column_name jsonb_path_ops).
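A minimal sketch of such an index for this question's setup; note that, as the question already shows, the index can also be built on an expression rather than the whole column (table and column names taken from the question):

```sql
-- GIN index with the jsonb_path_ops operator class on the nested
-- array expression; it supports the @> containment operator.
CREATE INDEX idx_my_table_dynamic_fields
ON my_table USING gin ((marc -> 'dynamicFields') jsonb_path_ops);
```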

john mba
  • 1
    Yes, you can use a GIN index, but you should have said how. An index can also be defined on an expression. The question is not about joins, so promulgating general ideas about indexing is probably not going to help solve the problem. – Laurenz Albe Sep 24 '21 at 04:17