1

The past two days I've been reading a lot about jsonb, full text search, gin index, trigram index and what not but I still can not find a definitive or at least a good enough answer on how to fastly search if a row of type JSONB contains certain string as a value. Since it's a search functionality the behavior should be like that of ILIKE

What I have is:

  • Table, lets call it app.table_1 which contains a lot of columns one of which is of type JSONB, so lets call it column_jsonb

  • The data inside column_jsonb will always be flatten (no nested objects, etc) but the keys can vary. An example of the data in the column with obfuscated values looks like this: "{""Key1"": ""Value1"", ""Key2"": ""Value2"", ""Key3"": null, ""Key4"": ""Value4"", ""Key5"": ""Value5""}"

  • I have a GIN index for this column which doesn't seems to affect the search time significantly (I am testing with 20k records now which takes about 550ms). The indes looks like this:

    CREATE INDEX ix_table_1_column_jsonb_gin
      ON app.table_1 USING gin
      (column_jsonb jsonb_path_ops)
      TABLESPACE pg_default;
    
  • I am interested only in the VALUES and the way I am searching them now is this:

    EXISTS(SELECT value FROM jsonb_each(column_jsonb) WHERE value::text ILIKE search_term)
    

Here search_term is variable coming from the front end with the string that the user is searching for

I have the following questions:

  • Is it possible to make the check faster without modifying the data model? I've read that trigram index might be usfeul for similar cases but at least for me it seems that converting jsonb to text and then checking will be slower and actually I am not sure if the trigram index will actually work if the column original type is JSONB and I explicitly cast each row to text? If I'm wroing I would really appreciate some explanation with example if possible.

  • Is there some JSONB function that I am not aware of which offers what I am searching for out of the box, I'm constrained to PostgreSQL v 11.9 so some new things coming with version 12 are not available for me.

  • If it's not possible to achieve significant improvement with the current data structure can you propose a way to restructure the data in column_jsonb maybe another column of some other type with data persisted in some other way, I don't know...

Thank you very much in advance!

Leron
  • 9,546
  • 35
  • 156
  • 257
  • Do you specifically need a pattern search? For example, `WHERE value::text ILIKE 'key%'` – Schwern Oct 01 '20 at 22:36
  • Sadly - yes. But just to be clear - in the JSON key-value pairs, I am interesed if the **value** part contains the `search_term`. – Leron Oct 01 '20 at 22:38
  • I guess a trigram index on the value cast to `text` should work. Just make sure to use the same expression in your query condition as in your [expression index](https://www.postgresql.org/docs/current/indexes-expressional.html). The trigram search might not be exactly what you want but it might still speed up the query by providing a selective index. – Bergi Oct 01 '20 at 22:40
  • @Bergi I have a hard time to gradsp exctly how to create this trigram index. Can you give me answe with example how to correctly create the index and perfrom the search? I will really appreciate it! – Leron Oct 01 '20 at 22:43
  • 1
    Have a look at the near-duplicate [How to index a PostgreSQL JSONB flat text array for fuzzy and right-anchored searches?](https://stackoverflow.com/q/55500586/1048572) and also [Indexing an array for full text search](https://stackoverflow.com/q/31210790/1048572) or [How to index a string array column for pg_trgm `'term' % ANY (array_column)` query?](https://stackoverflow.com/q/39480580/1048572). Of course, you've got an object not array so ideally you'd want to remove the keys from the indexed value, but I don't know a pretty/non-hacky way to do that. – Bergi Oct 01 '20 at 22:52

1 Answers1

1

If the data structure is flat, and you regularly need to search the values, and the values are all the same type, a traditional key/value table would seem more appropriate.

create table table1_options (
  table1_id bigint not null references table1(id),
  key text not null,
  value text not null
);

create index table1_options_key on table1_options(key);
create index table1_options_value on table1_options(value);

select *
from table1_options
where value ilike 'some search%';

I've used simple B-Tree indexes, but you can use whatever you need to speed up your particular searches.

The downsides are that all values must have the same type (doesn't seem to be a problem here) and you need an extra table for each table. That last one can be mitigated somewhat with table inheritance.

Schwern
  • 153,029
  • 25
  • 195
  • 336
  • 1
    Thanks for your answer. But it seems I haven't been clear enough. Actually I am inerested only in the values and if the search_term is contained in any of them. Also I would really like to ask you - is it possible to create trigram index directly on JSONB column? – Leron Oct 01 '20 at 22:35
  • @Leron_says_get_back_Monica If you're regularly searching the values and don't know the key, you would probably be better off with a different data structure. The key/value table would work better as you can simply index the value. – Schwern Oct 01 '20 at 22:40
  • It's good to leave if for the others who will read this, but in my specific case all values are of the same type and this is something that is very unlikely to change so for me this is not an issue. – Leron Oct 01 '20 at 22:48
  • Thanks, this is really very useful for me and answers all of my questions. But I would ask you for a favour - I am really interested how to use correctly trigram index in this case. I've read about it but I still can not figure out quite well how to create the index correctly and perfrom the search. Can you give me an example of this. I would really appreciate it! – Leron Oct 01 '20 at 22:52
  • @Leron_says_get_back_Monica Sorry, I don't have much experience with text search and even less with jsonb + text search. My experience with jsonb is that it looks like a silver bullet, but it winds up complicates things, particularly indexing. If you don't need it, don't use it. – Schwern Oct 01 '20 at 22:57
  • @Leron_says_get_back_Monica The question I'd ask myself is why are these values in different keys if they're all related? Should they be in an array instead? Providing a more concrete example would help. – Schwern Oct 01 '20 at 22:59
  • No problem. Thank you very much anyways. It's mostly, just not very well thought DB structure - we are aware of that, that's why I am asking about better approacher. In general - we have a few vert complicated entities which have a common base structure and several important properties. But also there is a lot of additional data that the use would like to save and literally we are unable to predcit everything, and even if we did, this would mean to have tale with maybe hundred columns or more. I guess that's why the first choice was to use JSONB. – Leron Oct 01 '20 at 23:08
  • @Leron_says_get_back_Monica Yes, I've found jsonb to be a life saver in those situations.. And once a structure becomes apparent, start moving it into traditional tables. – Schwern Oct 02 '20 at 00:10