2

I have a jsonb column which has the unfortunate case of being very unpredictable, in some cases its value may be an array with nested values:

["UserMailer", "applicant_setup_3", ["5cbffeb7-8d5e-4b52-a475-3cf320b2cee9"]]

Sometimes it will be something with key/values like this:

[{"reference_id": "5cbffeb7-8d5e-4b52-a475-3cf320b2cee9", "job_dictionary": ["StatusUpdater", "FollowTwitterUsersJob"]}]

Is there a way to write a query which just treats the whole column like text and does a like to see if I can find the uuid in the big text blob? I want to find all the records where a particular uuid string is present in the jsonb column.

The query doesn't need to be fast or efficient.

GMB
  • 216,147
  • 25
  • 84
  • 135
JP Silvashy
  • 46,977
  • 48
  • 149
  • 227
  • Will it always be UUID strings that you are looking for? Also, as always, please declare your version of Postgres, the rough number of rows in the table, the avg size of the JSON column and whether performance matters. – Erwin Brandstetter Sep 22 '19 at 01:55
  • Yes, it will always be a uuid type, no more than 100k rows in the table total. PG 11. – JP Silvashy Sep 22 '19 at 03:15

3 Answers3

3

Postgres has search operator ? for jsonb, but that would require you to search the json content recursively.

A possible, although not very efficient method, would to stringify the object and use LIKE to search it:

myjsonb::text LIKE '%"5cbffeb7-8d5e-4b52-a475-3cf320b2cee9"%'
myjsonb::text LIKE '%"' || myuuid || '"%'

Demo on DB Fiddle:

GMB
  • 216,147
  • 25
  • 84
  • 135
2

The problem with the jsonb operator ? is that it only considers top-level keys (including array elements), not values, and no nested objects.

You seem to be looking for values and array elements (not keys) on any level. You can get that with a full text search on top of your json(b) column:

SELECT * FROM tbl
WHERE  to_tsvector('simple', jsonb_column)
    @@ tsquery '5cbffeb7-8d5e-4b52-a475-3cf320b2cee9';

db<>fiddle here

to_tsvector() extracts values and array elements on all levels - just what you need.

Requires Postgres 10 or later. json(b)_to_tsvector() in Postgres 11 offers more flexibility.

That's attractive for tables of non-trivial size as it can be supported with a full text index very efficiently:

CREATE INDEX tbl_jsonb_column_fts_gin_idx ON tbl USING GIN (to_tsvector('simple', jsonb_column));

I use the 'simple' text search configuration in the example. You might want a language-specific one, like 'english'. Doesn't matter much while you only look for UUID strings, but stemming for a particular language might make the index a bit smaller ...

Related:

While you are only looking for UUIDs, you might optimize further with a custom (IMMUTABLE) function to extract UUIDs from the JSON document as array (uuid[]) and build a functional GIN index on top of it. (Considerably smaller index, yet.) Then:

SELECT * FROM tbl
WHERE  my_uuid_extractor(jsonb_column) @> '{5cbffeb7-8d5e-4b52-a475-3cf320b2cee9}';

Such a function can be expensive, but does not matter much with a functional index that stores and operates on pre-computed values.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

You can split the array elements first by using jsonb_array_elements(json), and then filter the casted string from those elements by like operator

select q.elm
  from
  (
    select jsonb_array_elements(js) as elm
      from tab
  ) q
 where elm::varchar like '%User%'

elm
----------------------------------------------------------------------------------------------------------------------
"UserMailer"
{"reference_id": "5cbffeb7-8d5e-4b52-a475-3cf320b2cee9", "job_dictionary": ["StatusUpdater", "FollowTwitterUsersJob"]}

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55