4

I have table myTable with a JSONB column myJsonb with a data structure that I want to index like:

{
  "myArray": [
    {
      "subItem": {
        "email": "bar@bar.com"
      }
    },
    {
      "subItem": {
        "email": "foo@foo.com"
      }
    }
  ]
}

I want to run indexed queries on email like:

SELECT *
FROM mytable
WHERE 'foo@foo.com' IN (
  SELECT lower(
      jsonb_array_elements(myjsonb -> 'myArray')
      -> 'subItem'
      ->> 'email'
  )
);

How do I create a Postgres JSONB index for that?

JJ Zabkar
  • 3,792
  • 7
  • 45
  • 65

1 Answers1

4

If you don't need the lower() in there, the query can be simple and efficient:

SELECT *
FROM   mytable
WHERE  myjsonb -> 'myArray' @> '[{"subItem": {"email": "foo@foo.com"}}]'

Supported by a jsonb_path_ops index:

CREATE INDEX mytable_myjsonb_gin_idx ON mytable
USING  gin ((myjsonb -> 'myArray') jsonb_path_ops);

But the match is case-sensitive.

Case-insensitive!

If you need the search to match disregarding case, things get more complex.

You could use this query, similar to your original:

SELECT *
FROM   t
WHERE  EXISTS (
   SELECT 1
   FROM   jsonb_array_elements(myjsonb -> 'myArray') arr
   WHERE  lower(arr #>>'{subItem, email}') = 'foo@foo.com'
   );

But I can't think of a good way to use an index for this.

Instead, I would use an expression index based on a function extracting an array of lower-case emails:

Function:

CREATE OR REPLACE FUNCTION f_jsonb_arr_lower(_j jsonb, VARIADIC _path text[])
  RETURNS jsonb LANGUAGE sql IMMUTABLE AS
'SELECT jsonb_agg(lower(elem #>> _path)) FROM jsonb_array_elements(_j) elem';

Index:

CREATE INDEX mytable_email_arr_idx ON mytable
USING  gin (f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') jsonb_path_ops);

Query:

SELECT *
FROM   mytable 
WHERE  f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') @> '"foo@foo.com"';

While this works with an untyped string literal or with actual jsonb values, it stops working if you pass text or varchar (like in a prepared statement). Postgres does not know how to cast because the input is ambiguous. You need an explicit cast in this case:

... @> '"foo@foo.com"'::text::jsonb;

Or pass a simple string without enclosing double quotes and do the conversion to jsonb in Postgres:

... @> to_jsonb('foo@foo.com'::text);

Related, with more explanation:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Case-insensitivity is a requirement; the latter appears to work. One workaround I was considering to make the former solution workable was to add a normalized column, but I couldn't figure out the syntax for `jsonb_array_elements` in an UPDATE statement to backfill the data for more than 1 `subItem`. – JJ Zabkar Dec 18 '17 at 20:06
  • Hmm... I'm using JPA, but having trouble specifying the email as a parameter within the single-quotes... – JJ Zabkar Dec 18 '17 at 20:53
  • @JJZabkar: You took note of the double-quotes (required for JSON strings) inside the single quotes (denoting Postgres string literals)? So the passed string has to come with surrounding double-quotes. Alternatively, you could extract a Postgres array consisiting of plain text strings. Recent example: https://dba.stackexchange.com/a/193378/3684 – Erwin Brandstetter Dec 18 '17 at 21:17
  • 1
    Yes, took note, but I can't find the right syntax to parameterize it within a JPA nativeQuery. Since you've (quite elegantly) solved the main question here, I believe it's appropriate to move my secondary issue to a new thread: https://stackoverflow.com/questions/47877601/spring-data-jpa-jsonb-paramaterization – JJ Zabkar Dec 18 '17 at 22:52
  • Yes, fully agree. I think I may have an idea what's going wrong and appended some pointers above. – Erwin Brandstetter Dec 18 '17 at 23:42
  • Yup, you were spot-on with the casting. Noted solution on the other thread with a few other tweaks. Thanks! – JJ Zabkar Dec 19 '17 at 00:59