19

How would you go about searching for an element inside an array stored in a json column? (Update: Also see the 9.4 updated answer for jsonb columns.)

If I have a JSON document like this, stored in a json column named blob:

{"name": "Wolf",
 "ids": [185603363281305602,185603363289694211]}

what I'd like to be able to do is something like:

SELECT * from "mytable" WHERE 185603363289694211 = ANY("blob"->'ids');

and get all matching rows out. But this doesn't work because "blob"->'ids' returns JSON values, not Postgres arrays.

I'd also like to build an index on the individual IDs, if that's possible.

Joe Shaw
  • 22,066
  • 16
  • 70
  • 92

3 Answers3

14

The following original answer applies only for Postgres 9.3. For a Postgres 9.4 answer, see the Update below.

This builds on Erwin's referenced answers, but is a little bit more explicit to this question.

The IDs in this case are bigints, so create a helper function for converting a JSON array to a Postgres bigint array:

CREATE OR REPLACE FUNCTION json_array_bigint(_j json)
  RETURNS bigint[] AS
$$
SELECT array_agg(elem::text::bigint)
FROM json_array_elements(_j) AS elem
$$
  LANGUAGE sql IMMUTABLE;

We could just have easily (and perhaps more re-usably) returned a text array here instead. I suspect indexing on bigint is a lot faster than text but I'm having a difficult time finding evidence online to back that up.

For building the index:

CREATE INDEX "myindex" ON "mytable" 
  USING GIN (json_array_bigint("blob"->'ids'));

For querying, this works and uses the index:

SELECT * FROM "mytable" 
  WHERE '{185603363289694211}' <@ json_array_bigint("blob"->'ids');

Doing this will also work for querying, but it doesn't use the index:

SELECT * FROM "mytable" 
  WHERE 185603363289694211 = ANY(json_array_bigint("blob"->'ids'));

Update for 9.4

Postgres 9.4 introduced the jsonb type. This is a good SO answer about jsonb and when you should use it over json. In short, if you're ever querying the JSON, you should use jsonb.

If you build your column as jsonb, you can use this query:

SELECT * FROM "mytable"
  WHERE blob @> '{"ids": [185603363289694211]}';

The @> is Postgres' contains operator, documented for jsonb here. Thanks to Alain's answer for bringing this to my attention.

Community
  • 1
  • 1
Joe Shaw
  • 22,066
  • 16
  • 70
  • 92
  • 1
    +1 Nice work. Note, if elements within the JSON arrays are not unique you might get duplicate rows in the result. And yes, processing `bigint` is generally faster than processing `text`. Also makes for a *smaller* index, but simplify `elem::text::bigint` to `elem::bigint`. BTW, no need for double-quotes around perfectly valid identifiers in Postgres. – Erwin Brandstetter Sep 16 '13 at 20:34
  • Thanks for the help! Unfortunately casting from `json` directly to `bigint` does not work. I have to pass it through `text` first. – Joe Shaw Sep 17 '13 at 13:30
  • `ANY(json_array_bigint("blob"->'ids');` --> there is a `)` missing at the end – vintagexav Feb 14 '15 at 03:54
5

First, try the operator ->> instead of -> to strip the JSON layer from the array value.

Next, the query can work like this:
How do I query using fields inside the new PostgreSQL JSON datatype?

And indexing might work like this:
Index for finding an element in a JSON array

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
4

I know it's been a while...

In postgresql-9.5, it's now possible to query it easily.

select '{"name": "Wolf",
         "ids": [185603363281305602,185603363289694211]}'::jsonb
       @> '{"ids":[185603363281305602]}'

I think you should use a jsonb field instead and you can index it afterward.

CREATE INDEX idx_gin_ids ON mytable USING gin ((blob -> 'ids'));
Alain Gilbert
  • 63
  • 2
  • 8
  • This seems to also work in Postgres 9.4. But the caveat is that `@>` only works on `jsonb` columns, not `json` columns. (That's probably fine, I can't think of situations in which you'd choose `json` over `jsonb`, but this question was originally asked for Postgres 9.3 which didn't yet have `jsonb`) – Joe Shaw May 15 '15 at 14:16
  • 1
    Also I think a better example would be `SELECT * from "mytable" WHERE blob @> '{"ids": [185603363281305602]}'`; – Joe Shaw May 15 '15 at 14:17