3

I need to speedup this query:

SELECT * FROM mytable 
WHERE 'value' = ANY("citext_array_col") LIMIT 1;

where citext_array_col is an array of citext. I tried to create an operator class:

CREATE OPERATOR CLASS gin__citext_ops
FOR TYPE citext[] USING gin
AS

        OPERATOR        6       = (anyarray, anyarray),
        FUNCTION        1       citext_cmp (citext, citext),
        FUNCTION        2       ginarrayextract(anyarray, internal, internal),
        FUNCTION        3       ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
        FUNCTION        4       ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
        STORAGE         citext;

I can create the GIN index with this operator class, but it's useless (with set enable_seqscan = off the planner still uses the sequential scan). I have no idea what ginqueryarrayextract() & co. do, there's no documentation about this.

What i found is intarray extension of GIN index, but the code is in C, and i'm not too familiar with PG C extensions...

Is there a smarter way to create an index for this query? Maybe using text support functions?

vad
  • 1,196
  • 9
  • 22

3 Answers3

5

The operator class for CIText (adapted from PostgreSQL GIN index on array of uuid as suggested by Florent Guillaume) is as follows:

CREATE OPERATOR CLASS _citext_ops DEFAULT 
  FOR TYPE _citext USING gin AS 
  OPERATOR 1 &&(anyarray, anyarray), 
  OPERATOR 2 @>(anyarray, anyarray), 
  OPERATOR 3 <@(anyarray, anyarray), 
  OPERATOR 4 =(anyarray, anyarray), 
  FUNCTION 1 citext_cmp(citext, citext),
  FUNCTION 2 ginarrayextract(anyarray, internal, internal), 
  FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal), 
  FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal), 
  STORAGE citext;

Also the query needs to be modified as for eeeebbbbrrrr's suggestion:

SELECT * FROM mytable 
WHERE citext_array_col && ARRAY['value']::citext[];
Community
  • 1
  • 1
Roberto
  • 121
  • 1
  • 6
2

PostgreSQL GIN index on array of uuid gives an answer for uuid which can likely be adapted to citext.

Community
  • 1
  • 1
Florent Guillaume
  • 8,243
  • 1
  • 24
  • 25
0

I believe what you want is to declare the operator class like so:

     CREATE OPERATOR CLASS _citext_ops DEFAULT FOR TYPE citext[] USING gin
AS

        OPERATOR        3       && (anyarray, anyarray),
        OPERATOR        6       = (anyarray, anyarray),
        OPERATOR        7       @> (anyarray, anyarray),
        OPERATOR        8       <@ (anyarray, anyarray),
        FUNCTION        1       citext_cmp (citext, citext),
        FUNCTION        2       ginarrayextract(anyarray, internal, internal),
        FUNCTION        3       ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
        FUNCTION        4       ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
        STORAGE         citext;

And then your query would take a slightly different form:

SELECT * FROM mytable 
WHERE citext_array_col && ARRAY['value']::citext[];

Also ensure you've created the proper index using gin:

CREATE INDEX idxfoo ON mytable USING gin (citext_array_col);

That'll find all rows where the citext_array_col contains "value" in any element. The "&&" operator is the overlap operator. See http://www.postgresql.org/docs/9.3/static/functions-array.html

(edit: type-o)

  • FYI this won't work as is. The numbers after OPERATOR are the strategy, and they are wrong for GIN which uses only strategies 1-4. See http://www.postgresql.org/docs/9.2/static/xindex.html#XINDEX-STRATEGIES It appears that the strategies listed are those for GiST. – Florent Guillaume Nov 13 '13 at 16:15
  • this solved my problem, together with the uuid answer. Thanks a lot! – vad Nov 25 '13 at 22:04