1

I have a Postgres 9.2 server with the following table:

Table A with a single column code and a B-tree index on it:

db=> \d A
Table "public.A"
Column  |  Type  | Modifiers 
--------+--------+-----------
 code   | bigint | not null

 Indexes:
 "A_pkey" PRIMARY KEY, btree (code)

I have a simple PLPGSQL function as follows, simplified for ease of following:

create or replace function list (bigint)
   RETURNS bigint[] AS '
    DECLARE
            arr        bigint[];
            c          ALIAS FOR $1;
    begin
            arr[0] = c * 1;
            arr[1] = c * 2;
            ...
            ...
            arr[10] = c * 1024;
            return arr;
END;
' language plpgsql;

I notice that the index is not used if invoke the function:

db => explain select * from A where code = ANY(list(3234234234));
                           QUERY PLAN                               
------------------------------------------------------------------------
Seq Scan on A (cost=0.00..1440291398.32 rows=10 width=219)
Filter: (code = ANY (list(3234234234::bigint)))

No luck even if I typecast the output of the function to be ANY(list(3234234234) :: bigint[])

Of course, if I manually create a list it works perfectly.

db=> explain select * from A where code = ANY( '{21312,13123,1312312,1231312,123213231}' :: bigint[]); 
                                   QUERY PLAN                                       
----------------------------------------------------------------------------------------
Bitmap Heap Scan on A  (cost=538.09..558.17 rows=5 width=219)
Recheck Cond: (code = ANY ('{21312,13123,1312312,1231312,123213231}'::bigint[]))
    ->  Bitmap Index Scan on A_pkey  (cost=0.00..538.09 rows=5 width=0)
        Index Cond: (code = ANY ('{21312,13123,1312312,1231312,123213231}'::bigint[]))

Why does Postgres do a sequential scan when using the function?
How do I get it to do index scan with the function?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Do you actually need a non-standard 0-based array index or is a default index starting with 1 just as well? – Erwin Brandstetter Nov 09 '14 at 01:39
  • The solution is to rewrite it as suggested by @ErwinBrandstetter. Rewrite it as an immutable function. The query starts using the index. – user3000172 Nov 09 '14 at 02:12
  • 1
    @ErwinBrandstetter It was a typo to use an index that started at [1]. Thanks for writing such a nice post that I learned so much from! – user3000172 Nov 09 '14 at 02:35

1 Answers1

0

Why is the index not used?

The function returns 11 bigint numbers, but Postgres does not really know that, since the PL/pgSQL function is a black box for the query planner, and it's VOLATILE, too. That tells Postgres to expect a different array for ever call. Postgres reckons it'll be cheaper to hit the table directly with a possibly big number of elements in the array. The expectation is different for the alternative with a fixed number of just 5 elements.

Improved function

Use this simple, IMMUTABLE SQL function instead:

CREATE OR REPLACE function list (c bigint)
  RETURNS bigint[] LANGUAGE sql IMMUTABLE AS
'SELECT array_agg($1 * (2^g)::int) FROM generate_series(0, 10) g';

Simpler and faster in several ways, and much rather allows index usage.

  • Lots of assignments are comparatively expensive in PL/pgSQL, even more so for assembling long arrays, since the whole array has to be rewritten with every assignment. BTW, the correct assignment operator in PL/pgSQL is := not =:

  • Default function volatility is VOLATILE. Your function can be IMMUTABLE: the same output for the same input is guaranteed at all times. Can help to optimize repeated calls.

  • You whole syntax was outdated and you don't need PL/pgSQL at all. Simple SQL functions can be inlined in the context of outer queries.

  • Finally, PL/pgSQL is not always a good choice for functions that are nested in the context of bigger queries. They are black boxes for the query planner and act as optimization barriers. Details:

SQL Fiddle.

Postgres arrays index starts with 1 by default. If you actually need your arrays to start with index 0, you could use the technique from this related answer to shift it (but I don't expect you need it):

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks a lot but this is not due to the size of the list returned by the function. Even when I reduced the function to return a list with just 2 elements, it still performs a sequential scan. Looking fwd to looking at your improved function though. – user3000172 Nov 09 '14 at 01:50
  • Also, the function returns just 10 elements not 1024 as you mention in your response. – user3000172 Nov 09 '14 at 01:53
  • @user3000172: Ah, I see now: `arr[10]`. Sorry, overlooked that bit. Updated my answer. The gist of it remains the same. – Erwin Brandstetter Nov 09 '14 at 02:05
  • That worked out. When I rewrite the function using the IMMUTABLE clause it worked out. Thank you much! – user3000172 Nov 09 '14 at 02:10
  • Also note the updated fiddle now. And on the topic of array elements: 0 .. 10 - that's 11 elements. ;) Ah, and did you test with an IMMUTABLE PLPGSQL function or my SQL variant? – Erwin Brandstetter Nov 09 '14 at 02:15