1

The multirow subselect will be used in the right hand side of the in operator in the where clause:

create table t (a integer);
insert into t (a) values (1), (9);

drop function if exists f();

create function f()
returns void as $$
begin
execute '
    select a
    from t
    where a in $1
' using (select 1 union select 2);
end;$$
language plpgsql;

select f();

ERROR:  more than one row returned by a subquery used as an expression
CONTEXT:  SQL statement "SELECT (select 1 union select 2)"
PL/pgSQL function "f" line 3 at EXECUTE statement

How to achieve what the above function would if it worked?

Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260

2 Answers2

3

I see nothing in your question that couldn't more easily be solved with:

SELECT a
FROM   t
JOIN  (VALUES (1), (2)) x(a) USING (a); -- any query returning multiple int

Can you clarify the necessity in your example?


As a proof of concept, this would be simpler / faster:

CREATE OR REPLACE FUNCTION x.f1()
  RETURNS SETOF integer AS
$BODY$
BEGIN
RETURN QUERY EXECUTE '
    SELECT a
    FROM t
    WHERE a = ANY($1)'
USING ARRAY(VALUES (1), (2)); -- any query here
END;
$BODY$
LANGUAGE plpgsql;

Performance of IN () and = ANY()

Your observation is spot on. And there is reason for it. Try:

EXPLAIN ANALYZE SELECT * FROM tbl WHERE id IN (1,2,3);

The query plan will reveal:

Index Cond: (id = ANY ('{1,2,3}'::integer[]))

PostgreSQL transforms the id IN (..) construct into id = ANY(..) internally. These two perform identically - except for a negligible overhead penalty.

My code is only faster in building the statement - exactly as you diagnosed.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Yes simpler and probably faster at building the query (I'm accepting it) but the execution plan cost is exactly the same as my own answer (`=any` vs `in`). The necessity is that the table is partitioned and if the `where` parameters are not constants the planner will not optimize the query. – Clodoaldo Neto Mar 21 '12 at 21:29
  • @Clodoaldo: I added a bit concerning your observation about performance. – Erwin Brandstetter Mar 21 '12 at 21:45
  • @Clodoaldo: You could take something like my simple query and run it with `EXECUTE`. Forces re-planning in any case. A `JOIN` will be faster than an `IN ()` expression, especially with longer lists of values. – Erwin Brandstetter Mar 21 '12 at 21:56
  • No I mean what I wrote. Check [one of the last paragraphs](http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS) _http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS_ I have experienced sluggish queries turn into blazing fast ones when the `where` or `join` parameters were turned into constants. Oh you just deleted your comment. I guess you checked the manual or something. No re-planning will avoid this planner behavior when the condition involves a partitioned table. – Clodoaldo Neto Mar 21 '12 at 22:05
  • @Clodoaldo: Yeah, I removed the first sentence, right after I had typed it because I had it backwards. – Erwin Brandstetter Mar 21 '12 at 22:48
0
create function f()
returns setof integer as $$
begin
return query execute format('
    select a
    from t
    where a in %s
', replace(replace(array(select 1 union select 2)::text, '{', '('), '}', ')'));
end$$
language plpgsql;
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260