1

I would like to use a plpgsql function with a table and several columns as input parameter. The idea is to split the table in chunks and do something with each part.

I tried the following function:

CREATE OR REPLACE FUNCTION my_func(Integer)
  RETURNS SETOF my_part
AS $$
DECLARE
out my_part;
BEGIN
  FOR i IN 0..$1 LOOP
    FOR out IN
    SELECT * FROM my_func2(SELECT * FROM table1 WHERE id = i)
    LOOP
       RETURN NEXT out;
   END LOOP;
  END LOOP;
  RETURN;
END;
$$
LANGUAGE plpgsql;

my_func2() is the function that does some work on each smaller part.

CREATE or REPLACE FUNCTION my_func2(table1) 
  RETURNS SETOF my_part2 AS
$$ 
BEGIN
RETURN QUERY
SELECT * FROM table1;
END
$$
LANGUAGE plpgsql;

If I run:

SELECT * FROM my_func(99);

I guess I should receive the first 99 IDs processed for each id. But it says there is an error for the following line:

SELECT * FROM my_func2(select * from table1 where id = i)

The error is:

The subquery is only allowed to return one column

Why does this happen? Is there an easy way to fix this?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • [CURSORs](http://www.postgresql.org/docs/current/static/plpgsql-cursors.html) should help. – Daniel Vérité Jan 08 '15 at 15:58
  • *Always* include your version of Postgres. And please be more specific. Add definitions of the row types you use. Clarify what "parts" are supposed to be: a single row at a time, or whole derived tables consisting of many rows. – Erwin Brandstetter Jan 09 '15 at 04:01
  • if I knew exactly what you were trying to accomplish, I could give you some better ideas... however, remember: several people have told you that a subquery can only return 1 column.. that's true, but you can always do something like (select row_to_json(tab1.*)... to convert the entire list of columns to 1 json column. – Joe Love Jan 09 '15 at 07:40

2 Answers2

7

There are multiple misconceptions here. Study the basics before you try advanced magic.

  • Postgres does not have "table variables". You can only pass 1 column or row at a time to a function. Use a temporary table or a refcursor (like commented by @Daniel) to pass a whole table. The syntax is invalid in multiple places, so it's unclear whether that's what you are actually trying.
    Even if it is: it would probably be better to process one row at a time or rethink your approach and use a set-based operation (plain SQL) instead of passing cursors.

  • The data types my_part and my_part2 are undefined in your question. May be a shortcoming of the question or a problem in the test case.

  • You seem to expect that the table name table1 in the function body of my_func2() refers to the function parameter of the same (type!) name, but this is fundamentally wrong in at least two ways:

    1. You can only pass values. A table name is an identifier, not a value. You would need to build a query string dynamically and execute it with EXECUTE in a plpgsql function. Try a search, many related answers her on SO. Then again, that may also not be what you wanted.

    2. table1 in CREATE or REPLACE FUNCTION my_func2(table1) is a type name, not a parameter name. It means your function expects a value of the type table1. Obviously, you have a table of the same name, so it's supposed to be the associated row type.

  • The RETURN type of my_func2() must match what you actually return. Since you are returning SELECT * FROM table1, make that RETURNS SETOF table1.

  • It can just be a simple SQL function.

All of that put together:

CREATE or REPLACE FUNCTION my_func2(_row table1) 
  RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;

Note the parentheses, which are essential for decomposing a row type. Per documentation:

The parentheses are required here to show that compositecol is a column name not a table name

But there is more ...

  • Don't use out as variable name, it's a keyword of the CREATE FUNCTION statement.

  • The syntax of your main query my_func() is more like psudo-code. Too much doesn't add up.

Proof of concept

Demo table:

CREATE TABLE table1(table1_id serial PRIMARY KEY, txt text);
INSERT INTO table1(txt) VALUES ('a'),('b'),('c'),('d'),('e'),('f'),('g');

Helper function:

CREATE or REPLACE FUNCTION my_func2(_row table1) 
  RETURNS SETOF table1 AS
'SELECT ($1).*' LANGUAGE sql;

Main function:

CREATE OR REPLACE FUNCTION my_func(int)
  RETURNS SETOF table1 AS
$func$
DECLARE
   rec table1;
BEGIN
  FOR i IN 0..$1 LOOP
     FOR rec IN
        SELECT * FROM table1 WHERE table1_id = i
     LOOP
        RETURN QUERY
        SELECT * FROM my_func2(rec);
     END LOOP;
  END LOOP;
END
$func$  LANGUAGE plpgsql;

Call:

SELECT * FROM my_func(99);

SQL Fiddle.

But it's really just a a proof of concept. Nothing useful, yet.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

As the error log is telling you.. you can return only one column in a subquery, so you have to change it to

SELECT my_func2(SELECT Specific_column_you_need FROM hasval WHERE wid = i)

a possible solution can be that you pass to funct2 the primary key of the table your funct2 needs and then you can obtain the whole table by making the SELECT * inside the function

fredmaggiowski
  • 2,232
  • 3
  • 25
  • 44
  • If I got it right you need to manipulate the whole table hashval... right? Then You can invoke the funct2 passing i (your wid) Then in funct2 you do the SELECT * FROM hashval WHERE wid = $1 (parameter i passed in funct2) I'd try with something like that...I've not used plpgsql in a while so I don't remember the right syntax, sorry :( – fredmaggiowski Jan 08 '15 at 14:19
  • actually that is not what i want, i need to pass only parts of it. If i pass the whole table and do the where clause in func2, then func2 is not different from func1. The table has billions of rows and needs like hundreds of gb of ram, because i have to use complicated joins on the billions of rows, so my idea was to process each part seperately ;). – Christian Einstein Jan 08 '15 at 14:29