I need some help with plpgsql function in postgres. This question has been asked on psql-general without conclusive answer (here and here)
This boils down to this :
how to create a plpgsql function with a set of rows as input and a set of rows as output without using array (for performance).
I currently transmit input as the ref of a cursor but it is not satisfactory because it forces to compute twice the same query (SELECT ...hard_work...
) AND it messes transactions in my application.
Currently it works like :
DECLARE cursor FOR ...hardwork...;
WITH first_query AS (
SELECT ...hard_work... --second computation of hard_work
),
second_query AS (
SELECT ...another_query_using_hard_work...
)
SELECT *
FROM my_function('cursor'::refcursor) f(...) ;
Ultimately I would like to have something like (not working)
WITH first_query AS (
SELECT ...hard_work...
),
second_query AS (
SELECT ...another_query_using_the_hard_work_query...
)
SELECT *
FROM my_function('first_query'::regclass) f(...) ;
Of course SELECT ...hard_work... is expensive (around 50ms) and it would be better not to compute it twice. The application is data streaming, so time is precious, and data are heavy, so copying data in temp table may be worse than computing twice (typically tens of MB ).
Other solutions were proposed
transmitting a view or a table ref as input, which is the same problem as using a cursor (ie : twice the computing, separate statement)
transmitting/outputting array : forces a lot of computing with array_agg() and unnest()
transmitting a temp table : involves copying data: It might be longer than computing twice (I'm fighting for 50 ms)
transmitting a materialized view : only available in 9.3
I would be very grateful to have an advanced insight on the subject.
Best Regards,
Rémi-C
PS : link to questions on psql-general mailing list contains a lots of details about specific and even all the code.
PPS : version : postgres 9.2 . OS : Ubuntu 12.04 LTE, client : PGAdmin3 for test, node.js for prod.