42

I wrote a function that outputs a PostgreSQL SELECT query well formed in text form. Now I don't want to output a text anymore, but actually run the generated SELECT statement against the database and return the result - just like the query itself would.

What I have so far:

CREATE OR REPLACE FUNCTION data_of(integer)
  RETURNS text AS
$BODY$
DECLARE
   sensors varchar(100);   -- holds list of column names
   type    varchar(100);   -- holds name of table
   result  text;           -- holds SQL query
       -- declare more variables

BEGIN
      -- do some crazy stuff

      result := 'SELECT\r\nDatahora,' || sensors ||
      '\r\n\r\nFROM\r\n' || type ||
      '\r\n\r\nWHERE\r\id=' || $1 ||'\r\n\r\nORDER BY Datahora;';

      RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION data_of(integer) OWNER TO postgres;

sensors holds the list of column names for the table type. Those are declared and filled in the course of the function. Eventually, they hold values like:

  • sensors: 'column1, column2, column3'
    Except for Datahora (timestamp) all columns are of type double precision.

  • type :'myTable'
    Can be the name of one of four tables. Each has different columns, except for the common column Datahora.

Definition of the underlying tables.

The variable sensors will hold all columns displayed here for the corresponding table in type. For example: If type is pcdmet then sensors will be 'datahora,dirvento,precipitacao,pressaoatm,radsolacum,tempar,umidrel,velvento'

The variables are used to build a SELECT statement that is stored in result. Like:

SELECT Datahora, column1, column2, column3
FROM   myTable
WHERE  id=20
ORDER  BY Datahora;

Right now, my function returns this statement as text. I copy-paste and execute it in pgAdmin or via psql. I want to automate this, run the query automatically and return the result. How can I do that?

Community
  • 1
  • 1
waldyr.ar
  • 14,424
  • 6
  • 33
  • 64

4 Answers4

104

Dynamic SQL and RETURN type

(I saved the best for last, keep reading!)
You want to execute dynamic SQL. In principal, that's simple in plpgsql with the help of EXECUTE. You don't need a cursor. In fact, most of the time you are better off without explicit cursors.

The problem you run into: you want to return records of yet undefined type. A function needs to declare its return type in the RETURNS clause (or with OUT or INOUT parameters). In your case you would have to fall back to anonymous records, because number, names and types of returned columns vary. Like:

CREATE FUNCTION data_of(integer)
  RETURNS SETOF record AS ...

However, this is not particularly useful. You have to provide a column definition list with every call. Like:

SELECT * FROM data_of(17)
AS foo (colum_name1 integer
      , colum_name2 text
      , colum_name3 real);

But how would you even do this, when you don't know the columns beforehand?
You could use less structured document data types like json, jsonb, hstore or xml. See:

But, for the purpose of this question, let's assume you want to return individual, correctly typed and named columns as much as possible.

Simple solution with fixed return type

The column datahora seems to be a given, I'll assume data type timestamp and that there are always two more columns with varying name and data type.

Names we'll abandon in favor of generic names in the return type.
Types we'll abandon, too, and cast all to text since every data type can be cast to text.

CREATE OR REPLACE FUNCTION data_of(_id integer)
  RETURNS TABLE (datahora timestamp, col2 text, col3 text)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sensors text := 'col1::text, col2::text';  -- cast each col to text
   _type    text := 'foo';
BEGIN
   RETURN QUERY EXECUTE '
      SELECT datahora, ' || _sensors || '
      FROM   ' || quote_ident(_type) || '
      WHERE  id = $1
      ORDER  BY datahora'
   USING  _id;
END
$func$;

The variables _sensors and _type could be input parameters instead.

Note the RETURNS TABLE clause.

Note the use of RETURN QUERY EXECUTE. That is one of the more elegant ways to return rows from a dynamic query.

I use a name for the function parameter, just to make the USING clause of RETURN QUERY EXECUTE less confusing. $1 in the SQL-string does not refer to the function parameter but to the value passed with the USING clause. (Both happen to be $1 in their respective scope in this simple example.)

Note the example value for _sensors: each column is cast to type text.

This kind of code is very vulnerable to SQL injection. I use quote_ident() to protect against it. Lumping together a couple of column names in the variable _sensors prevents the use of quote_ident() (and is typically a bad idea!). Ensure that no bad stuff can be in there some other way, for instance by individually running the column names through quote_ident() instead. A VARIADIC parameter comes to mind ...

Simpler since PostgreSQL 9.1

With version 9.1 or later you can use format() to further simplify:

RETURN QUERY EXECUTE format('
   SELECT datahora, %s  -- identifier passed as unescaped string
   FROM   %I            -- assuming the name is provided by user
   WHERE  id = $1
   ORDER  BY datahora'
  ,_sensors, _type)
USING  _id;

Again, individual column names could be escaped properly and would be the clean way.

Variable number of columns sharing the same type

After your question updates it looks like your return type has

  • a variable number of columns
  • but all columns of the same type double precision (alias float8)

Use an ARRAY type in this case to nest a variable number of values. Additionally, I return an array with column names:

CREATE OR REPLACE FUNCTION data_of(_id integer)
  RETURNS TABLE (datahora timestamp, names text[], values float8[])
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sensors text := 'col1, col2, col3';  -- plain list of column names
   _type    text := 'foo';
BEGIN
   RETURN QUERY EXECUTE format('
      SELECT datahora
           , string_to_array($1)  -- AS names
           , ARRAY[%s]            -- AS values
      FROM   %s
      WHERE  id = $2
      ORDER  BY datahora'
    , _sensors, _type)
   USING  _sensors, _id;
END
$func$;

Various complete table types

To actually return all columns of a table, there is a simple, powerful solution using a polymorphic type:

CREATE OR REPLACE FUNCTION data_of(_tbl_type anyelement, _id int)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE format('
      SELECT *
      FROM   %s  -- pg_typeof returns regtype, quoted automatically
      WHERE  id = $1
      ORDER  BY datahora'
    , pg_typeof(_tbl_type))
   USING  _id;
END
$func$;

Call (important!):

SELECT * FROM data_of(NULL::pcdmet, 17);

Replace pcdmet in the call with any other table name.

How does this work?

anyelement is a pseudo data type, a polymorphic type, a placeholder for any non-array data type. All occurrences of anyelement in the function evaluate to the same type provided at run time. By supplying a value of a defined type as argument to the function, we implicitly define the return type.

PostgreSQL automatically defines a row type (a composite data type) for every table created, so there is a well defined type for every table. This includes temporary tables, which is convenient for ad-hoc use.

Any type can be NULL. Hand in a NULL value, cast to the table type: NULL::pcdmet.

Now the function returns a well-defined row type and we can use SELECT * FROM data_of() to decompose the row and get individual columns.

pg_typeof(_tbl_type) returns the name of the table as object identifier type regtype. When automatically converted to text, identifiers are automatically double-quoted and schema-qualified if needed, defending against SQL injection automatically. This can even deal with schema-qualified table-names where quote_ident() would fail. See:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    Thank you very much! All this help is priceless! And sure it solve the problem! – waldyr.ar Aug 02 '12 at 00:13
  • 1
    Great answer, but can't help feeling the first two solutions were just foreplay to the final very cool one ;-) – beldaz Feb 18 '15 at 02:50
  • Can the final solution be wrapped in a function that takes the textual name of the relation as an argument? I've encountered this problem a few times where I have web applications with functions like `getData(String relationName, Int rowId)` and I'd like to implement by calling something like the above, but there's the danger of SQL Injection on `relationName`. – beldaz Feb 18 '15 at 03:29
  • @beldaz: Please start a new question for this question. Comments are not the place. You can always add a link to this answer for context. – Erwin Brandstetter Feb 18 '15 at 03:43
  • Is there a way to get the name of the table you pass to _tbl_type as a string? –  May 29 '18 at 17:56
  • 1
    @BrianPreslopsky: `pg_typeof(_tbl_type)::text` – Erwin Brandstetter May 29 '18 at 21:46
  • @Erwin Brandstetter: Thank you. –  May 30 '18 at 15:39
  • `pg_typeof` seems does not work for TEMP TABLE. Is this expected behavior? – Eugen Konkov Jun 27 '20 at 18:32
  • @EugenKonkov: `pg_typeof()` works for temporary tables just as well. (Temp tables are listed in system catalog tables just like regular table.) See: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=d5a557c6e0f06db1686dc4ff1cf85f33. Maybe you messed up your [`search_path`](https://stackoverflow.com/a/9067777/939860)? Or the syntax of the function call? – Erwin Brandstetter Jun 27 '20 at 23:55
  • @ErwinBrandstetter: Yes, I did =(. Thank for hint. I miss `null::table_name` part – Eugen Konkov Jun 29 '20 at 17:39
  • I know it's late but I can't seem to figure out how to pass in a string as the _tbl_type. I keep getting a syntax error at the call of data_of(5, NULL::'transactions'), which is at or near 'transactions'. Anyone know how to solve this? the function works great with using transactions instead of a string, but I can't call it with the table name. I posted a question about it here: [link](https://stackoverflow.com/questions/65515168/pl-postgresql-how-to-convert-a-variable-into-a-table-name?noredirect=1#comment115830557_65515168) – Jabinator1 Dec 31 '20 at 17:33
  • @Jabi `data_of(5, NULL::transactions)` - no single quotes. See: https://stackoverflow.com/a/20880247/939860 – Erwin Brandstetter Dec 31 '20 at 17:46
  • @ErwinBrandstetter is there a way to convert a string into the table name? the data is being passed from a Node server and so it's coming in as a string ('transactions') hence my issue- I've also looked into replacing the ' ' apostrophes but I couldn't find a way to do that outside of a function – Jabinator1 Dec 31 '20 at 17:58
  • 1
    @Jabinator1: You can produce the table name from a string inside the plpgsql function, **but** you can't get the dynamic return type this way, as this is determined by passing *in* the type. IOW: SQL forces you to declare the return type at call time at the latest. – Erwin Brandstetter Jan 02 '21 at 15:30
  • @ErwinBrandstetter Luckily I ended up figuring it out- Because I'm connecting to my server with MassiveJS I was able to do inline SQL in my server, which shortened it to one line of code! haha but thanks for the response! – Jabinator1 Jan 03 '21 at 00:48
  • @ErwinBrandstetter. first time see `NULL::pcdmet`. the manual too much `null`. Can you drop a manual reference link. Thanks. – jian Oct 15 '21 at 05:38
  • How would one adapt the final answer to say pick a table dynamically? I.e. I have the table I need to reference 'aqe_idarea' in 2 different columns, how would I put it into the first argument of the function? – Timothyjames67 Jul 12 '22 at 11:37
  • @Timothyjames67: Ask a new *question* with the necessary detail to make your case clear. You can always link to this one for context ... – Erwin Brandstetter Jul 13 '22 at 01:27
  • @ErwinBrandstetter Sounds good thank you! I will write that up :) – Timothyjames67 Jul 14 '22 at 11:32
4

You'll probably want to return a cursor. Try something like this (I haven't tried it):

CREATE OR REPLACE FUNCTION data_of(integer)
  RETURNS refcursor AS
$BODY$
DECLARE
      --Declaring variables
      ref refcursor;
BEGIN
      -- make sure `sensors`, `type`, $1 variable has valid value
      OPEN ref FOR 'SELECT Datahora,' || sensors ||
      ' FROM ' || type ||
      ' WHERE nomepcd=' || $1 ||' ORDER BY Datahora;';
      RETURN ref;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION data_of(integer) OWNER TO postgres;
bpgergo
  • 15,669
  • 5
  • 44
  • 68
  • It doesn't work. It returns a column named refcursor, and a line with ""! I'm not a PL/PgSQL's guru, but I'm not silly. It was my first attempt. BTW thanks for your try! :) – waldyr.ar Jul 31 '12 at 13:10
  • did you FETCH from the returned cursor? like `FETCH ref INTO target;` Cunsult the cursor docs for details on how to use a cursor. http://www.postgresql.org/docs/8.1/static/plpgsql-cursors.html – bpgergo Jul 31 '12 at 13:14
  • Amazing, very elegant and powerful approach. Works well with PgAdmin but NOT with PSQLODBC driver and ADODB. In PgAdmin we have to add explicit fetch all in to get it working. – Anatoly Alekseev May 31 '18 at 00:38
  • How would one use it? `FETCH data_of(1)`? I keep getting a syntax error there – Vinicius Dantas Jun 25 '19 at 16:30
1

I'm sorry to say but your question is very unclear. However below you'll find a self contained example how to create and use a function that returns a cursor variable. Hope it helps !

begin;

create table test (id serial, data1 text, data2 text);

insert into test(data1, data2) values('one', 'un');
insert into test(data1, data2) values('two', 'deux');
insert into test(data1, data2) values('three', 'trois');

create function generate_query(query_name refcursor, columns text[])
returns refcursor 
as $$
begin
  open query_name for execute 
    'select id, ' || array_to_string(columns, ',') || ' from test order by id';
  return query_name;
end;
$$ language plpgsql;

select generate_query('english', array['data1']);
fetch all in english;

select generate_query('french', array['data2']);
fetch all in french;
move absolute 0 from french; -- do it again !
fetch all in french;

select generate_query('all_langs', array['data1','data2']);
fetch all in all_langs;

-- this will raise in runtime as there is no data3 column in the test table
select generate_query('broken', array['data3']);

rollback;
user272735
  • 10,473
  • 9
  • 65
  • 96
  • 1
    instead of saying that is an unclear question, tell me what is missing, or what you want to know. I'm sorry for not being clear, although I tried so hard to make a good question. And I can improve it with your consideration. – waldyr.ar Jul 31 '12 at 19:13
  • @waldyr.ar: you updated version is now much better. It was simply unclear what you wanted to achieve and some details were missing (what are those _sensors_ and _type_ and how do they get their values). It's usually helps a lot if a question includes the simplest possible working code (i.e. that can be executed by the others) that illustrates the problem. – user272735 Aug 01 '12 at 05:48
0
# copy paste me into bash shell directly
clear; IFS='' read -r -d '' sql_code << 'EOF_SQL_CODE'
CREATE OR REPLACE FUNCTION func_get_all_users_roles()
  -- define the return type of the result set as table
  -- those datatypes must match the ones in the src
  RETURNS TABLE (
                 id           bigint
               , email        varchar(200)
               , password     varchar(200)
               , roles        varchar(100)) AS
$func$
BEGIN
   RETURN QUERY 
   -- start the select clause
   SELECT users.id, users.email, users.password, roles.name as roles
   FROM user_roles
   LEFT JOIN roles ON (roles.guid = user_roles.roles_guid)
   LEFT JOIN users ON (users.guid = user_roles.users_guid)
   -- stop the select clause
;
END
$func$  LANGUAGE plpgsql;
EOF_SQL_CODE
# create the function
psql -d db_name -c "$sql_code"; 

# call the function 
psql -d db_name -c "select * from func_get_all_users_roles() "
Yordan Georgiev
  • 5,114
  • 1
  • 56
  • 53