1

Lets say I have a table named static in two different databases houses and apartments.

static table contains the static information of houses such as house_size, no_rooms, pool, spa.

static table in houses database has columns like this:

pool    spa house_size  sauna   no_rooms
   1    1         25    1       2
   1    0         35    1       3

static table in apartments database has columns like this:

pool    spa house_size  sauna   
   1    1         25    1       
   1    0         35    1       

I want to run below query without raising any error. Currently, I get error as no_rooms column does not exists in apartments.public.static.

select pool, case when spa = 1 then 1 else 0 end as has_spa,
      sauna, house_size, case when no_rooms > 2 then 1 else 0 end as rooms 
from static;

Solution I tried:

WITH static_new AS (SELECT s.*     
FROM (SELECT 0 AS no_rooms) AS dummy 
LEFT JOIN LATERAL
( SELECT
      pool, spa, sauna, house_size, no_rooms
  FROM static
)  AS s on true)
SELECT * FROM static_new;

It works, but this query becomes messy when there are more columns involved.

What I am looking for:

  1. Create a function that takes column names and table name and then does the join which I am doing in above query and returns a table. (Should be generic and work for given column names and table name in parameter and return a table.)

  2. Any other nice and tidy solution?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Awaish Kumar
  • 537
  • 6
  • 22
  • Just fix the data by adding `no_rooms` to the table that doesn't have it. Or create a view in both databases with the same name where the value added. – Gordon Linoff Jul 01 '20 at 14:02

2 Answers2

2

SQL is a strictly typed language and Postgres functions must declare their return type. Returning a variable number of columns from a function is only possible with workarounds, like polymorphic types. See:

But we can't work with the row type in your case, as that varies from database to database. The remaining option: return anonymous records and provide a column definition list with every call. I don't usually recommend this, as providing a column definition list with every call can be tedious - and often pointless. But yours might be one of the rare use cases where it makes sense.

Still, you have to know the data type of possibly missing columns. I'll assume integer for the purpose of this demo. Else you have to pass data types additionally and build the query accordingly.

CREATE OR REPLACE FUNCTION f_dynamic_select(_tbl regclass
                                          , _cols VARIADIC text[])  -- ①
  RETURNS SETOF record     -- ② anonymous records
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE    -- ③ dynamic SQL
   format(
      'SELECT %s FROM %s'  -- ④ safe against SQLi
    , (
      SELECT string_agg(COALESCE(quote_ident(a.attname)
                              , '0 AS ' || quote_ident(t.col)  -- assuming integer!
                                ), ', ' ORDER  BY t.ord) -- ⑤
      FROM   unnest(_cols) WITH ORDINALITY t(col, ord)   -- ⑤
      LEFT   JOIN pg_attribute a ON a.attrelid = _tbl    -- ⑥
                                AND a.attnum > 0
                                AND NOT a.attisdropped 
                                AND a.attname = t.col
      )
    , _tbl
   );
END
$func$;

Call (important!)

SELECT *
FROM   f_dynamic_select('static', 'pool', 'spa', 'sauna', 'house_size', 'no_rooms')
AS t(pool int, spa int, house_size int, sauna int, no_rooms int); -- ② column definition list

Your example call, with expressions based on these columns:

SELECT pool, case when spa = 1 then 1 else 0 end as has_spa  -- ⑦ expressions
     , sauna, house_size
     , case when no_rooms > 2 then 1 else 0 end as rooms 
FROM f_dynamic_select('static', 'pool', 'spa', 'sauna', 'house_size', 'no_rooms')
AS t(pool int, spa int, house_size int, sauna int, no_rooms int);

db<>fiddle here

① The function takes a table name as regclass type. See:

... followed by an arbitrary list of column names - in meaningful order. VARIADIC should be convenient for this. See:

Note that we pass column names as case-sensitive, single-quoted strings. Not (double-quoted) identifiers.

② This may be the first time ever I recommend returning anonymous records from a function - after close to 1000 answers on the [plpgsql] tag. The manual:

If the function has been defined as returning the record data type, then an alias or the key word AS must be present, followed by a column definition list in the form ( column_name data_type [, ... ]). The column definition list must match the actual number and types of columns returned by the function.

The manual on dynamic SQL.

④ Safe against SQL injection, because the table name is passed as regclass, and the SELECT list is concatenated using quote_ident() carefully. See:

⑤ Use WITH ORDINALITY to preserve original order of columns. See:

LEFT JOIN to the system catalog pg_attribute to identify existing columns. See:

⑦ Move expressions building on the passed columns to the outer SELECT.


Disclaimer: I would only introduce this level of sophistication if I had to. Maybe you can work with simple views in each database after all?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
0

Don't muck with the queries. Add a view to both databases with all the columns. In the first database:

create view v_static as
    select pool, spa house_size, sauna, no_rooms
    from status;

In the second:

create view v_static as
    select pool, spa house_size, sauna, null as no_rooms
    from status;

Then use the view instead of the base table.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • do you know how can I write a function that accepts column names as parameters and return a table? – Awaish Kumar Jul 01 '20 at 14:18
  • @AwaishKumar . . . That doesn't seem like a reasonable solution to the problem you have raised in this question. You can ask a *new* question if you want to do something like that. – Gordon Linoff Jul 01 '20 at 14:29
  • I have added in the last 2 things, I want to get from the question? 1) is that I am looking for a function to do the stuff, I am doing? – Awaish Kumar Jul 01 '20 at 18:48