
Problem definition:

  • I have a Postgres table A with an arbitrary number of columns of different types, a single serial PK, and millions of rows. Example:
id | val1 | val2  | val3
---+------+-------+-----
1  | 45.2 | 52.6  | 222
2  | 5.32 | 12.34 | 193
  • I have another table B, mapping table A's column names to unique free-text names.
value_name | col_name
-----------+---------
 "Value 1" | "val1"
 "Value 2" | "val2"
 "Value 3" | "val3"
  • I want to run SELECT queries on table A that retrieve data by the free-text names (instead of the column names). A single query should return the data.

Something like this would be great:

SELECT id
     , alias('Value 2', 'A')
     , alias('Value 3', 'A')
FROM A;

or

SELECT *
FROM alias('Value 2', 'Value 3', 'A');

would return

id | val2  | val3
---+-------+-----
1  | 52.6  | 222
2  | 12.34 | 193

(I don't mind if id is always returned without explicitly asking for it, but I do need the ability to query an arbitrary number of columns and get back their original column names.)

Solution flexibility

  • The solution can include Postgres functions, rules, extensions, triggers: anything that does not change the basic structure of table A.
  • The solution can suggest any structure for table B.
  • Postgres version 11

Thanks!!

Meir Tseitlin
    I tried the dynamic SQL options from [here](https://stackoverflow.com/questions/11740256/refactor-a-pl-pgsql-function-to-return-the-output-of-various-select-queries/11751557#11751557), but didn't find a helpful solution. – Meir Tseitlin Sep 17 '19 at 20:42

2 Answers


You can create a view like this:

CREATE VIEW A_with_col_names AS
SELECT val1 AS "Value 1", val2 AS "Value 2" ...
FROM A;

Then the view can be used like:

SELECT * FROM A_with_col_names WHERE "Value 1" = 'xyz';
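
Since the columns of A are arbitrary, the view's column list can itself be generated from table B rather than written by hand. A minimal sketch, assuming table B has the value_name/col_name columns shown in the question (the view name A_with_col_names is just the one used above), with quote_ident/format('%I') to quote identifiers safely:

DO $$
DECLARE
    col_list text;
BEGIN
    -- build "val1 AS "Value 1", val2 AS "Value 2", ..." from the mapping table
    SELECT string_agg(format('%I AS %I', col_name, value_name), ', ')
    INTO col_list
    FROM B;

    EXECUTE format(
        'CREATE OR REPLACE VIEW A_with_col_names AS SELECT id, %s FROM A',
        col_list
    );
END;
$$;

Note that CREATE OR REPLACE VIEW cannot drop or rename existing view columns, so if the mappings in B change you may need to DROP VIEW and re-run this block (e.g. from a trigger on B).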
Rohit
  • That's a very interesting approach. Yet with a view you cannot really create indexes (unless it's a materialized view, which must be refreshed). With millions of rows, that's a problem... Maybe there is a way to improve it somehow? – Meir Tseitlin Sep 25 '19 at 17:27
    @Miro: As a matter of fact, table A itself will be indexed, so I don't think there is a problem. – Jaisus Sep 26 '19 at 07:14
    @Miro Creating an index on the underlying table is enough for such a simple view. – Bergi Sep 26 '19 at 22:15

I can see a solution, though it is not perfect.

The principle is to build a temp table dynamically and fill it afterwards.

So for the following test setup:

create table test
(
    id int primary key
    , val1 float
    , val2 float
    , val3 float
);

insert into test values
(1, 45.2, 52.6, 222), (2,5.32,12.34,193);

create table test_2
(
    value_name varchar
    , col_name varchar
);

insert into test_2 values 
('Value 1', 'val1')
, ('Value 2', 'val2')
, ('Value 3', 'val3') ;

you can implement the following procedure:

create or replace procedure querier (in_params varchar[])
language plpgsql 
as $$
declare
    query varchar;
    column_names varchar;
begin
    -- you can do pretty much all you want here to build your columns
    -- for example add a column 'type' to your second table
    select
        string_agg(quote_ident(col_name), ', ') into column_names
    from test_2
    where value_name in (select unnest(in_params));

    drop table if exists temp_test cascade;
    query := 'CREATE TEMP TABLE temp_test AS 
    select id, '||column_names||' 
    from test;';
    execute query;
    --raise notice '%', query;
end;
$$;

And finally call the procedure and select the whole temp table:

call querier(cast('{"Value 1", "Value 3"}' as varchar[]));
select * from temp_test;
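
If the pair of statements (CALL, then SELECT) is the sticking point, a function returning a refcursor can serve the dynamic result without materializing a temp table; this is one of the options from the linked dynamic-SQL answer. A sketch against the same test/test_2 tables (the function name querier_cur is made up here):

create or replace function querier_cur(in_params varchar[])
returns refcursor
language plpgsql
as $$
declare
    column_names varchar;
    cur refcursor := 'querier_cur';  -- fixed portal name, assumed free
begin
    select string_agg(quote_ident(col_name), ', ')
    into column_names
    from test_2
    where value_name = any(in_params);

    -- open the cursor over the dynamically built query
    open cur for execute
        'select id, ' || column_names || ' from test';
    return cur;
end;
$$;

The cursor only lives inside a transaction, so usage looks like:

begin;
select querier_cur(cast('{"Value 1", "Value 3"}' as varchar[]));
fetch all in "querier_cur";
commit;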
Jaisus
  • Interesting! Isn't it a huge overhead to build a temp table on every query? Thanks – Meir Tseitlin Sep 25 '19 at 22:07
  • That's another point ;) What volume of I/O are you expecting? How many rows are going to be in your table? And to be honest: what is the basic need for such an architecture? (It seems a little odd to me.) – Jaisus Sep 26 '19 at 07:01
  • I shortened the body of the procedure. – Jaisus Sep 26 '19 at 07:12
  • It is an interesting approach I hadn't investigated, yet it still requires two queries in the end, so I don't see the benefit (aside from the performance drawbacks). It's more efficient to just query the real column names first and then query the data. – Meir Tseitlin Sep 26 '19 at 13:05
  • Indeed. And I do think that the approach of @Rohit is a better fit for your needs in this matter :) – Jaisus Sep 26 '19 at 14:41