72

I have a rather complicated query on my PostgreSQL database spanning 4 tables via a series of nested subqueries. Despite its slightly tricky appearance and setup, it ultimately returns two columns (from the same table, if that helps) based on matching two external parameters (two strings that need to match fields in different tables). I'm fairly new to database design in PostgreSQL. I know that this seemingly magical thing called a View exists, and it seems like it could help me here, but perhaps not.

Is there some way I can move my complex query inside a view and somehow just pass it the two values I need to match? That would greatly simplify my code on the front-end (by shifting the complexities to the database structure). I can create a view that wraps my static example query, and that works just fine, however that only works for one pair of string values. I need to be able to use it with a variety of different values.

Thus my question is: is it possible to pass parameters into an otherwise static View and have it become "dynamic"? Or perhaps a View is not the right way to approach it. If there's something else that would work better, I'm all ears!

*Edit:* As requested in comments, here's my query as it stands now:

SELECT   param_label, param_graphics_label
  FROM   parameters
 WHERE   param_id IN 
         (SELECT param_id 
            FROM parameter_links
           WHERE region_id = 
                 (SELECT region_id
                    FROM regions
                   WHERE region_label = '%PARAMETER 1%' AND model_id =
                         (SELECT model_id FROM models WHERE model_label = '%PARAMETER 2%')
                 )
         ) AND active = 'TRUE'
ORDER BY param_graphics_label;

Parameters are set off by percent symbols above.

Erwin Brandstetter
Devin

5 Answers

86

You could use a set returning function:

create or replace function label_params(parm1 text, parm2 text)
  returns table (param_label text, param_graphics_label text)
as
$body$
  select ...
  WHERE region_label = $1 
     AND model_id = (SELECT model_id FROM models WHERE model_label = $2)
  ....
$body$
language sql;

Then you can do:

select *
from label_params('foo', 'bar')
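
For reference, here is a sketch of how the complete function might look with the query from the question filled in. The `text` column types and the comparison `active = 'TRUE'` are carried over from the question and the declaration above, so they are assumptions about the real schema. Both subquery lookups use `IN` so that a duplicate label does not raise an error, and columns are table-qualified to keep them apart from the output column names.

```sql
CREATE OR REPLACE FUNCTION label_params(parm1 text, parm2 text)
  RETURNS TABLE (param_label text, param_graphics_label text)
AS
$body$
  SELECT p.param_label, p.param_graphics_label
  FROM   parameters p
  WHERE  p.param_id IN (
           SELECT pl.param_id
           FROM   parameter_links pl
           WHERE  pl.region_id IN (
                    SELECT r.region_id
                    FROM   regions r
                    WHERE  r.region_label = $1          -- first argument: region label
                    AND    r.model_id IN (
                             SELECT m.model_id
                             FROM   models m
                             WHERE  m.model_label = $2  -- second argument: model label
                           )
                  )
         )
  AND    p.active = 'TRUE'   -- as in the question; assumes this comparison is valid for the column type
  ORDER  BY p.param_graphics_label;
$body$
LANGUAGE sql;
```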

Btw: are you sure you want:

AND model_id = (SELECT model_id FROM models WHERE model_label = $2)

If model_label is not unique (or not the primary key), this will eventually throw an error: as soon as two models share a label, the scalar subquery returns more than one row. You probably want:

AND model_id IN (SELECT model_id FROM models WHERE model_label = $2)
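
The difference can be reproduced in isolation. The sketch below uses a hypothetical temporary table `models_demo` (not part of the original schema) in which two models share the same label:

```sql
-- Hypothetical demo table: two models share the label 'bar'.
CREATE TEMP TABLE models_demo (model_id int PRIMARY KEY, model_label text);
INSERT INTO models_demo VALUES (1, 'bar'), (2, 'bar');

-- Scalar subquery with "=": fails, because the subquery yields two rows.
-- ERROR:  more than one row returned by a subquery used as an expression
SELECT model_id
FROM   models_demo
WHERE  model_id = (SELECT model_id FROM models_demo WHERE model_label = 'bar');

-- IN accepts any number of rows from the subquery; both models are returned.
SELECT model_id
FROM   models_demo
WHERE  model_id IN (SELECT model_id FROM models_demo WHERE model_label = 'bar');
```

If the label is meant to be unique, a UNIQUE constraint on `models.model_label` would let the database enforce that.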
  • 1
    You're correct in guessing that `model_label` is not the primary key. It "should" be unique but that isn't necessarily software-enforced. (`model_id` is the primary key of `models`). From what I recall, switching to IN shouldn't harm the fact that I only want to match to one entry, correct? – Devin Jul 09 '12 at 19:49
  • 2
    @Devin: correct, if you use IN it will work even if more than one row is returned. If you keep =, you'll receive an error at runtime in that case. – Jul 09 '12 at 19:51
  • 1
    Awesome, that'll work exactly as I need it to. And learned something cool about PostgreSQL functions. Thanks for your help! – Devin Jul 09 '12 at 20:10
35

In addition to what @a_horse already cleared up, you could simplify your SQL statement with JOIN syntax instead of nested subqueries. Performance will be similar, but the syntax is much shorter and easier to manage.

CREATE OR REPLACE FUNCTION param_labels(_region_label text, _model_label text)
  RETURNS TABLE (param_label text, param_graphics_label text)
  LANGUAGE sql AS
$func$
SELECT p.param_label, p.param_graphics_label
FROM   parameters      p 
JOIN   parameter_links l USING (param_id)
JOIN   regions         r USING (region_id)
JOIN   models          m USING (model_id)
WHERE  p.active
AND    r.region_label = $1 
AND    m.model_label = $2
ORDER  BY p.param_graphics_label;
$func$;
  • If model_label is not unique or something else in the query produces duplicate rows, you may want to make that SELECT DISTINCT p.param_graphics_label, p.param_label - with a matching ORDER BY clause for best performance. Or use GROUP BY.

  • Since Postgres 9.2 you can use the declared parameter names in place of $1 and $2 in SQL functions. (Has been possible for PL/pgSQL functions for a long time).

  • To avoid naming conflicts, I prefix parameter names with _ (those are visible most everywhere inside the function) and table-qualify column names in queries.

  • I simplified WHERE p.active = 'TRUE' to WHERE p.active, assuming the column active is type boolean.

  • USING in the JOIN condition only works if the column names are unambiguous across all tables to the left. Else use the more explicit syntax: ON l.param_id = p.param_id
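
As a sketch of how the first two notes combine (assuming Postgres 9.2 or later, and the same assumed column types as above), the function can reference its parameters by name and remove duplicates with DISTINCT. The select list keeps the column order declared in RETURNS TABLE, since swapping two `text` columns would silently swap the output columns:

```sql
CREATE OR REPLACE FUNCTION param_labels(_region_label text, _model_label text)
  RETURNS TABLE (param_label text, param_graphics_label text)
  LANGUAGE sql AS
$func$
SELECT DISTINCT p.param_label, p.param_graphics_label
FROM   parameters      p
JOIN   parameter_links l USING (param_id)
JOIN   regions         r USING (region_id)
JOIN   models          m USING (model_id)
WHERE  p.active                          -- assumes "active" is boolean
AND    r.region_label = _region_label    -- named parameter instead of $1
AND    m.model_label  = _model_label     -- named parameter instead of $2
ORDER  BY p.param_graphics_label, p.param_label;
$func$;
```

Called the same way as before:

```sql
SELECT * FROM param_labels('foo', 'bar');
```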

Erwin Brandstetter
  • 1
    Oh wow, that is much nicer! I know joins are out there and would have been handy for this problem, but I guess I just didn't work up the gumption to try and figure out how to use it here. As it turns out, it's not so bad! – Devin Jul 09 '12 at 20:20
30

In most cases the set-returning function is the way to go, but in the event that you want to both read and write to the set, a view may be more appropriate. And it is possible for a view to read a session parameter:

CREATE VIEW widget_sb AS
  SELECT * FROM widget
  -- some_column is a placeholder; "column" itself is a reserved word
  WHERE some_column = cast(current_setting('mydomain.myparam') as int);

SET mydomain.myparam = 0;
select * from widget_sb;
[results]

SET mydomain.myparam = 1;
select * from widget_sb;
[different results]
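
Applied to the query from the question, the same idea might look like the sketch below. The view name `param_labels_v` and the setting names `myapp.region_label` and `myapp.model_label` are made up for illustration, and the JOIN form assumes the column names are as shown in the question and that `active` is boolean. The two-argument form of `current_setting` (available since Postgres 9.6) returns NULL instead of raising an error when the setting has not been defined, so the view then simply returns no rows.

```sql
CREATE VIEW param_labels_v AS
SELECT p.param_label, p.param_graphics_label
FROM   parameters      p
JOIN   parameter_links l USING (param_id)
JOIN   regions         r USING (region_id)
JOIN   models          m USING (model_id)
WHERE  p.active
-- second argument "true" = missing_ok: NULL instead of an error if unset
AND    r.region_label = current_setting('myapp.region_label', true)
AND    m.model_label  = current_setting('myapp.model_label', true);

-- Usage: set both values for the session, then query the view.
SET myapp.region_label = 'foo';
SET myapp.model_label  = 'bar';

SELECT *
FROM   param_labels_v
ORDER  BY param_graphics_label;
```

Note that the values live in the session, not in the query, so every caller has to set them before selecting from the view.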
Martin Tournoij
Drew
3

I don't think a "dynamic" view as you stated is possible.

Why not write a stored procedure that takes 2 arguments instead?

Jin Kim
2

I would rephrase the query as the following:

SELECT   p.param_label, p.param_graphics_label
  FROM   parameters p
where exists (
    select 1
    from parameter_links pl
    where pl.param_id = p.param_id
    and exists (
        select 1
        from regions r
        join models m on m.model_id = r.model_id
        where r.region_id = pl.region_id
        and r.region_label = '%PARAMETER 1%'
        and m.model_label = '%PARAMETER 2%'
    )
) and p.active = 'TRUE'
order by p.param_graphics_label;

Assuming that you have indexes on the various id columns, this query can be faster than using the IN operator: the EXISTS subqueries can largely be answered from the indexes, without touching the tables themselves except to fetch the final rows from the parameters table. Recent PostgreSQL versions often plan IN and EXISTS subqueries the same way, so it is worth comparing both forms with EXPLAIN ANALYZE on your own data.
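
One way to check the performance claim on real data is to look at the actual plans. The sketch below runs the IN form from the question through EXPLAIN, with 'foo' and 'bar' as example values; running the EXISTS form the same way gives a direct comparison of the plan nodes, timings, and buffer usage. Keep in mind that ANALYZE really executes the statement.

```sql
-- Plan and timing for the IN form; repeat with the EXISTS form and compare.
EXPLAIN (ANALYZE, BUFFERS)
SELECT param_label, param_graphics_label
FROM   parameters
WHERE  param_id IN (
         SELECT param_id
         FROM   parameter_links
         WHERE  region_id IN (
                  SELECT region_id
                  FROM   regions
                  WHERE  region_label = 'foo'
                  AND    model_id IN (SELECT model_id FROM models WHERE model_label = 'bar')
                )
       )
AND    active = 'TRUE'
ORDER  BY param_graphics_label;
```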

Paolo
Monty