2

I have PL/PgSQL function that performs quite similar queries (the only thing that they have different is column names) and converts the output into JSON object.

CREATE OR REPLACE FUNCTION get_observations(kind varchar, site_id integer, var varchar) RETURNS TABLE (fc json) AS
$func$
BEGIN
    IF kind = 'raw' THEN
        IF var = 'o2_abs' THEN
            RETURN QUERY SELECT array_to_json(array_agg(row_to_json(obs))) FROM (
                SELECT observation_date AS date, o2_abs AS value FROM oxygen WHERE new_id = site_id) AS obs;
        ELSIF var = 'o2_rel' THEN
            RETURN QUERY SELECT array_to_json(array_agg(row_to_json(obs))) FROM (
                SELECT observation_date AS date, o2_rel AS value FROM oxygen WHERE new_id = site_id) AS obs;
        ELSIF var = 'temp' THEN
            RETURN QUERY SELECT array_to_json(array_agg(row_to_json(obs))) FROM (
                SELECT observation_date AS date, t AS value FROM oxygen WHERE new_id = site_id) AS obs;
        END IF;
    ELSIF kind = 'averaged' THEN
        IF var = 'o2_abs' THEN
            RETURN QUERY SELECT array_to_json(array_agg(row_to_json(obs))) FROM (
                SELECT month AS month, o2_abs AS value FROM oxygen_month_average WHERE new_id = site_id) AS obs;
        ELSIF var = 'o2_rel' THEN
            RETURN QUERY SELECT array_to_json(array_agg(row_to_json(obs))) FROM (
                SELECT month AS month, o2_rel AS value FROM oxygen_month_average WHERE new_id = site_id) AS obs;
        ELSIF var = 'temp' THEN
            RETURN QUERY SELECT array_to_json(array_agg(row_to_json(obs))) FROM (
                SELECT month AS month, t AS value FROM oxygen_month_average WHERE new_id = site_id) AS obs;
        END IF;
    END IF;
END;
$func$ LANGUAGE plpgsql;

The body of function contains a lots of repeated code. I see several methods to improve it, but don't know if PL/PgSQL allows these tricks:

  1. Store obs query result in the intermediate variable and convert it into JSON with array_to_json(array_agg(row_to_json(obs))) in the end of function.
  2. Cast var varchar into column name definition in query in order to avoid the most of IF/ELSE statements;

PostgreSQL server version is 9.3.6.

Table schemas:

oxygen=# \d+ oxygen
Table "public.oxygen"
         Column          |  Type   |                      Modifiers       (...)
-------------------------+---------+----------------------------------------------------
 old_id                  | text    | not null
 observation_date_string | text    | not null
 t                       | real    | not null
 o2_abs                  | real    | not null
 o2_sat                  | real    |
 o2_rel                  | real    |
 observation_date        | date    |
 new_id                  | integer |
 id                      | bigint  | not null default nextval('oxygen_id_seq'::regclass)
Indexes:
    "oxygen_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "oxygen_new_id_fkey" FOREIGN KEY (new_id) REFERENCES unique_geoms(new_id)
    "oxygen_old_id_fkey" FOREIGN KEY (old_id) REFERENCES location(old_id)


oxygen=# \d+ oxygen_month_average
Table "public.oxygen_month_average"
 Column |       Type       |                             Modifiers        (...)
--------+------------------+-------------------------------------------------------------------
 new_id | integer          |
 month  | integer          |
 t      | double precision |
 o2_abs | double precision |
 o2_rel | double precision |
 id     | bigint           | not null default nextval('oxygen_month_average_id_seq'::regclass)
Indexes:
    "oxygen_month_average_pkey" PRIMARY KEY, btree (id)
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Vitaly Isaev
  • 5,392
  • 6
  • 45
  • 64
  • 1
    You can use `json_agg()` instead of `array_to_json(array_agg())` – Ihor Romanchenko May 15 '15 at 09:48
  • As *always*, your version of Postgres, please. A table definition and some sample values wouldn't hurt either. One instance has `t AS value`, the next has `temp AS value`. Is that a typo or intended? – Erwin Brandstetter May 16 '15 at 03:33
  • @ErwinBrandstetter thank you, that really was a typo. I've updated the post. Regarding the Postgres version, I can move to the most recent. – Vitaly Isaev May 16 '15 at 09:59

2 Answers2

2

You can use EXECUTE statement in psql to generate dynamic queries.

But it would not cache query plans so it can make each invocation slightly slower. So your code isn't that bad, regardless of code repetition.

But add error checking for invalid arguments. Otherwise you'd have very hard to find bugs if you make a typo like avergaed somewhere.

Tometzky
  • 22,573
  • 5
  • 59
  • 73
2

There is another, very efficient way to simplify most of it: a CASE statement:

CREATE OR REPLACE FUNCTION get_observations(_kind    text
                                          , _site_id int
                                          , _var     text)
  RETURNS TABLE (fc json)
  LANGUAGE plpgsql AS
$func$
BEGIN
   CASE _kind
   WHEN 'raw' THEN
      RETURN QUERY
      SELECT json_agg(obs) FROM (
         SELECT observation_date AS date
              , CASE _var
                WHEN 'o2_abs' THEN o2_abs
                WHEN 'o2_rel' THEN o2_rel
                WHEN 'temp'   THEN t
                END AS value
         FROM   oxygen
         WHERE  new_id = _site_id
         ) AS obs;

    WHEN 'averaged' THEN
      RETURN QUERY
      SELECT json_agg(obs) FROM (
         SELECT month AS month
              , CASE _var
                WHEN 'o2_abs' THEN o2_abs
                WHEN 'o2_rel' THEN o2_rel
                WHEN 'temp'   THEN t
                END AS value
         FROM   oxygen_month_average
         WHERE  new_id = _site_id
         ) AS obs;
   END CASE;
END
$func$;

fiddle
Old sqlfiddle

Still two distinct queries because those are based on different tables. To fold those, too, you would need dynamic SQL with EXECUTE, but that wouldn't get much shorter and is typically less efficient ...

Use a "simple" (or "switched") CASE. Note the CASE keyword in two distinct contexts: the outer CASE is a PL/pgSQL control structure, the inner CASE is an SQL command. Related:

You can simplify even further with json_agg() like @Igor already suggested:
array_to_json(array_agg(row_to_json(obs))) -> json_agg(obs)

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228