3

Suppose i have a table named t which is stored in a postgresql database. I have 6 columns named a, b, c, d, e, f. Columns a, b and c take values between 0 and 100, but on an aritmetic scale 0 < a < b < c < 100. Columns d, e and f take any value in the range 1 to 10.

I want to compute the weighted average of columns d, e and f, but with a condition related to columns a, b and c. The condition is that the average will be computed only on the a, b and c columns that have values smaller than 50.

I think this need a function so I started to do it:

CREATE OR REPLACE FUNCTION example(t character varying, a character varying, b character varying, c character varying, d character varying, e character varying, f character varying, g character varying) RETURNS double precision AS $$

BEGIN
    ALTER TABLE t ADD COLUMN g double precision;
        UPDATE t
            IF a > 50 THEN
                SET g = d;
            ELSE 
                IF b > 50 THEN;
                    SET g = (d+e)/2;
                END IF c > 50 THEN
                        SET g = (d+e+f)/3;
            END IF;
END;
$$ LANGUAGE plpgsql;

I get the following error:

ERROR:  syntax error at or near "$1"
LINE 1: ALTER TABLE  $1  ADD COLUMN  $2  double precision
                 ^
QUERY:  ALTER TABLE  $1  ADD COLUMN  $2  double precision
CONTEXT:  SQL statement in PL/PgSQL function "example" near line 2

********** Error **********

ERROR: syntax error at or near "$1"
SQL state: 42601
Context: SQL statement in PL/PgSQL function "example" near line 2

Can someone tell me were I am wrong so I can go ahead with computing the needed average?

Mihai Niculita
  • 59
  • 1
  • 1
  • 4
  • You cannot pass the name of a table as a paramter in such a way. But it seems rather useless to add a new column each time you run that function. Why do you want to do that? –  May 06 '12 at 21:12
  • Thanks for the points! About the SET command, indedd is wrong. But can I use IF inside of SET? And how I will use it? 'UPDATE t SET g = D IF a > 50'? – Mihai Niculita May 07 '12 at 04:35

2 Answers2

3

The immediate cause of the error are naming conflicts. You define parameters t and g and use the same names in your ALTER TABLE statement. I make it a habit to prefix parameter names (like _t, _g) so they won't conflict with other names in the function body.

Also, your parameters should not be defined character varying since they hold numerical values. Use an appropriate numerical type, probably double precision.

But I don't think you need a function at all. This can be solved with plain SQL statements:

ALTER TABLE tbl ADD COLUMN g double precision;

UPDATE tbl
SET g = CASE
           WHEN a > 50 THEN d 
           WHEN b > 50 THEN (d+e)/2
           WHEN c > 50 THEN (d+e+f)/3
           ELSE 0  -- definition for ELSE case is missing
        END;

You could also scratch the whole idea completely and use a view for the purpose, as g only holds functionally dependent data:

CREATE VIEW tbl_with_g AS
SELECT *
     , CASE
          WHEN a > 50 THEN d 
          WHEN b > 50 THEN (d+e)/2
          WHEN c > 50 THEN (d+e+f)/3
          ELSE 0
       END AS g
FROM   tbl;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks for the points! But the parameters are not used in the function? After I define them I should add variables to be used in the function? – Mihai Niculita May 07 '12 at 04:41
  • About your idea with the CASE function, I have already applied before creating the function. It compute something indeed, but when I did a verification and it seems that in CASE function you cannot use different columns, because the evaluation stops if the first WHEN condition is true. So while the first condition is true the value is set to d, but the second condition is not evaluated, but this is what i need, to do the average when the condition is false. – Mihai Niculita May 07 '12 at 04:42
  • @MihaiNiculita Then you should adjust your conditions to be mutually exclusive. If you want the first `THEN` clause skipped when `b > 50`, then use this as your first `WHEN`: `a > 50 AND b <= 50` (which is equivalent to `a > 50 AND NOT b > 50`). – jpmc26 Jul 08 '13 at 02:36
3

I totally agree with everything in Erwin's answer, but want to point out one other option. You can create a sort of "generated column" which will be calculated on demand like this:

CREATE FUNCTION g(rec t)
  RETURNS double precision
  IMMUTABLE
  LANGUAGE SQL
AS $$
  SELECT CASE
           WHEN $1.a > 50 THEN $1.d 
           WHEN $1.b > 50 THEN ($1.d+$1.e)/2
           WHEN $1.c > 50 THEN ($1.d+$1.e+$1.f)/3
           ELSE 0
         END;
$$;

You can then reference g just as you would a column, as long as the reference is qualified by the table name or alias. For example:

SELECT *, t.g FROM t;

When an apparent qualified column reference doesn't resolve to a column, the PostgreSQL planner looks for a function by that name which takes the table's record type as its only parameter, and executes that function. Sometimes this approach is more convenient than using a view, although the effect is basically the same.

kgrittn
  • 18,113
  • 3
  • 39
  • 47
  • +1 A cool feature of PostgreSQL! A related question came up and I added a detailed explanation as to why this works: http://stackoverflow.com/questions/11165450/store-common-query-as-column/11166268#11166268 – Erwin Brandstetter Jun 23 '12 at 16:05