2

I am trying write function which open cursor with dynamic column name in it. And I am concerned about obvious SQL injection possibility here. I was happy to see in the fine manual that this can be easily done, but when I try it in my example, it goes wrong with

error: column does not exist.

My current attempt can be condensed into this SQL Fiddle. Below, I present formatted code for this fiddle.

The goal of tst() function is to be able to count distinct occurances of values in any given column of constant query.

I am asking for hint what am I doing wrong, or maybe some alternative way to achieve the same goal in a safe way.

    CREATE TABLE t1 (
        f1 character varying not null,
        f2 character varying not null
    );
    CREATE TABLE t2 (
        f1 character varying not null,
        f2 character varying not null
    );
    INSERT INTO t1 (f1,f2) VALUES ('a1','b1'), ('a2','b2');
    INSERT INTO t2 (f1,f2) VALUES ('a1','c1'), ('a2','c2');

    CREATE OR REPLACE FUNCTION tst(p_field character varying)
        RETURNS INTEGER AS
    $BODY$ 
    DECLARE 
        v_r record; 
        v_cur refcursor; 
        v_sql character varying := 'SELECT count(DISTINCT(%I)) as qty 
                                    FROM t1 LEFT JOIN t2 ON (t1.f1=t2.f1)'; 
    BEGIN  
        OPEN v_cur FOR EXECUTE format(v_sql,lower(p_field)); 
        FETCH v_cur INTO v_r; 
        CLOSE v_cur; 
        return v_r.qty; 
    END; 
    $BODY$ 
    LANGUAGE plpgsql;

Test execution:

SELECT tst('t1.f1')

Provides error message:

ERROR: column "t1.f1" does not exist
Hint: PL/pgSQL function tst(character varying) line 1 at OPEN
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
T.Z.
  • 2,092
  • 1
  • 24
  • 39

1 Answers1

2

This would work:

SELECT tst('f1');

The problem you are facing: format() interprets parameters concatenated with %I as one identifier. You are trying to pass a table-qualified column name that consists of two identifiers, which is interpreted as "t1.f1" (one name, double-quoted to preserve the otherwise illegal dot in the name.

If you want to pass table and column name, use two parameters:

CREATE OR REPLACE FUNCTION tst2(_col text, _tbl text = NULL)
  RETURNS int AS
$func$
DECLARE
   v_r record;
   v_cur refcursor;
   v_sql text := 'SELECT count(DISTINCT %s) AS qty
                  FROM t1 LEFT JOIN t2 USING (f1)';
BEGIN
   OPEN v_cur FOR EXECUTE
      format(v_sql, CASE WHEN _tbl <> ''  -- rule out NULL and ''
                         THEN quote_ident(lower(_tbl)) || '.' ||
                              quote_ident(lower(_col))
                         ELSE quote_ident(lower(_col)) END);
   FETCH v_cur INTO v_r;
   CLOSE v_cur;
   RETURN v_r.qty;
END
$func$ LANGUAGE plpgsql;

Aside: It's DISTINCT f1- no parentheses around the column name, unless you want to make it a row type.

Actually, you don't need a cursor for this at all. Faster, simpler:

CREATE OR REPLACE FUNCTION tst3(_col text, _tbl text = NULL, OUT ct bigint) AS
$func$
BEGIN
   EXECUTE format('SELECT count(DISTINCT %s) AS qty
                   FROM t1 LEFT JOIN t2 USING (f1)'
                 , CASE WHEN _tbl <> ''  -- rule out NULL and ''
                        THEN quote_ident(lower(_tbl)) || '.' ||
                             quote_ident(lower(_col))
                        ELSE quote_ident(lower(_col)) END)
   INTO ct;
   RETURN;
END
$func$ LANGUAGE plpgsql;

I provided NULL as parameter default for convenience. This way you can call the function with just a column name or with column and table name. But not without column name.

Call:

SELECT tst3('f1', 't1');
SELECT tst3('f1');
SELECT tst3(_col := 'f1');

Same as for test2().

SQL Fiddle.

Related answer:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    Ha! You have nailed this. Thank You. I guess it is obviuos, if You focus on semantic meaning of _SQL identifier_... I did not. Also, You should look out when using `USING`, Your functions will fail when You try to invoke them like this: `tst2('f2')` . :) – T.Z. May 13 '15 at 07:35
  • Yes. May be better to stay explicit with `ON t1.f1 = t2.f1` (parentheses would be noise, though). – Erwin Brandstetter May 13 '15 at 08:45