3

I am pretty new to PG, and trying to convert from MSSQL.

I am working on a function that will return JSON results. This one works:

Create Or Replace Function ExampleTable_SelectList()
Returns JSON As
$$
  Select array_to_json(array_agg(row_to_json(t))) From
    (Select id, value1, value2, From ExampleTable) t
$$ Language SQL;

Now, I want to call can update that returns a value and turn that value into JSON to return. So, this one gives an error on the set command.

Create Or Replace Function ExampleTable_Update (id bigint, value1 text)
  Returns JSON As
$$
  Select row_to_json(t) From
  (
    Update ExampleTable
    Set Value1 = value1
    Where id= id
    Returning Value1, Value2;
  ) t
$$ Language SQL;

I suspect that Postgres does not allow the UPDATE statement as a subquery. Is there anyway around that?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Arotae
  • 103
  • 1
  • 8
  • post the error you're getting. – JP Moresmau Jun 30 '15 at 15:16
  • 1
    `RETURNING` is a postgresql extension & doesn't work everywhere, where subqueries can be used. Try with [CTE](http://www.postgresql.org/docs/current/static/queries-with.html#QUERIES-WITH-MODIFYING) -- *[This command](http://www.postgresql.org/docs/current/static/sql-update.html) conforms to the SQL standard, except that the FROM and RETURNING clauses are PostgreSQL extensions, as is the ability to use WITH with UPDATE.* – pozs Jun 30 '15 at 15:33
  • You forgot to provide your Postgres version, which is *essential* for the best answer. Can we assume the current version 9.4? A table definition would also clarify a lot (what you get with `\d "ExampleTable" in psql` or what you see in the SQL pane in pgAdmin for the table. – Erwin Brandstetter Jun 30 '15 at 19:40
  • It is a 9.4. It looks like a CTE gives the result that works: – Arotae Jun 30 '15 at 19:54
  • Create Or Replace Function ExampleTable_Update (id bigint, value1 text) Returns JSON As $$ With t As ( Update ExampleTable Set Value1 = $2 Where id = $1 Returning Value1, Value2; ) t Select row_to_json(s) From (select Value1, Value2 from t) s $$ Language SQL; – Arotae Jun 30 '15 at 19:57

2 Answers2

10

I see two major problems:

  1. You cannot put an UPDATE into a subquery at all. You could solve that with a data-modifying CTE like Patrick demonstrates, but that is more expensive and verbose than necessary for the case at hand.

  2. You have a potentially hazardous naming conflict, that hasn't been addressed yet.

Better query / function

Leaving the SQL function wrapper aside for the moment (we'll come back to that). You can use a simple UPDATE with a RETURNING clause:

UPDATE tbl
SET    value1 = 'something_new'
WHERE  id = 123
RETURNING row_to_json(ROW(value1, value2));

The RETURNING clause allows arbitrary expressions involving columns of the updated row. That's shorter and cheaper than a data-modifying CTE.

The remaining problem: the row constructor ROW(...) does not preserve column names (which is a known weakness), so you get generic keys in your JSON value:

row_to_json
{"f1":"something_new","f2":"what ever is in value2"}

In Postgres 9.3 you would need a CTE another function to encapsulate the first step or a cast to a well-defined row type. Details:

In Postgres 9.4 just use json_build_object() or json_object():

UPDATE tbl
SET    value1 = 'something_new'
WHERE  id = 123
RETURNING json_build_object('value1', value1, 'value2', value2);

Or:

...
RETURNING json_object('{value1, value2}', ARRAY[value1, value2]);

Now you get original column names or whatever you chose as key names:

row_to_json
{"value1":"something_new","value2":"what ever is in value2"}

It's easy to wrap this in a function, which brings us to your second problem ...

Naming conflict

In your original function you use identical names for function parameters and column names. This is a generally a very bad idea. You would need to understand intimately which identifier comes first in which scope.

In the case at hand the result is utter nonsense:

Create Or Replace Function ExampleTable_Update (id bigint, value1 text) Returns 
...
    Update ExampleTable
    Set Value1 = value1
    Where id = id
    Returning Value1, Value2;
...
$$ Language SQL;

While you seem to expect that the second instance of id would reference the function parameter, it does not. The column name comes first within the scope of an SQL statement, the second instance references the column. resulting in an expression that is always true except for NULL values in id. Consequently, you would update all rows, which could lead to catastrophic loss of data. What's worse, you might not even realize it until later, because the SQL function will return one arbitrary row as defined by the RETURNING clause of the function (returns one row, not a set of rows).

In this particular case, you would get "lucky", because you also have value1 = value1, which overwrites the column with its preexisting value, effectively doing ... nothing, in a very expensive way (unless triggers do something). You may be puzzled to get an arbitrary row with an unchanged value1 as result.

So, don't.

Avoid potential naming conflicts like this unless you know exactly what you are doing (which obviously isn't the case). One convention I like is to prepend an underscore for parameter and variable names in functions, while column names never start with an underscore. In many cases you can just use positional references to be unambiguous: $1, $2, ..., but that sidesteps only one half of the issue. Any method is good as long as you avoid naming conflicts. I suggest:

CREATE OR REPLACE FUNCTION foo (_id bigint, _value1 text)
  RETURNS json AS
  LANGUAGE sql
$func$
UPDATE tbl
SET    value1 = _value1
WHERE  id     = _id
RETURNING json_build_object('value1', value1, 'value2', value2);
$func$;

Also note that this returns the actual column value in value1 after the UPDATE, which may or may not be the same as your input parameter _value1. There could be database rules or triggers interfering ...

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    Thank you very much for your advice. I had switched to the positional reference during testing, but I would much prefer a named reference. I have put all of your suggestions into place. The function is beautiful and the results are perfect. Returning the json_build_object takes me back to a cleaner syntax with great control over results. – Arotae Jul 01 '15 at 13:11
3

You need to put the UPDATE statement in a CTE:

CREATE OR REPLACE FUNCTION ExampleTable_Update (id bigint, value1 text) RETURNS json AS $$
  WITH t(Value2) AS (
    UPDATE ExampleTable
    SET Value1 = $2
    WHERE id = $1
    RETURNING Value2)
  SELECT row_to_json($2, Value2) 
  FROM t;
$$ LANGUAGE sql;

Note that I am using positional parameters $1 and $2 for the function parameters. The names of these parameters are the same as the column names in the table and that is generally a bad idea because of the potential for name-resolution conflicts; see Erwin Brandstetter's answer for a more elaborate explanation.

Patrick
  • 29,357
  • 6
  • 62
  • 90
  • 1
    `row_to_json(($2, Value2))` – klin Jun 30 '15 at 16:10
  • @Arotae: You don't *need* a CTE and `row_to_json($2, Value2)` is potentially wrong / misleading, because what's in the column after the `UPDATE` is not necessarily the same as $2. Think of triggers ... Also, this code avoids the sneaky naming conflict in the question by using the positional reference `$1`, but it should at least be mentioned ... – Erwin Brandstetter Jun 30 '15 at 20:17
  • @ErwinBrandstetter: The question was about the use of an `UPDATE` statement in a sub-query. If the OP wants a sub-query-style `UPDATE` then the only way is a CTE, which can be seen as a *mere syntactic convenience* for a sub-query in the non-recursive form. In the same vein that `ExampleTable` could have triggers that the OP did not mention (I actually realized that issue as soon as I submitted my edit), the `SELECT` query can be far more complex involving tables and/or columns not used in the `UPDATE`. Explained the use of positional parameters in an edit to the answer. – Patrick Jul 01 '15 at 00:45
  • You are right about the UPDATE in the CTE, no argument about that. But the question really is about how to return a JSON value after UPDATE. Subquery or CTE are just means to an end, I submit we don't need *either* here. Also, seeing a CTE as `mere syntactic convenience for a sub-query in the non-recursive form` would filter that CTEs are more expensive, always materialize temporary results and act as optimization barriers (the last bit being irrelevant here). And there are some other corner case differences. Basically, only use a CTE when you need it (which is common enough). – Erwin Brandstetter Jul 01 '15 at 01:14