2

I have a table t with several columns, let's name them a, b and c. I also have a state column which indicates the current state. There is also an id column.

I want to write the following query: update column a always, but b and c only if the application state is still equal to the database state. Here, the state column is used for optimistic locking.

I wrote this query as following:

UPDATE t
SET    a = $a$,
       b = (CASE WHEN state = $state$ THEN $b$ ELSE b END),
       c = (CASE WHEN state = $state$ THEN $c$ ELSE c END)
WHERE  id = $id$ AND
       (
         a != $a$ OR
         b != (CASE WHEN state = $state$ THEN $b$ ELSE b END) OR
         c != (CASE WHEN state = $state$ THEN $c$ ELSE c END)
       )

Here, $id$, $a$, ... are input variables from the application. The second part of the WHERE clause is to avoid updates which do not effectively update anything.

This query works as expected, but is very clumsy. I am repeating the same condition several times. I am looking for a way to rewrite this query in a more elegant fashion. If this was a simple SELECT query, I could do something with a LATERAL JOIN, but I cannot see how to apply this here.

How can I improve this query?

  • Can you clarify what you want to do with a? "Update column a always" or "avoid updates which do not effectively update anything."? – LJ01 Feb 16 '18 at 09:58
  • Column `a` should be updated regardless the `state`. Column `b` and `c` should be guarded by the `state`. The `WHERE` clause is to prevent `zero updates`: I don't want my MVCC database to create a new row when the new row will be completely identical to the current row. – schizofrenic_bit Feb 16 '18 at 10:01

4 Answers4

2

Split the query in two:

UPDATE t
SET    a = $a$
WHERE  id = $id$

UPDATE t
SET    b = $b$,
       c = $c$
WHERE  id = $id$ AND
       state = $state$

If you need atomicity, wrap in a transaction.

kutschkem
  • 7,826
  • 3
  • 21
  • 56
  • For the record, I don't think checking whether the update would change anything helps. Prove me wrong, measure execution time. – kutschkem Feb 16 '18 at 09:47
  • 1
    yes it helps to prevent new row versions to be creatred (multiply by "write multiplication") – joop Feb 16 '18 at 09:52
  • 1
    If you have millions of rows and you are only effectively updating a fraction of them, checking whether the update effectively does anything helps a lot. It gets worse if there is replication, as every change has to be logged and replicated. The query execution time might be more or less identical, but the overall load of all background processes combined might be very different. It depends on the case. – schizofrenic_bit Feb 16 '18 at 09:52
  • @schizofrenic_bit OK makes sense. – kutschkem Feb 16 '18 at 10:46
1

This seems a bit cleaner(untested):


WITH src AS (
        SELECT    $a$ AS a
       , (CASE WHEN state = $state$ THEN $b$ ELSE b END) AS b
       , (CASE WHEN state = $state$ THEN $c$ ELSE c END) AS c
        FROM t
        WHERE  id = $id$
        )
UPDATE t dst
SET a=src.a, b=src.b, c=src.c
FROM src
WHERE  dst.id = src.id
AND   (src.a, src.b, src.c) IS DISTINCT FROM (dst.a, dst.b, dst.c)
        ;
joop
  • 4,330
  • 1
  • 15
  • 26
  • Thank you. This looks better, but still not 100% convinced :). Maybe I am being a perfectionist. – schizofrenic_bit Feb 16 '18 at 09:57
  • Note: the `... DISTINCT FROM ...` is only needed if you want NULLs to compare unqual to non-NULLs. Otherwise, you could use your existing `(... OR ... OR...)` condition. – joop Feb 16 '18 at 10:17
0

The only filter you need is on ID = $id

The case statement says don't change it in the update if the state doesn't match, so you don't need to filter it.

EDIT

where Id = $id and a !=$a
   Or (state = $state and (b !=b or c!= $c))

If you do any more than that then"always update a" will not necessary be true.

3rd attempt checks for the possibility of a remaining the same, but b or c updating.

LJ01
  • 589
  • 4
  • 11
0

EDIT: It Took me a while to realize my fault here: The question obviously targets at a single update, while my answer tried to update many rows. However, if you need to execute this Update for a set of rows you could:

  • Insert the needed parameters in a temporary table
  • Join that table within the "t2" subquery
  • Select it's columns (e.g. tempTable.b As tempB)
  • Replace the Parameters (e.g. $b$ -> t2.tempB)

.

UPDATE t
SET a=source.a,
    b=source.b,
    c=source.c
FROM 
    (
        SELECT 
            id,
            a, 
            (CASE WHEN UpdateCondition THEN $b$ ELSE b END) AS b,
            (CASE WHEN UpdateCondition THEN $c$ ELSE c END) AS c
        FROM 
            (  
                SELECT state = $state$ As UpdateCondition, * FROM t
            ) As t2
        WHERE 
            id = $id$ AND
            (
                a != $a$ OR
                b != (CASE WHEN UpdateCondition THEN $b$ ELSE b END) OR
                c != (CASE WHEN UpdateCondition THEN $c$ ELSE c END)
            ) AS source
WHERE t.id=source.id;

The Sub query for t2 gives you your state Condition and executes the calculation for it only once per row.

The subquery for "source" gives you the mapped values and filters those without changes.

Kooooons
  • 109
  • 5
  • See Also: https://stackoverflow.com/questions/6256610/updating-table-rows-in-postgres-using-subquery – Kooooons Feb 16 '18 at 10:02