JGH provided a smart way to UPDATE
multiple columns at once from a (correlated) subquery. Possible since Postgres 9.5. However, there are considerable downsides.
That form updates every single row in the table unconditionally. You would have to repeat the calculation in the WHERE
clause to do it conditionally, which would defy the purpose of doing it once only.
In PostgreSQL's MVCC model UPDATE
basically means writing a new version of the complete row, bloating table and indexes. Expensive by itself, but also causing a lot of additional work for VACUUM
- and/or performance degradation if you can't or don't clean up. See:
Either you really want to update every row, then that's basically fine. Though, if you are not bound by concurrent load or internal references to the table (views, FKs, ...), consider writing a new table instead. Typically cheaper overall.
However, some (or many/most?) rows may be up to date already. Then that's a huge waste. Consider instead:
UPDATE scratch.intersections AS i
SET legs = r.legs
, streets = r.streets
FROM scratch.intersections x
JOIN LATERAL (
SELECT count(*) AS legs -- assuming r.geom is NOT NULL
, array_agg(DISTINCT s.NAME ORDER BY s.NAME)
FILTER (WHERE s.NAME IS NOT NULL) AS streets
FROM received.streets s
WHERE ST_DWithin(x.geom, s.geom, 2)
) r ON x.legs IS DISTINCT FROM r.legs
OR x.streets IS DISTINCT FROM r.streets
WHERE i.id = x.id; -- id being the PK (or at least UNIQUE)
This does not touch rows that don't actually change.
We need the additional instance of scratch.intersections
in the FROM
clause to allow the LATERAL
join. But we eliminate all rows without changes before applying the update - this way saving most of the work for every row that's already up to date. Well, since the subquery is relatively expensive, maybe not most of the work. But typically, the actual write operation is much more expensive than calculating new values.
This is pretty close to what you had in mind in your last paragraph. But more efficient without creating a temp table, and you also save empty writes.
And if you only need to update possibly affected rows in intersections
after changing few rows in streets.geom
, there is more optimization potential, yet. Not going there, seems beyond the scope of this question.
Related: