1

On any update to the row (which would be somehow dumb and I would expect a performance warning on the documentation page then) or is it smart enough of analyzing the generation expression and only regenerate the computed column when the input column(s) have changed?

springy76
  • 3,706
  • 2
  • 24
  • 46

1 Answers1

3

From the documentation it's rather clear

A stored generated column is computed when it is written (inserted or updated) and occupies storage as if it were a normal column. A virtual generated column occupies no storage and is computed when it is read. Thus, a virtual generated column is similar to a view and a stored generated column is similar to a materialized view (except that it is always updated automatically).

So it seams that the generated always column is generated always.

Below a small test case to verify

We define a immutable function used in the formula with pg_sleepinside to see if the function was called

create or replace function wait10s(x numeric)
returns int
as $$
  SELECT pg_sleep(10);
  select x as result;
$$ language sql IMMUTABLE;

Table DDL

create table t 
(col1 numeric,
col2 numeric,
gen_col numeric generated always as ( wait10s(col2) ) STORED
);

Insert

as expected we wait 10 seconds

 insert into t (col1, col2) values (1,1);

Update of column used in formula

update t set col2 = 2 

Again expected wait

Update of column NOT used in formula

update t set col1 = 2 

No wait so it seems that there is an optimizing step calling the formula only in case of necessity.

This makes perfect sense, but of course you should take it with care as this behavior is not documented and may change...

Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53
  • 2
    I think the part that springy76 is looking for is not in your quote, mainly: "*a generated column is updated **whenever the row changes** and cannot be overridden.*" –  Jan 25 '23 at 09:42
  • what it clear? what happens on `update`? – springy76 Jan 25 '23 at 09:42
  • 1
    @springy76 from the link: "*a generated column is updated whenever the row changes*". To get more details about the implementation you probably need to read the source code. –  Jan 25 '23 at 09:51
  • sorry @springy76 now I get it, you are asking if the formula is calulated even in case the input columns are unchanged in the update - *updating the answer* – Marmite Bomber Jan 25 '23 at 10:14
  • @JGH `generated always` columns *require* (for a good reason) immutable expressions, otherwise -> `ERROR: generation expression is not immutable` – Marmite Bomber Jan 25 '23 at 15:38
  • Nope @JGH, *insert a new row with col2=2* takes an other 10 seconds. Also `select wait10s(1) from generate_series(1,2)` takes **20** seconds. BTW I can't find the *well documented* desciption of the *function cache*, found only [this](https://stackoverflow.com/a/20718747/4808122) – Marmite Bomber Jan 26 '23 at 09:53
  • 1
    Alright, I got it wrong (no caching, but execution of the function at planning time and then replacement of its call(s) by constant value(s)) and this mechanism isn't involved here, so you are right about the undocumented optimization ( for reading pleasure: [1](https://postgrespro.com/list/thread-id/1138881) [2](https://postgrespro.com/list/thread-id/2036223) [3](https://stackoverflow.com/questions/8529690/why-is-postgresql-calling-my-stable-immutable-function-multiple-times) ) – JGH Jan 26 '23 at 15:01
  • Ok, what's the current status? Tests discovered it might be optimized/smart but there is no documentation and no guarantee? In my case I'm calling a complex and expensive expression for the generated column: `to_tsvector(jsonb_path_query_array(bigJsonColumn, '$.Pages[*].TextBlocks[*].Lines[*].Words[*].Text'))` - if this gets called for any update to the row, then the DB will get doomed; so I guess I'll be on the safe side updating the column by client code (as bigJsonColumn is only updated at 1-2 code points). – springy76 Jan 30 '23 at 09:51