94

I have a materialized view on a PostgreSQL 9.3 database which seldom changes (about twice a day). But when it does, I'd like to update its data promptly.

Here is what I was thinking about so far:

There is a materialized view mat_view which gets its data from tables table1 and table2 using some join statement.

Whenever something in table1 or table2 changes, I already have a trigger which updates a little configuration table config consisting of

table_name | mat_view_name | need_update
-----------+---------------+------------
table1     | mat_view      | TRUE/FALSE
table2     | mat_view      | TRUE/FALSE

So if anything in table1 changes (there's a trigger on UPDATE and on DELETE for every statement), the field need_update in the first row is set to TRUE. The same goes for table2 and the second row.

Obviously, if need_update is TRUE, then the materialized view must be refreshed.

UPDATE: Since materialized views do not support rules (as @pozs mentioned in a comment below), I would go one step further. I'd create a dummy view v_mat_view with the definition "SELECT * FROM mat_view". When the user does a SELECT on this view, I need to create a rule ON SELECT which does the following:

  • check whether mat_view should be updated (SELECT 1 FROM config WHERE mat_view_name='mat_view' AND need_update=TRUE)
  • reset the need_update flag with UPDATE config SET need_update=FALSE where mat_view_name='mat_view'
  • REFRESH MATERIALIZED VIEW mat_view
  • and at last do the original SELECT statement but with mat_view as the target.

UPDATE2: I tried creating the steps above:

Create a function that handles the four points mentioned above:

CREATE OR REPLACE FUNCTION mat_view_selector()
RETURNS SETOF mat_view AS $body$
BEGIN
  -- here is checking whether to refresh the mat_view
  -- then return the select:
  RETURN QUERY SELECT * FROM mat_view;
END;
$body$ LANGUAGE plpgsql;

Create the view v_mat_view which really selects from the function mat_view_selector:

CREATE TABLE v_mat_view AS SELECT * from mat_view LIMIT 1;
DELETE FROM v_mat_view;

CREATE RULE "_RETURN" AS
    ON SELECT TO v_mat_view
    DO INSTEAD 
        SELECT * FROM mat_view_selector();
    -- this also converts the empty table 'v_mat_view' into a view.

The result is unsatisfying:

# explain analyze select field1 from v_mat_view where field2 = 44;
QUERY PLAN
Function Scan on mat_view_selector (cost=0.25..12.75 rows=5 width=4)
(actual time=15.457..18.048 rows=1 loops=1)
Filter: (field2 = 44)
Rows Removed by Filter: 20021
Total runtime: 31.753 ms

in comparison to selecting from the mat_view itself:

# explain analyze select field1 from mat_view where field2 = 44;
QUERY PLAN
Index Scan using mat_view_field2 on mat_view (cost=0.29..8.30 rows=1 width=4)
  (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: (field2 = 44)
Total runtime: 0.036 ms

So essentially it DOES work, but performance might be an issue.

Anyone have better ideas? If not, then I would have to implement it somehow in the application logic or worse: run a simple cronjob that runs every minute or so.

Jason Aller
  • 3,541
  • 28
  • 38
  • 38
mawimawi
  • 4,222
  • 3
  • 33
  • 52
  • Interesting approach. Have you tried replacing your view's `"_RETURN"` rule? It should be pretty simple http://www.postgresql.org/docs/current/static/sql-createrule.html – pozs May 28 '14 at 09:08
  • can you rephrase your question and elaborate a bit? I do not understand what you mean even after reading the link you posted. – mawimawi May 28 '14 at 09:17
  • Views have a `"_RETURN"` rule by default. You want to replace that to refresh your materialized view before returning view's the original select. Have you tried already? – pozs May 28 '14 at 09:24
  • Never mind, redefining a rule (which exists!) on a materialized view gives the error `rules on materialized views are not supported SQL state: 0A000` – pozs May 28 '14 at 09:35

2 Answers2

171

You should refresh the view in triggers after insert/update/delete/truncate for each statement on table1 and table2.

create or replace function refresh_mat_view()
returns trigger language plpgsql
as $$
begin
    refresh materialized view mat_view;
    return null;
end $$;

create trigger refresh_mat_view
after insert or update or delete or truncate
on table1 for each statement 
execute procedure refresh_mat_view();

create trigger refresh_mat_view
after insert or update or delete or truncate
on table2 for each statement 
execute procedure refresh_mat_view();

In this way your materialized view is always up to date. This simple solution might be hard to accept with frequent inserts/updates and sporadic selects. In your case (seldom changes about twice a day) it ideally fits your needs.


To realize deferred refresh of a materialized view you need one of the following features:

  • asynchronous trigger
  • trigger before select
  • rule on select before

Postgres has none of them, so it seems that there is no clear postgres solution.

Taking this into account I would consider a wrapper function for selects on mat_view, e.g.

CREATE OR REPLACE FUNCTION select_from_mat_view(where_clause text)
RETURNS SETOF mat_view AS $body$
BEGIN
  -- here is checking whether to refresh the mat_view
  -- then return the select:
  RETURN QUERY EXECUTE FORMAT ('SELECT * FROM mat_view %s', where_clause);
END;
$body$ LANGUAGE plpgsql;

If it is acceptable in practice depends on particulars I do not know about.

klin
  • 112,967
  • 15
  • 204
  • 232
  • This would only be feasible for me, if the refresh could be made asynchronously, so a user of the application who works with table1 and table2 does not need to wait unnecessarily long. The users who work with the mat_view can accept to wait a bit longer. – mawimawi May 31 '14 at 07:56
  • 3
    you can simulate an async trigger by replacing the triggers that @klin has defined with a notify command (instead of the refresh_mat_view()). you will need an external process that listens for that notify and runs the refresh_mat_view(). That is outside of postgres reaching back in. It does get rid of the transaction binding / wait for the table1/table2 updater. I like the command at the bottom of this article: http://gonzalo123.com/2012/11/26/sending-sockets-from-postgresql-triggers-with-python/ about using a table to queue async triggers. – Greg Jun 04 '14 at 16:55
  • @Greg - yes, this is pretty much what I am doing myself with the "config" table as I wrote in my question above. – mawimawi Jun 05 '14 at 09:08
  • Although your answer did not really solve my problem, it would be a shame to let the 100 reputation points expire. – mawimawi Jun 05 '14 at 11:51
  • Thanks. I think we'll have *triggers before select* some day. It seems not so difficult to implement but postgres developers have their hands full. – klin Jun 05 '14 at 12:14
  • 4
    Shouldn't this be a SQL feature? Like `CREATE AUTO REFRESH MATERIALIZED VIEW`? – Augustin Riedinger Apr 26 '21 at 10:29
  • 1
    does this answer still hold in 2021? wouldnt refreshing a materialized view recompute the entire table and block all access till the refresh is finished? – PirateApp Jul 27 '21 at 10:08
  • 1
    @PirateApp there is a pg_extension that try to keep the materialized views updated but have different approach than recompute the entire table, check this https://stackoverflow.com/questions/29437650/how-can-i-ensure-that-a-materialized-view-is-always-up-to-date/75123352#75123352 – buncis Jan 15 '23 at 06:39
40

PostgreSQL 9.4 added REFRESH CONCURRENTLY to Materialized Views.

This may be what you're looking for when you describe trying to setup an asynchronous update of the materialized view.

Users selecting from the materialized view will see incorrect data until the refresh finishes, but in many scenarios that use a materialized view, this is an acceptable tradeoff.

Use a statement level trigger that watches the underlying tables for any changes and then refreshes the materialized view concurrently.

Jeff Widman
  • 22,014
  • 12
  • 72
  • 88
  • 3
    `REFRESH CONCURRENTLY` **is not** what the OP is looking for. Refresh with this option still will block the execution until it is finished. In fact on larger datasets it may be slower than simply refresh and additionally needs a unique constraint on the view. [See the documentation](https://www.postgresql.org/docs/9.6/static/sql-refreshmaterializedview.html) – klin Jan 08 '17 at 13:10
  • 8
    ? `REFRESH CONCURRENTLY` will not block execution of reads. Concurrent `SELECT`s simply get the old version of the view until the refresh finishes, then the new copy is swapped in. – Jeff Widman Jan 09 '17 at 19:06
  • See OP's first comment to my answer. He did not want a user who updates the table to wait for the completion of the refresh in the trigger. `REFRESH CONCURRENTLY` does nothing here for the user (it does not block other sessions). – klin Jan 09 '17 at 20:42
  • how do i get notified when REFRESH CONCURRENTLY has finished? i would like to execute a join on the materialized view after it has refreshed – PirateApp Mar 14 '18 at 08:04
  • 1
    @PirateApp You wait until `REFRESH CONCURRENTLY` finishes executing. It allows queries to execute while it's running, but it will still not return control to your program until the refresh is complete. If you want to notify some other thread, you'll have to find another to way to do this. – Michael Mior Sep 28 '18 at 14:55
  • I mean if you want to "wait" until the refresh is completed, you would be using `REFRESH CONCURRENTLY` for nothing :) – arunwithasmile Mar 02 '22 at 20:06