
I have a table in my PostgreSQL database (actually it's multiple tables, but for the sake of simplicity let's assume it's just one) and multiple clients that periodically need to query the table to find changed items. These are updated or inserted items (deleted items are handled by first marking them for deletion and then actually deleting them after a grace period).

Now the obvious solution would be to just keep a "modified" timestamp column for each row, remember it for each client and then simply fetch the changed ones:

SELECT * FROM the_table WHERE modified > saved_modified_timestamp;

The modified column would then be kept up to date using triggers like:

CREATE FUNCTION update_timestamp()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
    BEGIN
        NEW.modified = NOW();
        RETURN NEW;
    END;
$$;

CREATE TRIGGER update_timestamp_update
    BEFORE UPDATE ON the_table
    FOR EACH ROW EXECUTE PROCEDURE update_timestamp();

CREATE TRIGGER update_timestamp_insert
    BEFORE INSERT ON the_table
    FOR EACH ROW EXECUTE PROCEDURE update_timestamp();

The obvious problem here is that NOW() returns the time the transaction started. So it might happen that a transaction is not yet committed while fetching the updated rows, and when it is committed, the timestamp is lower than saved_modified_timestamp, so the update is never registered.
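The race can be illustrated with two concurrent sessions (a sketch; :saved_modified_timestamp stands for the value the client remembered from its previous fetch):

```sql
-- Session A (writer)                      -- Session B (client fetching changes)
BEGIN;
UPDATE the_table SET payload = 'x'
    WHERE id = 1;
-- trigger sets modified = NOW(), i.e.
-- the time A's transaction BEGAN
                                           SELECT * FROM the_table
                                           WHERE modified > :saved_modified_timestamp;
                                           -- A's row is invisible (uncommitted), yet the
                                           -- client advances :saved_modified_timestamp
                                           -- past A's NOW()
COMMIT;
-- A's row now carries a timestamp the client will never select again
```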

I think I found a solution that would work and I wanted to see if you can find any flaws with this approach.

The basic idea is to use xmin (or rather txid_current()) instead of the timestamp and then, when fetching the changes, wrap them in an explicit transaction with REPEATABLE READ and read txid_current_snapshot() (or rather the three values it contains: txid_snapshot_xmin(), txid_snapshot_xmax(), txid_snapshot_xip()) from the transaction.

If I read the Postgres documentation correctly, then all changes made by transactions that are < txid_snapshot_xmax() and not in txid_snapshot_xip() should be visible in that fetch transaction. This information should then be all that is required to get all the updated rows when fetching again. The select would then look like this, with xmin_version replacing the modified column:

SELECT * FROM the_table WHERE
   xmin_version >= last_fetch_txid_snapshot_xmax OR xmin_version = ANY (last_fetch_txid_snapshot_xip);
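Put together, one fetch round might look like this (a sketch; :last_xmax and :last_xip are placeholders for the snapshot values the client saved from the previous round, with :last_xip as a bigint array):

```sql
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- Rows changed since the last fetch: txids at or above the old xmax,
-- or txids that were still in progress (xip) when the old snapshot was taken.
SELECT *
FROM the_table
WHERE xmin_version >= :last_xmax
   OR xmin_version = ANY (:last_xip);

-- Save this round's snapshot values for the next fetch.
SELECT txid_snapshot_xmin(txid_current_snapshot()) AS xmin,
       txid_snapshot_xmax(txid_current_snapshot()) AS xmax,
       ARRAY(SELECT txid_snapshot_xip(txid_current_snapshot())) AS xip;

COMMIT;
```

Under REPEATABLE READ both statements see the same snapshot, so the rows returned and the saved snapshot values are consistent with each other.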

The triggers would then be simply like this:

CREATE FUNCTION update_xmin_version()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
    BEGIN
        NEW.xmin_version = txid_current();
        RETURN NEW;
    END;
$$;

CREATE TRIGGER update_xmin_version_update
    BEFORE UPDATE ON the_table
    FOR EACH ROW EXECUTE PROCEDURE update_xmin_version();

CREATE TRIGGER update_xmin_version_insert
    BEFORE INSERT ON the_table
    FOR EACH ROW EXECUTE PROCEDURE update_xmin_version();

Would this work? Or am I missing something?

matej.svejda
  • Are you sure you will be retired or at least in a new job before `txid` rolls over? :-) Have you considered using `clock_timestamp()` in your `update`? – Mike Organek Jun 26 '20 at 23:34
  • Hmm... the Postgres docs say: "The internal transaction ID type (xid) is 32 bits wide and wraps around every 4 billion transactions. However, these functions export a 64-bit format that is extended with an “epoch” counter so it will not wrap around during the life of an installation." so I think it should be fine, or am I missing something? clock_timestamp inherently has the same problems as NOW(), only that it happens in fewer cases. – matej.svejda Jun 27 '20 at 07:52
  • Calling `txid_current()` returns the internal transaction ID. On a fairly new database I have, the returned value is `3573`. If there is an epoch counter in there, then I had better grow my sideburns out and install an 8-track tape player in my truck. – Mike Organek Jun 27 '20 at 11:11
  • Apparently, epoch in this context means "add one bit per wraparound": https://stackoverflow.com/questions/49214219/what-is-the-meaning-of-epoch-in-txid-current-in-postgresql – matej.svejda Jun 27 '20 at 12:59
  • And btw: nice 70s joke, made me chuckle ;) – matej.svejda Jun 27 '20 at 17:38

2 Answers


Thank you for the clarification about the 64-bit return from txid_current() and how the epoch rolls over. I am sorry I confused that epoch counter with the time epoch.

I cannot see any flaw in your approach but would verify through experimentation that having multiple client sessions concurrently in repeatable read transactions taking the txid_snapshot_xip() snapshot does not cause any problems.

I would not use this method in practice because I assume that the client code will need to solve for handling duplicate reads of the same change (insert/update/delete) as well as periodic reconciliations between the database contents and the client's working set to handle drift due to communication failures or client crashes. Once that code is written, then using now() in the client tracking table, clock_timestamp() in the triggers, and a grace interval overlap when the client pulls changesets would work for use cases I have encountered.

If requirements called for stronger real-time integrity than that, then I would recommend a distributed commit strategy.
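The clock_timestamp() plus grace-interval approach described above could be sketched like this (illustrative names; the overlap interval must be longer than the longest expected transaction, and the client must deduplicate rows it has already processed):

```sql
-- The trigger uses the wall-clock time instead of the transaction start time:
CREATE FUNCTION update_timestamp()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
    BEGIN
        NEW.modified = clock_timestamp();
        RETURN NEW;
    END;
$$;

-- The client records now() at fetch time and re-reads an overlap window;
-- :last_fetch_time is a placeholder for the client-saved value.
SELECT * FROM the_table
WHERE modified > :last_fetch_time - interval '5 minutes';
```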

Mike Organek
  • Thanks for the very detailed response. I have to admit that I also thought the txids I was getting were weird and not epoch-y but was too lazy to dig into that. Your point made me look that up, so thx for that. The client part is actually handled already (with reconnects) and the sync operations are implemented idempotently (assuming they arrive in the right order), so I think it should be fine. – matej.svejda Jun 27 '20 at 17:42
  • I think I prefer the txid approach because with any timestamp approach, even with a grace period, you end up with an implicit limit on transaction execution time. – matej.svejda Jun 27 '20 at 18:00

OK, so I've tested it in depth now and haven't found any flaws so far. I had around 30 clients writing to and reading from the database at the same time, and all of them got consistent updates. So I guess this approach works.

matej.svejda