15

Say I have a table where I want to use a serial as primary key to ask for changes from the client. The client will ask "give me the changes after key X". Without using SERIALIZABLE isolation level or locking, this is prone to race conditions.

Transaction A can start first, and do its writes, then take a long time to commit. Meanwhile transaction B will start and commit, before A commits. The write from B will get a higher primary key than the write from A. If a client now asks for changes it will miss the still uncommitted write from A, and note the newest highest primary key. So even after A commits, the client will never see that change, because its key is lower than the change the client already got.

Is it possible to make the value of a serial (or similar from a counter) be determined atomically at commit time so that we are guaranteed that it will be higher than all others when committed, and lower than all which will be committed after it? If not what is the best way to solve this problem?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
NotNull
  • 307
  • 2
  • 9
  • Without commit timestamps to provide a consistent global ordering this is actually pretty hard to do in the presence of concurrent inserts. PostgreSQL does it at the xid level for snapshots by keeping track of the oldest and newest current xids and, within that, a bitmap of which are committed. I don't think you can really translate this to use of a sequence, though. – Craig Ringer Nov 27 '15 at 05:40
  • Possible duplicate of [In-order sequence generation](https://stackoverflow.com/questions/17500013/in-order-sequence-generation) – Philippe Sep 24 '18 at 15:43

1 Answers1

12

Postgres 9.5 introduced a new feature related to this problem: commit timestamps.

You just need to activate track_commit_timestamp in postgresql.conf (and restart!) to start tracking commit timestamps. Then you can query:

SELECT * FROM tbl
WHERE  pg_xact_commit_timestamp(xmin) >= '2015-11-26 18:00:00+01';

Read the chapter "Commit timestamp tracking" in the Postgres Wiki.
Related utility functions in the manual.

Function volatility is only VOLATILE because transaction IDs (xid) can wrap around per definition. So you cannot create a functional index on it.
You could fake IMMUTABLE volatility in a function wrapper for applications in a limited time frame, but you need to be aware of implications. Related case with more explanation:

For many use cases (like yours?) that are only interested in the sequence of commits (and not absolute time) it might be more efficient to work with xmin cast to bigint "directly" (xmin::text::bigint) instead of commit timestamps. (xid is an unsigned integer internally, the upper half that does not fit into a signed integer.) Again, be aware of limitations due to possible xid wraparound.

For the same reason, commit timestamps are not preserved indefinitely. For small to medium databases, xid wraparound hardly ever happens - but it will eventually if the cluster is live for long enough. Read the chapter "Preventing Transaction ID Wraparound Failures" in the manual for details.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Interesting. If I add "ORDER BY pg_xact_commit_timestamp(xmin)" will it also be efficient to perform a query like that, as if I'm using an indexed column? – NotNull Nov 26 '15 at 21:25
  • 1
    @NotNull: The function itself has index support to retrieve the timestamp, but I am afraid you cannot create a functional index on the result. I added some more above. – Erwin Brandstetter Nov 27 '15 at 00:55
  • Thanks @ErwinBrandstetter for sheding some light on this. `xmin` is a system column that doesn't belong specifically to a table. I wouldn't create an index on that value, as it's global and can grow pretty huge, but that's my opinion. Let's see what Erwin has tosay about this. Anyways, I'm thinking it may not be possible to create such index. – Kamil Gosciminski Nov 27 '15 at 00:56
  • 2
    @ConsiderMe What huh? Every table has `xmin`, and it's a hidden system column, but it's not somehow shared across all tables. The reason you can't really index it is xid wraparound, not anything to do with it being somehow global. – Craig Ringer Nov 27 '15 at 06:09
  • 1
    So it seems like the commit timestamp can be used to fetch only changes that have happened after last query, if this is not too long ago. But it can't really be a general solution, because the xmin will not be unique over time. – NotNull Nov 27 '15 at 14:37
  • 1
    A general solution might be to just take an advisory lock (on an ID which identifies all the rows each client is interested in) in transactions which modify a table that needs to be queried in this way, in order to protect the primary key from race conditions like I described. – NotNull Nov 27 '15 at 14:38
  • @CraigRinger I stand corrected. Thanks for the clarification. – Kamil Gosciminski Nov 27 '15 at 17:48
  • @ErwinBrandstetter can you expand on how you can use `xmin` to track the sequence of commits (or provide a link) and also how is this different / is it different from storing the `txid_current()` when you do a create/update/delete? Also I was under the impression that even with `pg_xact_commit_timestamp` you don't have guarantees on ordering: https://stackoverflow.com/questions/56961111/questions-about-postgres-track-commit-timestamp-pg-xact-commit-timestamp – Jordan Jul 29 '20 at 20:04