My team needs a serial column to increase monotonically with each commit.

There are cases in which two transactions get values 100 and 101 from a sequence, then the 100 transaction takes longer to commit, so the value 101 gets committed first, followed by 100. This situation is problematic for our needs, and we need to solve for it.
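For concreteness, the interleaving looks like this (table, column, and sequence names are made up for illustration):

```sql
-- Session A                          -- Session B
BEGIN;
INSERT INTO t (id) VALUES
    (nextval('t_id_seq'));  -- gets 100
                                      BEGIN;
                                      INSERT INTO t (id) VALUES
                                          (nextval('t_id_seq'));  -- gets 101
                                      COMMIT;  -- 101 becomes visible first
COMMIT;  -- 100 becomes visible after 101
```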

This is exactly the problem described by this question. We need a solution that does not require changes to the database configuration, unlike the accepted answer to that question.

A comment on that solution, as well as this post, suggests using an exclusive transaction-level advisory lock acquired just before the value is fetched from the sequence.

Is there a way to have Postgres automatically acquire this lock and fetch a value from a sequence when it gets an INSERT to the table in question?

Note: gaps in the committed values are OK and expected.

EDIT: I have dug into this question enough to be able to ask it well, but I am not very experienced with Postgres. I'm hoping for a pointer to a specific trigger, or whatever specific Postgres setup, that will accomplish this.

  • Maybe [txid_current()](https://www.postgresql.org/docs/current/static/functions-info.html#FUNCTIONS-TXID-SNAPSHOT) But what is the problem with `alter system set track_commit_timestamp = true` as suggested in one of the linked questions? –  Aug 24 '17 at 16:25
  • @a_horse_with_no_name "This parameter can only be set in postgresql.conf file or on the server command line." This will be going out to (numerous) environments without access to either. Thanks for that pointer—not what I was expecting, and we'll have to ensure only one row gets committed per transaction, but you're right, it does fit. – rep Aug 24 '17 at 16:55
  • All parameters from `postgresql.conf` can be changed using `alter system`. If you don't have access to those database server, then how do you create tables there? There must be a way to run DDL statements for them. –  Aug 24 '17 at 16:58
  • I didn't know about `alter system`. This is for an open source tool. Our ideal solution would not change the system configuration, but we can look into that. – rep Aug 24 '17 at 17:06
  • @a_horse_with_no_name txid_current() doesn't work because we specifically need the order of commit, not the order of transaction start. – rep Aug 24 '17 at 17:36
  • 1
    Then use a `timestamp` –  Aug 24 '17 at 17:53
  • Why don't you want to use the advisory lock explicitly? In my opinion that should be the way to go. Are the problems, that you don't know exactly where the inserts will happen? Or don't you know exactly where the commits will happen? I would not change the system configuration to satisfy business-requirements if you can do it with normal postgres means. Especially in the case of open source I think you don't want to be disqualified by grumpy database administration people. – aschoerk Aug 25 '17 at 07:02
  • Possible duplicate of [In-order sequence generation](https://stackoverflow.com/questions/17500013/in-order-sequence-generation) – Philippe Sep 24 '18 at 15:44

2 Answers

I know you wanted automatic locking, but I would advise against that. You might be able to use stored procedures or triggers for it, but a stored procedure is just code as well. Please see also my comment.

My solution would be to:

My team needs a serial column to increase monotonically with each commit.

Does that mean,

  • that inserting a value less than the maximum value already stored is not allowed?
  • that gaps in the values of that column are not allowed?

I suppose you use a sequence to create this value. Then, immediately before the commit, acquire a specific advisory lock (see 13.3.4) and do your insertion, using the sequence either implicitly via your schema or explicitly in your INSERT. No other transaction trying to acquire the same lock can commit between your locking and your commit, so the insertions must become visible in sequence order. Doing the locking and the increment at the end of the transaction keeps the locked interval short and helps prevent deadlocks. The lock is released together with the commit, and the next transaction may then acquire it and get the next value from the sequence.
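A sketch of that pattern (the table, sequence, and lock key are placeholders; any key agreed on by all writers works):

```sql
BEGIN;
-- ... the rest of the transaction's work ...

-- Just before committing: serialize id assignment across transactions.
SELECT pg_advisory_xact_lock(42);

INSERT INTO my_table (id, payload)
VALUES (nextval('my_table_id_seq'), 'data');

COMMIT;  -- the transaction-level advisory lock is released here
```

Using `pg_advisory_xact_lock` (rather than the session-level variant) means the lock cannot outlive the transaction, even if the client disconnects.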

  • Right, we need each newly-committed value to be greater than any previously committed value. Gaps are not a problem, though it would be nice to have something that doesn't do massive jumps by design. Your comment about using the lock explicitly makes sense. Thanks. – rep Aug 28 '17 at 20:35
  • Could you please share an example INSERT? – Denis Mikhaylov Jan 09 '19 at 09:06
  • @Denis Mikhaylov If you mean the advisory lock, please follow the link; it is not an INSERT. – aschoerk Jan 09 '19 at 13:46
We've had to deal with this as well, and we went for a stored-procedure solution:

CREATE OR REPLACE FUNCTION proc_set_primary_key()
    RETURNS TRIGGER
    LANGUAGE plpgsql
AS
$$
DECLARE
    base_index BIGINT;
    update_count INTEGER;
BEGIN
    -- Serialize commits so ids are assigned in commit order.
    PERFORM pg_advisory_xact_lock(42);

    -- First real id for this transaction's batch.
    base_index := NEXTVAL('id_sequence');

    -- Map the temporary ids -1, -2, -3, ... to
    -- base_index, base_index + 1, base_index + 2, ...
    UPDATE event SET id = base_index - id - 1 WHERE id < 0;
    GET DIAGNOSTICS update_count = ROW_COUNT;

    -- Advance the sequence past the ids just assigned.
    PERFORM SETVAL('id_sequence', base_index + update_count - 1);

    RETURN NULL;
END;
$$;

CREATE CONSTRAINT TRIGGER set_commit_order
    AFTER INSERT ON event
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    WHEN ( new.id = -1 )
EXECUTE PROCEDURE proc_set_primary_key();

When doing (batch) INSERTs, number all your ids with temporary negative values starting from -1. The deferred trigger will renumber them just before commit. This assigns ids in commit order and preserves the exact INSERT order within a transaction, but it is not gapless.

Note that each transaction always starts numbering ids from -1, so if you never insert more than one record per transaction, always insert it with id -1.
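Putting that together, usage looks like this (the `payload` column is a placeholder; the table and trigger are as defined above):

```sql
-- Single-row insert: always use the placeholder id -1.
BEGIN;
INSERT INTO event (id, payload) VALUES (-1, 'single');
COMMIT;  -- the deferred trigger renumbers -1 to the next sequence value

-- Batch insert: number the rows -1, -2, -3, ... in insert order.
BEGIN;
INSERT INTO event (id, payload)
VALUES (-1, 'first'), (-2, 'second'), (-3, 'third');
COMMIT;  -- renumbered to N, N+1, N+2 in commit order
```

Because the trigger's `WHEN` condition only matches `id = -1`, it fires once per batch even though it is declared `FOR EACH ROW`.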

Be careful if you need to know the value of the id's afterwards, as they're changed from their temporary values. A framework like Hibernate for example may need to have its cache cleared of these records.

Although it may seem that multiple concurrent transactions would have conflicting negative ids, those ids are not visible to other transactions and are resolved to real ids before commit, so this causes no problems.
