Best alternate of SQL Server rowversion (IROW_VERSION) in PostgreSQL?

Question

I have a row version column in SQL Server table. At the time of update and insert command auto generate value automatically inserted/updated in this column. I need same in PostgreSQL. One option is to use function trigger with sequence generation but it may cause locking issue. What is the best practice/alternate in PostgreSQL.

A serial would work perfectly fine for the job. Rowversion in SQL Server is database wide increasing integer. — Cetin Basoz, Jun 03 '20 at 11:30
Do you mean `rowversion`? That's a specific type used in optimistic concurrency. Npgsql [uses xmin](https://www.npgsql.org/efcore/modeling/concurrency.html) for the same job — Panagiotis Kanavos, Jun 03 '20 at 11:31
@CetinBasoz it wouldn't. A `rowversion` is *not* a global auto ID, it can have gaps or appear to be out of order. The only real guarantee is that for the *same* row, every modification will change the `rowversion`. For perf reasons, different threads can use batches of `rowversions` to avoid synchronisation across workers. — Panagiotis Kanavos, Jun 03 '20 at 11:33
@PanagiotisKanavos, we are telling the same thing with different sentences. Then here is the official explanation "Is a data type that exposes automatically generated, unique binary numbers within a database. rowversion is generally used as a mechanism for version-stamping table rows. The storage size is 8 bytes. The rowversion data type is just an incrementing number and does not preserve a date or a time. " — Cetin Basoz, Jun 03 '20 at 11:35
@PanagiotisKanavos: "*The rowversion data type is just an incrementing number and does not preserve a date or a time.*" sounds very much like a value taken from a sequence. — , Jun 03 '20 at 11:35
It's not. That's why there are gaps and out-of-order values. Both `serial` and SQL Server's SEQUENCE ensure atomic, monotonically incrementing values. Rowversion is neither. I'm not assuming here - I *have* tried to use `rowversion` for syncing in the past and found all those problems. That's why Npgsql uses `xmin` for concurrency tokens instead `serial` or sequences — Panagiotis Kanavos, Jun 03 '20 at 11:39
@PanagiotisKanavos If you have to prevent gaps, then the "increment per update" (as in my first function) will exactly do that. But if you have a better solution, the you should add that as an answer — , Jun 03 '20 at 11:47
That's why I want to know *what* the OP wants to do. The equivalent concurrency token used by Npgsql is to use `xmin`. If the OP wants an incrementing value though, `rowversion` in SQL Server was a bad choice to begin with. — Panagiotis Kanavos, Jun 03 '20 at 11:52
@PanagiotisKanavos, row version too has gaps and having or not having gaps, out-of-order values is not important, it is there to support a "row versioning" only. IOW the only thing really matters is the latest value is larger than the previous ones. Gaps would naturally exists, because it is increased in every insert and update. — Cetin Basoz, Jun 03 '20 at 11:52
@PanagiotisKanavos, and yes as you said, row version in SQL server may not be a good choice (I am not even sure if it was their decision or Sybase's where they borrowed the SQL Server). — Cetin Basoz, Jun 03 '20 at 11:54
I said it's only used for optimistic concurrency. If that's what the OP wants, the equivalent is offered by `xmin`. If not - what does the OP want? — Panagiotis Kanavos, Jun 03 '20 at 11:58
@PanagiotisKanavos, you can use a sequence for optimistic concurrency. Unnecessarily getting long. — Cetin Basoz, Jun 03 '20 at 12:02

score 3 · Answer 1 · edited Jun 20 '20 at 09:12

The question is somewhat unclear. rowversion in SQL Server is only used as a concurrency token in optimistic concurrency scenarios. This is faster than a trigger that updates a LastModified timestamp or increments a stored column.

The equivalent in PostgreSQL is the system-provided xmin column:

xmin

The identity (transaction ID) of the inserting transaction for this row version. (A row version is an individual state of a row; each update of a row creates a new row version for the same logical row.)

Essentially, for a single row, xmin always changes after every modification, just like rowversion does. It's faster than a trigger too, since it requires no extra effort.

The NpgSQL provider for Entity Framework uses xmin as a concurrency token.

If you want to implement optimistic concurrency manually, read the xmin column in your SELECT statement and use that value in updates, eg:

SELECT xmin, ID, Name FROM sometable;

Which returns

 xmin | ID | name
------+----+------
  123 | 23 | Moo

And then

UPDATE sometable 
SET name = 'Foo' 
WHERE ID = 23 AND xmin = 123

If the row was modified by some other transaction, xmin won't match and no changes will be made. You can detect that by checking how many rows were changed using your provider's API. That's how rowversion works too.

Another possibility mentioned in the linked question is to use the RETURNING clause to return some value to the client. If no value is returned, the statement failed, eg:

UPDATE sometable 
SET name = 'Foo' 
WHERE ID = 23 AND xmin = 123
RETURNING 1

What are the practical consequences of [vacuuming](https://www.postgresql.org/docs/13/routine-vacuuming.html) for `xmin` when used for optimistic locking/concurrency? In [this answer](https://stackoverflow.com/a/1740923/3760986) there is a comment that warns about it. The Postgres documentation is somewhat lengthy and complex, talking about `FrozenTransactionId` and `BootstrapTransactionId`, which I do not understand. Could you elaborate on what that means in practice, please? — Joerg, Jun 03 '21 at 08:30

score 1 · Accepted Answer · 2020-06-03T11:37:33.687

1

Not sure what "locking issue" you are talking about, but you can get something equivalent (assuming I understood the "row version" thing correctly) without a sequence:

create table some_table
(
  .... columns of the table ..., 
  row_version bigint not null default 0
);

Then create a trigger function:

create function increment_row_version()
  returns trigger
as
$$
begin
  new.row_version := old.row_version + 1;
  return new;
end;
$$
language plpgsql;

By using the old record, it's impossible to overwrite the row version value in an UPDATE statement.

And then create the trigger for every table where you need it:

create trigger update_row_version_trigger
  before update on your_table_name_here
  for each row
  execute procedure increment_row_version();

If you also want to prevent inserting e.g. a higher number as the start number, you can extend the trigger to run on insert and update and in case of an insert assign the desired start value explicitly.

If you need a global value across all tables (rather than one number for each table as the above does), create a sequence and use nextval() inside the trigger rather than incrementing the value. And no, the use of a sequence will not cause "locking issues".

The trigger would then look like this:

create function increment_row_version()
  returns trigger
as
$$
begin
  new.row_version := nextval('global_row_version_sequence');
  return new;
end;
$$
language plpgsql;

and can be used for both insert and update triggers.

edited Jun 03 '20 at 11:37

answered Jun 03 '20 at 11:27

That's not the same thing as SQL Server's `rowversion`, which is used in optimistic concurrency scenarios. The equivalent in PostgreSQL seems to be xmin, [used by NpgSQL for just that scenario](https://www.npgsql.org/efcore/modeling/concurrency.html) – Panagiotis Kanavos Jun 03 '20 at 11:34
@PanagiotisKanavos: if the rowversion is simply an incrementing number, then a value taken from a sequence seems to be exactly the same thing. For which purpose that number is used is a different question. But that could absolutely be used for optimistic locking (I use the table based "version number" quite frequently) – Jun 03 '20 at 11:39
It's not though. It's *only* meant for optimistic locking. Back in the 2005 days (or even earlier), `timestamp`, the depredated name for `rowversion` was an incrementing number indeed, but that caused too much sync overhead in high-traffic systems, and paralleled operations. Now, each worker can use its own batches of values, resulting in gaps and out-of-order values. Like many, I've tried to use `rowversion` for syncing and found the hard way there were no guarantees – Panagiotis Kanavos Jun 03 '20 at 11:42
A database version *is* used with [change tracking](https://learn.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-tracking-sql-server?view=sql-server-ver15), but that's a completely different feature and that value simply isn't used to generate `rowversion`. Essentially, it's the change table ID for a change record generated when rows are modified in a table with `Change Tracking` enabled – Panagiotis Kanavos Jun 03 '20 at 11:44
@PanagiotisKanavos: sorry I don't get it If `rowversion` is an incrementing number, then my solution absolutely does this. You can either have one global sequence or an incrementing number per table (that can _also_ be done using a sequence per table, but I don't see the reason why) – Jun 03 '20 at 11:44
@a_horse_with_no_name, I have done the same thing and it worked for me but the question is that sequence number is database level increment. If multiple user insert/update data on same time then is there any locking on sequence number at database level to avoid concurrency issue? Does PostgreSQL handle this scenario or it will fail? – FaizUR Rahman Jun 04 '20 at 06:15
@FaizURRahman: no, there won't be any noticeable locks on the sequence. They are specifically designed for this scenario. – Jun 04 '20 at 06:17
@a_horse_with_no_name ,@PanagiotisKanavos: Thanks a lot. I have implemented this in PostgreSQL – FaizUR Rahman Jun 04 '20 at 06:44

Best alternate of SQL Server rowversion (IROW_VERSION) in PostgreSQL?

2 Answers2