Best way to maintain an ordered list in PostgreSQL?

Question

Say I have a table called list, where there are items like these (the ids are random uuids):

id  rank  text
--- ----- -----
x   0     Hello
x   1     World
x   2     Foo
x   3     Bar
x   4     Baz

I want to maintain the property that rank column always goes from 0 to n-1 (n being the number of rows)---if a client asks to insert an item with rank = 3, then the pg server should push the current 3 and 4 to 4 and 5, respectively:

id  rank  text
--- ----- -----
x   0     Hello
x   1     World
x   2     Foo
x   3     New Item!
x   4     Bar
x   5     Baz

My current strategy is to have a dedicated insertion function add_item(item) that scans through the table, filter out items with rank equal or greater than that of the item being inserted, and increment those ranks by one. However, I think this approach will run into all sorts of problems---like race conditions.

Is there a more standard practice or more robust approach?

Note: The rank column is completely independent of rest of the columns, and insertion is not the only operation I need to support. Think of it as the back-end of a sortable to-do list, and the user can add/delete/reorder the items on the fly.

Can this be a computed column? How is rank determined and can that be an attribute stored so that rank can be computed on demand? — Glenn, Dec 12 '17 at 03:23
looks like a bad idea... why is storing the list order important? The more standard approach would be to just insert un-ordered and only apply an order at the point of output. — Paul Maxwell, Dec 12 '17 at 03:26
@Glenn Unfortunately, no. The rank is completely arbitrary and is in fact part of the data (users might be able to reorder the list on the fly). — xzhu, Dec 12 '17 at 03:28
@TimBiegeleisen Sorry, I just updated the question to make my intention more clear. — xzhu, Dec 12 '17 at 03:32
It is less clear. Are you aware that there is NO internal order to a Postgres table? The only order is the one you impose when querying. — Tim Biegeleisen, Dec 12 '17 at 03:33
@TimBiegeleisen Yes I am aware of that. I think this is what makes this problem tricky. — xzhu, Dec 12 '17 at 03:46
I wouldn't attempt what you are trying to do. Instead, just maintain state from which your desired logical order can be arrived at. — Tim Biegeleisen, Dec 12 '17 at 03:48

Tim Biegeleisen · Answer 1 · 2017-12-12T03:30:28.540

Doing verbatim what you suggest might be difficult or not possible at all, but I can suggest a workaround. Maintain a new column ts which stores the time a record is inserted. Then, insert the current time along with rest of the record, i.e.

id  rank  text      ts
--- ----- -----     --------------------
x   0     Hello     2017-12-01 12:34:23
x   1     World     2017-12-03 04:20:01
x   2     Foo       ...
x   3     New Item! 2017-12-12 11:26:32
x   3     Bar       2017-12-10 14:05:43
x   4     Baz       ...

Now we can easily generate the ordering you want via a query:

SELECT id, rank, text,
    ROW_NUMBER() OVER (ORDER BY rank, ts DESC) new_rank
FROM yourTable;

This would generate 0 to 5 ranks in the above sample table. The basic idea is to just use the already existing rank column, but to let the timestamp break the tie in ordering should the same rank appear more than once.

Thanks Tim. However, this doesn't actually solve my problem. I'm modifying the question to reflect this nuance. — xzhu, Dec 12 '17 at 03:30

score 0 · Answer 2 · answered Dec 12 '17 at 08:27

you can wrap it up to function if you think its worth of:

t=# with u as (
update r set rank = rank + 1 where rank >= 3
)
insert into r values('x',3,'New val!')
;
INSERT 0 1

the result:

t=# select * from r;
 id | rank |   text
----+------+----------
 x  |    0 |   Hello
 x  |    1 |   World
 x  |    2 |   Foo
 x  |    3 | New val!
 x  |    4 |   Bar
 x  |    5 |   Baz
(6 rows)

also worth of mention you might have concurrency "chasing condition" problem on highly loaded systems. the code above is just a sample

score 0 · Answer 3 · answered Dec 12 '17 at 08:39

You can have a “computed rank” which is a double precision and a “displayed rank” which is an integer that is computed using the row_number window function on output.

When a row is inserted that should rank between two rows, compute the new rank as the arithmetic mean of the two ranks.

The advantage is that you don't have to update existing rows.
The down side is that you have to calculate the displayed ranks before you can insert a new row so that you know where to insert it.

This solution (like all others) are subject to race conditions.
To deal with these, you can either use table locks or serializable transactions.

score 0 · Answer 4 · answered May 06 '22 at 04:22

The only way to prevent a race condition would be to lock the table

https://www.postgresql.org/docs/current/sql-lock.html

Of course this would slow you down if there are lots of updates and inserts.

If can somehow limit the scope of your updates then you can do a SELECT .... FOR UPDATE on that scope. For example if the records have a parent_id you can do a select for update on the parent record first and any other insert who does the same select for update would have to wait till your transaction is done.

https://www.postgresql.org/docs/current/explicit-locking.html#:~:text=5.-,Advisory%20Locks,application%20to%20use%20them%20correctly.

Read the section on advisory locks to see if you can use those in your application. They are not enforced by the system so you'll need to be careful of how you write your application.

Best way to maintain an ordered list in PostgreSQL?

4 Answers4