6

Does PostgreSQL 9.2+ provide any functionality to make it possible to generate a sequence that is namespaced to a particular value? For example:

 .. | user_id | seq_id | body | ...
 ----------------------------------
  - |    4    |   1    |  "abc...."
  - |    4    |   2    |  "def...."
  - |    5    |   1    |  "ghi...."
  - |    5    |   2    |  "xyz...."
  - |    5    |   3    |  "123...."

This would be useful to generate custom URLs for the user:

domain.me/username_4/posts/1    
domain.me/username_4/posts/2

domain.me/username_5/posts/1
domain.me/username_5/posts/2
domain.me/username_5/posts/3

I did not find anything in the PG docs (regarding sequence and sequence functions) to do this. Are sub-queries in the INSERT statement or with custom PG functions the only other options?

Erwin Brandstetter
dgo.a

4 Answers

2

You can use a subquery in the INSERT statement like @Clodoaldo demonstrates. However, this defeats the nature of a sequence as being safe to use in concurrent transactions: it will result in race conditions and, eventually, duplicate key violations.

You should rather rethink your approach: use just one plain sequence for the whole table and combine it with user_id to get the sort order you want.

You can always generate the custom URLs with the desired numbers using row_number() with a simple query like:

SELECT format('domain.me/username_%s/posts/%s'
            , user_id
            , row_number() OVER (PARTITION BY user_id ORDER BY seq_id)
             )
FROM   tbl;
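For reference, the single-sequence design this advice implies might look like the following (table and column names are illustrative, not taken from the question):

```sql
-- A sketch of the suggested layout: one ordinary sequence for the whole
-- table; the per-user numbering is derived at query time, never stored.
CREATE TABLE tbl (
    seq_id  serial PRIMARY KEY,   -- one global sequence, concurrency-safe
    user_id integer NOT NULL,
    body    text
);
```

Since the per-user numbers are computed with row_number(), gaps in the underlying sequence (from rolled-back transactions) never show up in the generated URLs.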


Erwin Brandstetter
1

Maybe this answer is a little off-piste, but I would consider partitioning the data and giving each user their own partitioned table for posts.

There's a bit of overhead to the setup, as you will need triggers to manage the DDL statements for the partitions, but it would effectively give each user their own table of posts, along with their own sequence, with the added benefit of still being able to treat all posts as one big table.

General gist of the concept...

psql# CREATE TABLE posts (user_id integer, seq_id integer);
CREATE TABLE

psql# CREATE TABLE posts_001 (seq_id serial) INHERITS (posts);
CREATE TABLE

psql# CREATE TABLE posts_002 (seq_id serial) INHERITS (posts);
CREATE TABLE

psql# INSERT INTO posts_001 VALUES (1);
INSERT 0 1

psql# INSERT INTO posts_001 VALUES (1);
INSERT 0 1

psql# INSERT INTO posts_002 VALUES (2);
INSERT 0 1

psql# INSERT INTO posts_002 VALUES (2);
INSERT 0 1

psql# select * from posts;
 user_id | seq_id 
---------+--------
       1 |      1
       1 |      2
       2 |      1
       2 |      2
(4 rows)

I left out some rather important CHECK constraints in the above setup; make sure you read the docs for how these kinds of setups are used.
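For the record, the omitted constraints would look roughly like this, assuming each child table holds exactly one user's posts as above:

```sql
-- Hypothetical CHECK constraints: they pin each child table to one user_id,
-- which also lets the planner skip irrelevant partitions (constraint exclusion).
ALTER TABLE posts_001 ADD CONSTRAINT posts_001_user_chk CHECK (user_id = 1);
ALTER TABLE posts_002 ADD CONSTRAINT posts_002_user_chk CHECK (user_id = 2);
```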

Chris Farmiloe
  • Nice idea. +1 However, inheritance does *not* play well with foreign keys, for instance. – Erwin Brandstetter May 09 '13 at 21:57
  • @Erwin One table per user? Each table is a file in a directory. A very large number of files in a directory can have consequences. I would not try that in a serious application. – Clodoaldo Neto May 10 '13 at 13:46
  • True. partitioning is not really recommended for many thousands of partitions. However not because of large numbers of files (unless you are on a pretty outdated file system), but because the query planner will start to struggle. – Chris Farmiloe May 10 '13 at 13:56
0
insert into t (user_id, seq_id) values
(4, (select coalesce(max(seq_id), 0) + 1 from t where user_id = 4))

Check for a duplicate primary key error in the front end and retry if needed.
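If the front end also needs the number that was just assigned (for building the URL, say), the same statement can return it; a sketch using the table from this answer:

```sql
-- Hypothetical: RETURNING hands back the computed seq_id in the same
-- round trip, so no follow-up SELECT is needed.
INSERT INTO t (user_id, seq_id)
VALUES (4, (SELECT coalesce(max(seq_id), 0) + 1 FROM t WHERE user_id = 4))
RETURNING seq_id;
```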

Update

Although @Erwin's advice is sensible, that is, a single sequence with the ordering done in the SELECT query, it can be expensive.

If you don't use a sequence, there is no defeat of the nature of the sequence, and it will not result in duplicate key violations. To demonstrate it I created a table and wrote a Python script to insert into it. I launched 3 parallel instances of the script, inserting as fast as possible, and it just works.

The table must have a primary key on those columns:

create table t (
    user_id int,
    seq_id int,
    primary key (user_id, seq_id)
);

The python script:

#!/usr/bin/env python

import psycopg2, psycopg2.extensions

query = """
    begin;
    insert into t (user_id, seq_id) values
    (4, (select coalesce(max(seq_id), 0) + 1 from t where user_id = 4));
    commit;
"""

conn = psycopg2.connect('dbname=cpn user=cpn')
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
cursor = conn.cursor()

for i in range(0, 1000):

    while True:
        try:
            cursor.execute(query)
            break
        except psycopg2.IntegrityError as e:
            print(e.pgerror)
            cursor.execute("rollback;")

cursor.close()
conn.close()

After the parallel run:

select count(*), max(seq_id) from t;
 count | max  
-------+------
  3000 | 3000

Just as expected. I developed at least two applications using that logic, and one of them is more than 13 years old and has never failed. I concede that if you are Facebook or some other giant then you could have a problem.

Clodoaldo Neto
  • Note the downvote was not mine. It's not the route I would advice, but it's a perfectly valid answer. Actually +1, since you put in some more effort now. – Erwin Brandstetter May 10 '13 at 15:52
-2

Yes:

CREATE TABLE your_table
(
    column_name type DEFAULT nextval('sequence_name'),
    ...
);

More details here: http://www.postgresql.org/docs/9.2/static/ddl-default.html
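A concrete version of that skeleton (all names are illustrative). Note, though, that a plain sequence numbers rows globally across all users, which is not the per-user numbering the question asks for:

```sql
-- Illustrative: a column DEFAULT backed by an explicit sequence.
CREATE SEQUENCE posts_seq;
CREATE TABLE posts (
    post_id integer DEFAULT nextval('posts_seq'),
    body    text
);
-- The serial pseudo-type is shorthand for this same arrangement.
```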

Federico Razzoli