Unique constraint for permutations across multiple columns

Question

Given the following three columns in a Postgres database: first, second, third; how can I create a constraint such that permutations are unique?

E.g. If ('foo', 'bar', 'shiz') exist in the db, ('bar', 'shiz', 'foo') would be excluded as non-unique.

Can your columns be NULL? Can they be empty? Define "unique" then. Your Postgres version? And I don't see a table definition for testing in your question? — Erwin Brandstetter, Aug 21 '13 at 14:57
You might want to upgrade to 9.2.4 to snuff out the security vulnerability patched in 9.2.4. More details at http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-1899 — bma, Aug 21 '13 at 15:07

Jakub Kania · Accepted Answer · 2013-08-21T16:01:16.723

6

You could use hstore to create the unique index:

CREATE UNIQUE INDEX hidx ON test USING BTREE (hstore(ARRAY[a,b,c], ARRAY[a,b,c]));

Fiddle

UPDATE

Actually

CREATE UNIQUE INDEX hidx ON test USING BTREE (hstore(ARRAY[a,b,c], ARRAY[null,null,null]));

might be a better idea since it will work the same but should take less space (fiddle).

edited Aug 21 '13 at 16:01

answered Aug 21 '13 at 15:08

Jakub Kania

15,665
2
37
47

Erwin Brandstetter · Answer 2 · 2019-06-21T01:25:40.653

For only three columns this unique index using only basic expressions should perform very well. No additional modules like hstore or custom function needed:

CREATE UNIQUE INDEX t_abc_uni_idx ON t (
  LEAST(a,b,c)
, GREATEST(LEAST(a,b), LEAST(b,c), LEAST(a,c))
, GREATEST(a,b,c)
);

SQL fiddle

Also needs the least disk space:

SELECT pg_column_size(row(hstore(t))) AS hst_row
      ,pg_column_size(row(hstore(ARRAY[a,b,c], ARRAY[a,b,c]))) AS hst1
      ,pg_column_size(row(hstore(ARRAY[a,b,c], ARRAY[null,null,null]))) AS hst2
      ,pg_column_size(row(ARRAY[a,b,c])) AS arr
      ,pg_column_size(row(LEAST(a,b,c)
                        , GREATEST(LEAST(a,b), LEAST(b,c), LEAST(a,c))
                        , GREATEST(a,b,c))) AS columns
FROM t;

 hst_row | hst1 | hst2 | arr | columns
---------+------+------+-----+---------
      59 |   59 |   56 |  69 |      30

Numbers are bytes for index row in the example in the fiddle, measured with pg_column_size(). My example uses only single characters, the difference in size is constant.

Clever use of GREATEST and LEAST. – bma Aug 21 '13 at 15:35 — bma, Aug 21 '13 at 15:35

score 3 · Answer 3 · edited Nov 03 '14 at 21:15

3

You can do this by creating a unique index on a function which returns a sorted array of the values in the columns:

CREATE OR REPLACE FUNCTION sorted_array(anyarray)
RETURNS anyarray
AS $BODY$
  SELECT array_agg(x) FROM (SELECT unnest($1) AS x FROM test ORDER BY x) AS y;
$BODY$
LANGUAGE sql IMMUTABLE;

CREATE UNIQUE index ON test (sorted_array(array[first,second,third]));

edited Nov 03 '14 at 21:15

Erwin Brandstetter

605,456
145
1,078
1,228

answered Aug 21 '13 at 15:00

qqx

18,947
4
64
68

Might be worth checking what the performance impact of a couple of nested CASE expressions to manually sort would be too. With only 3 columns that should be viable. Might not be worth the extra time though. – Richard Huxton Aug 21 '13 at 15:07
Richard, performance not a driver here. Small app, maybe 20-100 inserts a day. – KevDog Aug 21 '13 at 15:09

score 0 · Answer 4 · answered Aug 21 '13 at 14:59

Suggestion from co-worker, variation of @julien's idea:

Sort the terms alphabetically and place a delimiter on either side of each term. Concatenate them and place them in a separate field that becomes the primary key.

Why the delimiter? So that, "a", "aa", "aaa" and "aa", "aa", "aa" can both be inserted.

Unique constraint for permutations across multiple columns

4 Answers4

UPDATE

Linked