As I understand it, you want a UNIQUE
index over a
and b
combined.
You update narrowed it down (b
shall not exist in a
). This solution is stricter.
Solution
After trying and investigating quite a bit (see below!) I came up with this:
ALTER TABLE tbl ADD CONSTRAINT a_not_equal_b CHECK (a <> b);
ALTER TABLE tbl ADD CONSTRAINT ab_unique
EXCLUDE USING gist ((ARRAY[hashtext(COALESCE(a, ''))
, hashtext(COALESCE(b, ''))]) gist__int_ops WITH &&);
db<>fiddle here
Since the exclusion constraint won't currently (pg 12) work with text[]
, I work with int4[]
of hash values. hashtext()
is the built-in hash function that's also used for hash-partitioning (among other uses). Seems perfect for the job.
The operator class gist__int_ops
is provided by the additional module intarray, which has to be installed once per database. Its optional, the solution works with the default array operator class as well. Just drop gist__int_ops
to fall back. But intarray is faster. Related:
Caveats
int4
may not be big enough to rule out hash collisions sufficiently. You might want to go with bigint
instead. But that's more expensive and can't use the gist__int_ops
operator class to improve performance. Your call.
Unicode has the dismal property that equal strings can be encoded in different ways. If you work with Unicode (typical encoding UTF8) and use non-ASCII characters (and this matters to you), compare normalized forms to rule out such duplicates. The upcoming Postgres 13 adds the function normalize()
for that purpose. This is a general caveat of character type duplicates, though, not specific to my solution.
NULL
value is allowed, but collides with empty string (''
). I would rather go with NOT NULL
columns and drop COALESCE()
from the expressions.
Obstacle course to an exclusion constraint
My first thought was: exclusion constraint. But it falls through:
ALTER TABLE tbl ADD CONSTRAINT ab_unique EXCLUDE USING gist ((ARRAY[a,b]) WITH &&);
ERROR: data type text[] has no default operator class for access method "gist"
HINT: You must specify an operator class for the index or define a default operator class for the data type.
There is an open TODO item for this. Related:
But can't we use a GIN index for text[]
? Alas, no:
ALTER TABLE tbl ADD CONSTRAINT ab_unique EXCLUDE USING gin ((ARRAY[a,b]) WITH &&);
ERROR: access method "gin" does not support exclusion constraints
Why? The manual:
The access method must support amgettuple
(see Chapter 61); at present this means GIN cannot be used.
It seems hard to implement, so don't hold your breath.
If a
and b
were integer
columns, we could make it work with an integer array:
ALTER TABLE tbl ADD CONSTRAINT ab_unique EXCLUDE USING gist ((ARRAY[a,b]) WITH &&);
Or with the gist__int_ops
operator class from the additional module intarray (typically faster):
ALTER TABLE tbl ADD CONSTRAINT ab_unique EXCLUDE USING gist ((ARRAY[a,b]) gist__int_ops WITH &&);
To also forbid duplicates within the same row, add a CHECK
constraint:
ALTER TABLE tbl ADD CONSTRAINT a_not_equal_b CHECK (a <> b);
Remaining issue: Does work with NULL
values.
Workaround
Add a helper table to store values from a
and b
in one column:
CREATE TABLE tbl_ab(ab text PRIMARY KEY);
Main table, like you had it, plus FK constraints.
CREATE TABLE tbl (
a text REFERENCES tbl_ab ON UPDATE CASCADE ON DELETE CASCADE
, b text REFERENCES tbl_ab ON UPDATE CASCADE ON DELETE CASCADE
);
Use a function like this to INSERT
:
CREATE OR REPLACE FUNCTION f_tbl_insert(_a text, _b text)
RETURNS void
LANGUAGE sql AS
$func$
WITH ins_ab AS (
INSERT INTO tbl_ab(ab)
SELECT _a WHERE _a IS NOT NULL -- NULL is allowed (?)
UNION ALL
SELECT _b WHERE _b IS NOT NULL
)
INSERT INTO tbl(a,b)
VALUES (_a, _b);
$func$;
db<>fiddle here
Or implement a trigger to take care of it in the background.
CREATE OR REPLACE FUNCTION trg_tbl_insbef()
RETURNS trigger AS
$func$
BEGIN
INSERT INTO tbl_ab(ab)
SELECT NEW.a WHERE NEW.a IS NOT NULL -- NULL is allowed (?)
UNION ALL
SELECT NEW.b WHERE NEW.b IS NOT NULL;
RETURN NEW;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER tbl_insbef
BEFORE INSERT ON tbl
FOR EACH ROW EXECUTE PROCEDURE trg_tbl_insbef();
db<>fiddle here
NULL
handling can be changed as desired.
Either way, while the added (optional) FK constraints enforce that we can't sidestep the helper table tbl_ab
, and allow to UPDATE
and DELETE
in tbl_ab
to cascade, you still need to project UPDATE
and DELETE
into the helper table as well (or implement more triggers). Tricky corner cases, but there are solutions. Not going into this, after I found the solution above with an exclusion constraint using hashtext()
...
Related: