378

I have a table with this layout:

CREATE TABLE Favorites (
  FavoriteId uuid NOT NULL PRIMARY KEY,
  UserId uuid NOT NULL,
  RecipeId uuid NOT NULL,
  MenuId uuid
);

I want to create a unique constraint similar to this:

ALTER TABLE Favorites
ADD CONSTRAINT Favorites_UniqueFavorite UNIQUE(UserId, MenuId, RecipeId);

However, this will allow multiple rows with the same (UserId, RecipeId), if MenuId IS NULL. I want to allow NULL in MenuId to store a favorite that has no associated menu, but I only want at most one of these rows per user/recipe pair.
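
To illustrate the problem, here is a minimal sketch. The UUID values are made up, and it assumes any tables referenced by UserId and RecipeId contain matching rows. With the constraint above in place, both inserts go through, because the two NULL values in MenuId are not treated as equal:

```sql
-- Hypothetical UUIDs, chosen only for illustration.
INSERT INTO Favorites (FavoriteId, UserId, RecipeId, MenuId)
VALUES ('00000000-0000-0000-0000-000000000001',
        '00000000-0000-0000-0000-0000000000aa',
        '00000000-0000-0000-0000-0000000000bb',
        NULL);

-- Same (UserId, RecipeId) and MenuId is NULL again:
-- the UNIQUE constraint does not reject this row.
INSERT INTO Favorites (FavoriteId, UserId, RecipeId, MenuId)
VALUES ('00000000-0000-0000-0000-000000000002',
        '00000000-0000-0000-0000-0000000000aa',
        '00000000-0000-0000-0000-0000000000bb',
        NULL);
```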

The ideas I have so far are:

  1. Use some hard-coded UUID (such as all zeros) instead of null.
    However, MenuId has a FK constraint on each user's menus, so I'd then have to create a special "null" menu for every user which is a hassle.

  2. Check for existence of a null entry using a trigger instead.
    I think this is a hassle and I like avoiding triggers wherever possible. Plus, I don't trust them to guarantee my data is never in a bad state.

  3. Just forget about it and check for the previous existence of a null entry in the middleware or in an insert function, and don't have this constraint.

I'm using Postgres 9.0. Is there any method I'm overlooking?

Erwin Brandstetter
Mike Christensen
  • Why is it that this will allow multiple rows with the same (`UserId`, `RecipeId`), if `MenuId IS NULL`? – Drux Jul 22 '18 at 09:29
  • 1
    @Drux I believe that since `Null != Null`, it follows that `(userid, recipeid, null) != (userid, recipeid, null)`. So duplicates will be allowed that look identical to us, but don't compare equal to postgresql. – Jonathan Hartley Oct 22 '20 at 15:06

5 Answers

586

Postgres 15 or newer

Postgres 15 adds the clause NULLS NOT DISTINCT. The release notes:

  • Allow unique constraints and indexes to treat NULL values as not distinct (Peter Eisentraut)

    Previously NULL values were always indexed as distinct values, but this can now be changed by creating constraints and indexes using UNIQUE NULLS NOT DISTINCT.

With this clause null is treated like just another value, and a UNIQUE constraint does not allow more than one row with the same null value. The task is simple now:

ALTER TABLE favorites
ADD CONSTRAINT favo_uni UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id);

There are examples in the manual chapter "Unique Constraints".
The clause switches behavior for all keys of the same index. You can't treat null as equal for one key, but not for another.
NULLS DISTINCT remains the default (in line with standard SQL) and does not have to be spelled out.

The same clause works for a UNIQUE index, too:

CREATE UNIQUE INDEX favo_uni_idx
ON favorites (user_id, menu_id, recipe_id) NULLS NOT DISTINCT;

Note the position of the new clause after the key fields.
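
A minimal sketch of the resulting behavior, using a scratch table and made-up UUIDs (favorites_demo is a hypothetical name, not part of the question's schema). It needs Postgres 15 or newer:

```sql
-- Scratch table for illustration only; the constraint is declared inline.
CREATE TEMP TABLE favorites_demo (
  user_id   uuid NOT NULL,
  recipe_id uuid NOT NULL,
  menu_id   uuid,
  UNIQUE NULLS NOT DISTINCT (user_id, menu_id, recipe_id)
);

-- First favorite without a menu: accepted.
INSERT INTO favorites_demo (user_id, recipe_id, menu_id)
VALUES ('00000000-0000-0000-0000-0000000000aa',
        '00000000-0000-0000-0000-0000000000bb',
        NULL);

-- Same user and recipe, menu_id NULL again:
-- expected to fail with a unique violation, since nulls now count as equal.
INSERT INTO favorites_demo (user_id, recipe_id, menu_id)
VALUES ('00000000-0000-0000-0000-0000000000aa',
        '00000000-0000-0000-0000-0000000000bb',
        NULL);
```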

Postgres 14 or older

Create two partial indexes:

CREATE UNIQUE INDEX favo_3col_uni_idx ON favorites (user_id, menu_id, recipe_id)
WHERE menu_id IS NOT NULL;

CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, recipe_id)
WHERE menu_id IS NULL;

This way, there can only be one combination of (user_id, recipe_id) where menu_id IS NULL, effectively implementing the desired constraint.
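
As a sketch of how the two indexes divide the work, the following uses a scratch table with made-up UUIDs (the table and index names ending in _demo are hypothetical):

```sql
-- Scratch table for illustration only.
CREATE TEMP TABLE favorites_demo (
  user_id   uuid NOT NULL,
  recipe_id uuid NOT NULL,
  menu_id   uuid
);

CREATE UNIQUE INDEX favo_3col_uni_idx_demo ON favorites_demo (user_id, menu_id, recipe_id)
WHERE menu_id IS NOT NULL;

CREATE UNIQUE INDEX favo_2col_uni_idx_demo ON favorites_demo (user_id, recipe_id)
WHERE menu_id IS NULL;

-- Accepted: first favorite without a menu.
INSERT INTO favorites_demo VALUES
  ('00000000-0000-0000-0000-0000000000aa', '00000000-0000-0000-0000-0000000000bb', NULL);

-- Accepted: same user and recipe, but attached to a menu,
-- so only the three-column index applies.
INSERT INTO favorites_demo VALUES
  ('00000000-0000-0000-0000-0000000000aa', '00000000-0000-0000-0000-0000000000bb',
   '00000000-0000-0000-0000-0000000000c1');

-- Expected to fail on favo_2col_uni_idx_demo: second menu-less row for the same pair.
INSERT INTO favorites_demo VALUES
  ('00000000-0000-0000-0000-0000000000aa', '00000000-0000-0000-0000-0000000000bb', NULL);

-- Expected to fail on favo_3col_uni_idx_demo: exact duplicate of the row with a menu.
INSERT INTO favorites_demo VALUES
  ('00000000-0000-0000-0000-0000000000aa', '00000000-0000-0000-0000-0000000000bb',
   '00000000-0000-0000-0000-0000000000c1');
```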

Possible drawbacks:

  • You cannot have a foreign key referencing (user_id, menu_id, recipe_id). (It seems unlikely you'd want a FK reference three columns wide - use the PK column instead!)
  • You cannot base CLUSTER on a partial index.
  • Queries without a matching WHERE condition cannot use the partial index.

If you need a complete index, you can alternatively drop the WHERE condition from favo_3col_uni_idx and your requirements are still enforced.
The index, now comprising the whole table, overlaps with the other one and gets bigger. Depending on typical queries and the percentage of null values, this may or may not be useful. In extreme situations it may even help to maintain all three indexes (the two partial ones and a total on top).

This is a good solution for a single nullable column, maybe for two. But it gets out of hand quickly for more, as you need a separate partial index for every combination of nullable columns, so the number grows binomially. For multiple nullable columns, see instead:

Aside: I advise not to use mixed case identifiers in PostgreSQL.

Erwin Brandstetter
  • 1
    @Erwin Brandsetter: regarding the "*mixed case identifiers*" remark: As long as no double quotes are used, using mixed cased identifiers is absolutely fine. There is no difference in using all lowercase identifiers (again: **only** if no quotes are used) –  Nov 27 '11 at 22:07
  • 18
    @a_horse_with_no_name: I assume you know that I know that. That is actually one of the reasons I advise **against** its use. People who do not know the specifics so well get confused, as in other RDBMS identifiers are (partly) case sensitive. Sometimes people confuse themselves. Or they build *dynamic SQL* and use *quote_ident()* as they should and forget to pass identifiers as lower case strings now! Do not use mixed case identifiers in PostgreSQL, if you can avoid it. I have seen a number of desperate requests here stemming from this folly. – Erwin Brandstetter Nov 27 '11 at 22:19
  • 3
    @a_horse_with_no_name: Yes, that is of course true. But if you can avoid them: *you don't want mixed case identifiers*. They serve no purpose. If you can avoid them: don't use them. Besides: they are just plain ugly. Quoted identifiers are ugly, too. SQL92 identifiers with spaces in them are a misstep made by a committee. Don't use them. – wildplasser Nov 27 '11 at 22:20
  • @wildplasser: *"they serve no purpose"*: Avoiding underscores :) – ypercubeᵀᴹ Nov 27 '11 at 22:32
  • Well you need *two* double quotes just to avoid *one* underscore ;-) – wildplasser Nov 27 '11 at 22:38
  • I think it's personal preference. I avoid using quotes always, thus internally in my database everything is lowercase which is what I like. I kinda wish PG had just been case-sensitive everywhere to begin with, but I assume that'll never change :) – Mike Christensen Nov 27 '11 at 22:50
  • @MikeChristensen: not likely to change, no. :) – Erwin Brandstetter Nov 27 '11 at 22:53
  • 3
    @Mike: I think you'd have to talk to the SQL standards committee about that, good luck :) – mu is too short Nov 27 '11 at 22:57
  • @ErwinBrandstetter Great solution. Is the performance & storage requirement of 2 partial indexes comparable to that of a single full index? – user Oct 15 '14 at 19:02
  • 1
    @buffer: Maintenance cost and total storage are basically the same (except for a minor fixed overhead per index). Each row is only represented in one index. Performance: If your results span both cases, an additional total plain index may pay. If not, a partial index is typically faster than a complete index, mainly due to the smaller size. Add the index condition to queries (redundantly) if Postgres doesn't figure out it can use a partial index by itself. [Example.](http://stackoverflow.com/questions/26030354/postgresql-does-not-use-a-partial-index/26031289#26031289) – Erwin Brandstetter Oct 15 '14 at 23:20
  • 5
    Do we really need the `WHERE menu_id IS NOT NULL;` in the first index for the non-null case? Isn't just `CREATE UNIQUE INDEX favorites_3col_uni_idx ON favorites (user_id, menu_id, recipe_id)` the same thing? – Marcus Junius Brutus Nov 06 '14 at 13:58
  • 3
    @MarcusJuniusBrutus: It's a possible alternative, still enforcing the partial uniqueness. It's *not* the same thing though, as the index is over the whole table and therefore bigger. Depending on data distribution and requirements, it may be a good idea or not. It's even possible that all three variants serve their purpose. – Erwin Brandstetter Nov 06 '14 at 14:58
  • Great solution as always, @Erwin Brandsetter! Is there any way to do this with a deferrable unique constraint, though? `ADD CONSTRAINT .. UNIQUE USING INDEX` doesn't allow partial indexes. – EM0 Apr 05 '15 at 18:04
  • @EM: Might be worth another *question*, you can always reference this one for context. – Erwin Brandstetter Apr 05 '15 at 20:11
  • 1
    I think I figured out a solution: use an EXCLUDE constraint with "WITH =" as the operator and COALESCE for the nullable column, eg `EXCLUDE (col1 WITH =, col2 WITH =, COALESCE(nullable_int_col, -1) WITH =) DEFERRABLE` That seems to do the trick! – EM0 Apr 06 '15 at 19:33
  • Plural of "index" is "indices" not "indexes" – Toby 1 Kenobi Sep 03 '19 at 05:27
  • 4
    @Toby1Kenobi: The Latin plural is. But the English plural is more common. – Erwin Brandstetter Sep 03 '19 at 10:35
  • 1
    @ErwinBrandstetter you're right - I've only lived in England, Australia and India, so I didn't realise that in North America "indexes" is acceptable and indeed the more common form there, until I just read up about it. Having said that I also notice that Ruby on Rails agrees with the international usage: `'index'.pluralize` outputs `indices` – Toby 1 Kenobi Sep 03 '19 at 11:38
  • 2
    Is it possible to `insert on conflict do update` with multiple unique partial indices? As far as I know, only one conflict target can be specified. – No_name Jul 15 '20 at 07:32
  • @No_name looks like it doesn't, see https://stackoverflow.com/questions/46730909/postgresql-partial-unique-index-and-upsert – Pavlus Oct 29 '20 at 17:01
  • @No_name: No, that's currently not possible. For `DO UPDATE`, a (single!) conflict target must be provided, just like you stated. To cover violations from multiple unique constraints / indices, only `DO NOTHING` is possible. See: https://stackoverflow.com/a/42217872/939860 – Erwin Brandstetter Oct 29 '20 at 17:11
  • In my case (with PostgreSQL 12.3) this was not working as expected, but when creating the second index slightly differently using `(menu_id IS NULL)` as column definition it was working: `CREATE UNIQUE INDEX favo_2col_uni_idx ON favorites (user_id, (menu_id IS NULL), recipe_id) WHERE menu_id IS NULL;` – mdt Jul 05 '22 at 06:27
  • @mdt `(menu_id IS NULL)` as index column is utterly useless. There is a misunderstanding somewhere. – Erwin Brandstetter Jul 06 '22 at 23:22
123

You could create a unique index with a coalesce on the MenuId:

CREATE UNIQUE INDEX
Favorites_UniqueFavorite ON Favorites
(UserId, COALESCE(MenuId, '00000000-0000-0000-0000-000000000000'), RecipeId);

You'd just need to pick a UUID for the COALESCE that will never occur in "real life". You'd probably never see a zero UUID in real life but you could add a CHECK constraint if you are paranoid (and since they really are out to get you...):

alter table Favorites
add check
(MenuId <> '00000000-0000-0000-0000-000000000000');
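
One thing to keep in mind with an expression index: a condition on MenuId can only be matched against the index if the query repeats the indexed expression. A sketch, with made-up UUIDs for the lookup values:

```sql
-- Hypothetical ids. The COALESCE(...) expression is written exactly as in the
-- index definition so the planner can match it against Favorites_UniqueFavorite.
-- A plain "MenuId = ..." predicate would not match the second index column.
SELECT FavoriteId
FROM   Favorites
WHERE  UserId = '00000000-0000-0000-0000-0000000000aa'
AND    COALESCE(MenuId, '00000000-0000-0000-0000-000000000000')
       = '00000000-0000-0000-0000-0000000000c1'
AND    RecipeId = '00000000-0000-0000-0000-0000000000bb';
```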
mu is too short
  • 2
    This carries the (theoretical) flaw that an entry with menu_id = '00000000-0000-0000-0000-000000000000' can trigger false unique violations - but you already addressed that in your comment. – Erwin Brandstetter Nov 27 '11 at 21:49
  • 2
    I don't think any UUID generation algorithm in existence would come up with that UUID anyway :) Very creative solution. – Mike Christensen Nov 27 '11 at 22:04
  • 2
    @Erwin: Yes, every sentinel based solution suffers that problem, a UUID environment is one of the few places I'd consider it safe enough to use. If you wanted to be paranoid (highly recommended) then `CHECK (MenuId is null or MenuId <> '00000000-0000-0000-0000-000000000000')` could be added. – mu is too short Nov 27 '11 at 23:01
  • 5
    @muistooshort: Yup, that is a proper solution. Simplify to `(MenuId <> '00000000-0000-0000-0000-000000000000')` though. `NULL` is allowed by default. Btw, there is three kinds of people. The paranoid ones, and people who don't do databases. The third kind occasionally posts questions on SO in bewilderment. ;) – Erwin Brandstetter Nov 27 '11 at 23:11
  • 4
    @Erwin: Don't you mean "the paranoid ones and the ones with broken databases"? – mu is too short Nov 27 '11 at 23:18
  • 2
    Which causes the bewilderment of type 3. :) – Erwin Brandstetter Nov 27 '11 at 23:30
  • 5
    This excellent solution makes it very easy to include a null column of a simpler type, such as integer, in a unique constraint. – Markus Pscheidt Sep 25 '15 at 08:53
  • 11
    It's true that a UUID generator won't come up with that particular string, not only because of the probabilities involved, but also because it's *not a valid UUID*. A UUID generator is not free to use any hex digit in any position, for example one position is reserved for the version number of the UUID. – Toby 1 Kenobi Sep 03 '19 at 05:35
  • 11
    This idea is way simpler and removes the combinatorial problem of multiple nullable fields requiring 2^n partial indexes. This should be the accepted answer. – SDReyes Dec 06 '19 at 15:43
  • 1
    I don't see this working if there is no valid "placeholder" to represent the null value. Eg I have a time data type column that I want to apply this to, and all possible non-null values for that column are valid. Ie there is no `'00000000-0000-0000-0000-000000000000'` equivalent. So I'll use Erwin's solution. – poshest Sep 28 '21 at 11:35
  • 1
    @poshest Yes, all sentinel solutions suffer from this problem. Going with Erwin's solution (any of his really) will almost never be the wrong choice. – mu is too short Sep 28 '21 at 16:57
  • 1
    A big downside to this suggestion is that a query such as `SELECT UserID from Favorites where UserID = '123e4567-e89b-12d3-a456-426655440000' and MenuId = '123e4567-e89b-12d3-a456-426655440000'` won't be able to use this index due to the `COALESCE` – Nikolai B May 20 '22 at 14:49
2

You can store favourites with no associated menu in a separate table:

CREATE TABLE FavoriteWithoutMenu
(
  FavoriteWithoutMenuId uuid NOT NULL PRIMARY KEY,
  UserId uuid NOT NULL,
  RecipeId uuid NOT NULL,
  UNIQUE (UserId, RecipeId)
);
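
To read all favourites in one query, the two tables could be combined in a view. This is only a sketch: AllFavorites is a hypothetical name, and it assumes Favorites now holds only the rows that have a menu.

```sql
-- Hypothetical view name; favourites without a menu show up with MenuId = NULL.
CREATE VIEW AllFavorites AS
SELECT UserId, RecipeId, MenuId
FROM   Favorites
UNION ALL
SELECT UserId, RecipeId, NULL::uuid AS MenuId
FROM   FavoriteWithoutMenu;
```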
ypercubeᵀᴹ
  • An interesting idea. It makes inserting a bit more complicated. I would need to check if a row already exists in `FavoriteWithoutMenu` first. If so, I just add a menu link - otherwise I create the `FavoriteWithoutMenu` row first and then link it to a menu if necessary. It also makes selecting all the favorites in one query very difficult: I'd have to do something weird like select all the menu links first, and then select all the Favorites whose IDs don't exist within the first query. I'm not sure if I like that. – Mike Christensen Nov 27 '11 at 21:40
  • I don't think inserting is more complicated. If you want to insert a record with `NULL MenuId`, you insert into this table. If not, to the `Favorites` table. But querying, yes, it will be more complicated. – ypercubeᵀᴹ Nov 27 '11 at 21:43
  • Actually scratch that, selecting all favorites would just be a single LEFT join to get the menu. Hmm yea this might be the way to go.. – Mike Christensen Nov 27 '11 at 21:44
  • The INSERT becomes more complicated if you want to add the same recipe to more than one menu, since you have a UNIQUE constraint on UserId/RecipeId on FavoriteWithoutMenu. I'd need to create this row only if it didn't exist already. – Mike Christensen Nov 27 '11 at 21:48
  • We might be talking about two different things. I interpreted your answer as normalizing the relationships between favorites and menus. Favorites would be the set of recipes a user has favorited. Menus would be an optional link between a favorite and a menu. Null menus would simply have no row in the Menus table. – Mike Christensen Nov 27 '11 at 21:50
  • Yes, essentially my idea is to normalize the tables. If you can supply an overview of the tables and relationships you have, it would help us understand better. – ypercubeᵀᴹ Nov 27 '11 at 21:55
  • 1
    Thanks! This answer deserves a +1 since it's more of a cross-database pure SQL thing.. However, in this case I'm gonna go the partial index route because it requires no changes to my schema and I like it :) – Mike Christensen Nov 27 '11 at 22:03
  • Use `VIEW` to unify both tables when doing `SELECT` – Evgeny Nozdrev Jul 06 '21 at 10:45
  • It is useful when creating new table. But it is quite impossible to do in existing table where already null value is available. – Raju Ahmed Oct 26 '22 at 13:03
0

I believe there is an option that combines the previous answers into a more optimal solution.

create table unique_with_nulls (
    id serial not null,
    name varchar not null,
    age int2 not null,
    email varchar,
    email_discriminator varchar not null generated always as ( coalesce(email::varchar, 0::varchar) ) stored,

    constraint uwn_pkey primary key (id)
);

create unique index uwn_name_age_email_uidx on unique_with_nulls(name, age, email_discriminator);

What happens here is that the column email_discriminator will be generated at "insert-or-update-time", as either an actual email, or "0" if the former one is null. Then, your unique index must target the discriminator column.
This way we don't have to create two partial indexes, and we don't lose the ability to use index scans when selecting on name and age only.
Also, you can keep the type of the email column and we don't have any problems with the coalesce function, because email_discriminator is not a foreign key. And you don't have to worry about this column receiving unexpected values because generated columns cannot be written to.

I can see three opinionated drawbacks in this solution, but they are all fine for my needs:

  • the duplication of data between the email and email_discriminator.
  • the fact that I must write to a column and read from another.
  • the need to find a value that is outside the set of acceptable values of email to be the fallback one (and sometimes this could be hard to find or even subjective).
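
A short usage sketch against the table defined above, with made-up sample values. Generated columns need Postgres 12 or newer:

```sql
-- Accepted: email is NULL, so email_discriminator is generated as '0'.
INSERT INTO unique_with_nulls (name, age, email) VALUES ('Ann', 30, NULL);

-- Expected to fail on uwn_name_age_email_uidx:
-- same name and age, and the discriminator is '0' again.
INSERT INTO unique_with_nulls (name, age, email) VALUES ('Ann', 30, NULL);

-- Accepted: a real email yields a different discriminator value.
INSERT INTO unique_with_nulls (name, age, email) VALUES ('Ann', 30, 'ann@example.com');

-- Write to email, read either column back.
SELECT name, age, email, email_discriminator FROM unique_with_nulls;
```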
Teodoro
-2

I think there is a semantic problem here. In my view, a user can have a (but only one) favourite recipe to prepare a specific menu. (The OP has menu and recipe mixed up; if I am wrong: please interchange MenuId and RecipeId below) That implies that {user,menu} should be a unique key in this table. And it should point to exactly one recipe. If the user has no favourite recipe for this specific menu no row should exist for this {user,menu} key pair. Also: the surrogate key (FaVouRiteId) is superfluous: composite primary keys are perfectly valid for relational-mapping tables.

That would lead to the reduced table definition:

CREATE TABLE Favorites
( UserId uuid NOT NULL REFERENCES users(id)
, MenuId uuid NOT NULL REFERENCES menus(id)
, RecipeId uuid NOT NULL REFERENCES recipes(id)
, PRIMARY KEY (UserId, MenuId)
);
wildplasser
  • 2
    Yea this is right. Except, in my case I want to support having a favorite that doesn't belong to any menu. Imagine it like your Bookmarks in your browser. You might just "bookmark" a page. Or, you could create sub-folders of bookmarks and title them different things. I want to allow users to favorite a recipe, or create sub-folders of favorites called menus. – Mike Christensen Nov 27 '11 at 23:32
  • 1
    As I said: it is all about semantics. (I was thinking about food, obviously) Having a favourite "that does not belong to any menu" makes no sense to me. You cannot favour something that does not exist, IMHO. – wildplasser Nov 27 '11 at 23:36
  • Seems like some db normalization could help. Create a second table that relates recipes to menus (or not). Though it generalizes the problem and allows for more than one menu that a recipe could be part of. Regardless, the question was about unique indexes in PostgreSQL. Thanks. – Chris Jul 19 '18 at 16:14