Allow null in unique column

Question

I've created the following table:

CREATE TABLE MMCompany (
   CompanyUniqueID BIGSERIAL PRIMARY KEY NOT NULL, 
   Name VARCHAR (150) NOT NULL,
   PhoneNumber VARCHAR(20) NOT NULL UNIQUE, 
   Email VARCHAR(75) UNIQUE,
   CompanyLogo BYTEA
 );

The email column is unique and it causes a "bug" in my scenario since there could only be one record with null. I'm trying to achieve records of companies without the same email but at the same time allow a companies to have no email.

How can I achieve that?

Erwin Brandstetter · Answer 1 · 2022-06-03T22:42:09.110

164

This is a misunderstanding.
The UNIQUE constraint does exactly what you want. NULL values can coexist in multiple rows in a column defined UNIQUE.

The manual:

In general, a unique constraint is violated when there is more than one row in the table where the values of all of the columns included in the constraint are equal. However, two null values are not considered equal in this comparison. That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior can be changed by adding the clause NULLS NOT DISTINCT [...]

Bold emphasis mine.

Be aware that character types allow an empty string (''), which is not a NULL value and would trigger a unique violation just like any other non-null value when entered in more than one row.

Postgres 15 allows the clause NULLS NOT DISTINCT to alter this behavior (the opposite of what you requested). I updated the above quote. See:

Create unique constraint with null columns

edited Jun 03 '22 at 22:42

answered Nov 22 '13 at 21:08

Erwin Brandstetter

605,456
145
1,078
1,228

An excellent answer indeed. Although it's good to know, is it 100% safe? Also, it smells like bad design, having null and unique in the same column. Please correct me if I'm wrong. – foibs Nov 22 '13 at 21:22
5

@foibs: That is 100% correct and failsafe and conforming to the SQL standard and how it should be. `NULL` is defined that way for a reason. – Erwin Brandstetter Nov 22 '13 at 21:25
Nullable unique constraints are a really really bad idea. One of the ugly "features" of SQL. They are of course unnecessary and usually very easily avoided. – nvogel Nov 22 '13 at 21:45
3

@sqlvogel: To avoid the feature, define the column `NOT NULL`. Best of both worlds, nothing ugly about this. Useful as this very request goes to demonstrate. So, obviously, I don't quite agree with your verdict. – Erwin Brandstetter Nov 22 '13 at 21:49
Isn't that what I suggested? My criticism was directed at the idea of *allowing* nulls in a UNIQUE constraint. It is *not* useful to do so. Allowing nulls accomplishes nothing that cannot be accomplished by creating a new table with a non-nullable constraint. – nvogel Nov 22 '13 at 22:29
5

@sqlvogel: I am stressing the *usefulness* of the feature (while it can easily be switched off when not desirable). You write `It is not useful to do so` under a question where the OP requested exactly that, *proving* its usefulness? Is the irony lost on you? – Erwin Brandstetter Nov 22 '13 at 22:40
The OP's "problem" is caused by a faulty design (non-normal form). Easily solved without the nonsense of a nullable unique constraint. I already posted my answer to that effect. – nvogel Nov 22 '13 at 22:50
1

Faulty according to your set of rules. Not faulty according to the SQL standard or to my set of rules. Which normal form would you even see violated? Your suggested alternative offers more cost and complications for no gain. – Erwin Brandstetter Nov 22 '13 at 22:58
No normal form permits key dependencies to be satisfied by nullable columns. If you prefer without cause to sacrifice keys in the name of reducing the number of tables then where would you stop? Ad absurdum you might ultimately create one table with every column nullable. I'm suggesting that full normalization is a simpler alternative as a default design choice than an illogical and redundant feature of SQL that is so inconsistently implemented across different DBMSs and data management tools as to be effectively proprietary. The OP can make up his own mind which he prefers. – nvogel Nov 23 '13 at 02:32
2

I am all for normalization as long as it is half way sensible. But I see *no* "key dependencies" nor any violation of any normal form. You still didn't name one. Either way, this is purely academic. It would be unwise to create a separate table for the email attribute as long as there are no additional requirements like attributes per email address or multiple emails per company. The nullable column fits the requirements perfectly. I suggest we agree to disagree. – Erwin Brandstetter Nov 23 '13 at 03:32
3

@sqlvogel and ErwinBrandstetter Looks like you've fallen into the age-old debate about the worthiness or dangerousness of NULL. For an exhaustive critique of NULL read [Dr. Chris Date](http://en.wikipedia.org/wiki/Christopher_J._Date)'s book, [Guide to the SQL Standard](http://www.amazon.com/Guide-SQL-Standard-4th-Edition/dp/0201964260). You can also read academic articles explaining the multiple ambiguous meanings that NULL can take. While I'm in the camp that NULL is more trouble than help, other folks find practical use. I suggesting letting it go at that. – Basil Bourque Nov 23 '13 at 04:37
@Erwin, What Basil said. Email->CompanyUniqueID is the dependency that the OP says he wants to enforce therefore the suggestion to make Email a non-key (nullable) attribute of the MMCompany table would be in violation of 3NF (non-key dependency). Putting it in a new table restores that dependency. – nvogel Nov 23 '13 at 06:39

Basil Bourque · Answer 2 · 2018-07-30T17:17:46.020

No Such Problem In Postgres

In Erwin Brandstetter's correct answer, he explains that you should indeed be seeing the behavior you want (multiple NULLs allowed in a Unique constraint). You should see this behavior in Postgres in particular as well as any SQL standard compliant database in general.

Workaround for Other Databases

However, the Postgres doc cautions about portability because some databases are known to be in violation of this feature. For such a non-compliant system I suggest replacing the use of a NULL value in such fields with a bogus value. The bogus value would be a string such as "unknown_" plus some arbitrary value that is virtually certain to be unique. That arbitrary value could be something like the current date-time plus a random number.

UUID

But, rather than roll your own arbitrary value, generate a UUID. The original Version 1 UUID is indeed a combination of the current date-time, a random number, and the computer's virtually unique MAC address.

A UUID presented as a hex string with canonical formatting using hyphens looks like this:

93e6f268-5c2d-4c63-9d9c-40e6ac034f88

So my suggestion is to combine an arbitrary string such as "unknown_" plus a UUID, to look like this:

unknown_93e6f268-5c2d-4c63-9d9c-40e6ac034f88

So my suggestion for non-compliant databases is to generate such a value and use it in place of NULL, use it where you do not yet have a known value in that column for a particular row. Instead of writing queries that look for rows that have (or do not have) a NULL value in that column, write queries that look for rows that have (or do not have) a value beginning with the arbitrary string, "unknown_" in this example. Each row would then satisfy the constraint of having a unique value.

Indeed, I would assign this "unknown_" + UUID value as the default for that column.

You could also add a NOT NULL constraint to this column.

Generating UUID Values

Postgres has built-in support for the data type of UUID, but that's irrelevant in this answer here. What you need is to generate a UUID value.

For generating UUIDs you need an extension (plugin) that adds this capability to Postgres. Most Postgres installers include such an extension. This extension is called uuid-ossp. Usually the extension is not activated by default. To do so in recent versions of Postgres, use the CREATE EXTENSION command. For instructions, see my blog post on installing in Postgres 9.1 and later or my other post on Postgres 9.0 and earlier. Both the new and old way of installation is easy provided the extension/plugin was compiled and bundled with your Postgres installation.

Summary

Let me be clear that for Postgres alone, there is no need for this workaround because Postgres complies with the SQL standard. But if:

You are concerned about portability of your code to some other non-compliant database system, or
You need to exchange data with a non-compliant database system, or
You agree with Dr. Chris Date that NULL is the work of the devil and should be avoided

…then a workaround such as this is necessary.

Thank you, I've seen it already, I'd like to reassure you I did bothered to search before posting, I usually do, unfortunately the answer offered is not the right one for me, anyhow than you for your help, try not to be so rude next time! — liv a, Nov 22 '13 at 21:05
@liva (a) I was wrong about your lack of bothering to research. I was wrong in my original answer's link to a Question-Answer that failed to address your question. I stand corrected. For my rudeness and thoughtlessness, I apologize. (b) I replaced my original answer with a fresh one that may help you or anyone else who may need to work with databases other than Postgres, databases that fail to follow the SQL spec’s requirement that a Unique constraint should allow multiple NULL values. — Basil Bourque, Nov 23 '13 at 04:00
@KamilKiełczewski No, that won't do. There is no `row.id` in Postgres. Same idea found in the [`ctid` system column](https://www.postgresql.org/docs/current/static/ddl-system-columns.html) on every table, which uniquely identifies the row. But as a physical identifier, it can change such as during a *VACUUM FULL*, and possibly be re-assigned. As the doc says: “ctid is useless as a long-term row identifier”. When you need a universally unique identifier, use… a [*universally unique identifier (UUID)*](https://en.wikipedia.org/wiki/Universally_unique_identifier). — Basil Bourque, Apr 15 '18 at 04:16

score 12 · Answer 3 · edited May 23 '17 at 12:26

12

Some databases do not allow multiple null values, for example the SQL Server documentation states that "multiple null values are considered duplicates". On databases that do not allow nullable UNIQUE constraints you could try this (from GuidoG's answer to another question):

CREATE UNIQUE NONCLUSTERED INDEX IDX_Email
ON MMCompany (Email)
WHERE Email IS NOT NULL;

edited May 23 '17 at 12:26

Community

1
1

answered Mar 01 '16 at 16:26

dumbledad

16,305
23
120
273

score 5 · Answer 4 · answered Nov 22 '13 at 21:42

Drop the email column from the table. Put it in a new table where it can be NOT NULL and UNIQUE:

CREATE TABLE CompanyEmail
 (
    CompanyUniqueID INT NOT NULL PRIMARY KEY
       REFERENCES MMCompany (CompanyUniqueID),
    Email VARCHAR(75) NOT NULL UNIQUE
 );

Avoid nullable UNIQUE constraints.

Denis de Bernardy · Answer 5 · 2013-11-22T21:20:02.103

3

Unique and null don't get along much, since null is undefined by definition — you can't know if two nulls are the same unknown.

In this sense, your current unique constraint on email is the right thing to do and should work as is.

In case you ever need to make it otherwise, though, a partial index works:

create unique index on MMCompany((email is null)) where (email is null);

Another approach is to define a constraint trigger. Something like:

create function email_chk() returns trigger as $$
begin
  if exists (
    select 1 from mmcompany where email is null and companyuniqueid <> new.id
  ) then
    raise 'dup null found';
  end if;
  return null;
end;
$$ language plpgsql;

create constraint trigger after insert or update on mmcompany
for each row when (new.email is null)
execute procedure email_chk();

edited Nov 22 '13 at 21:20

answered Nov 22 '13 at 21:04

Denis de Bernardy

75,850
13
131
154

1

Actually, two nulls are always considered not equal. That is why you never do `WHERE email = NULL` but do `WHERE email IS NULL` instead. Sorry for replying to a 5year old comment. But that was true 5 years ago as much as it is now... – Vasil Svetoslavov Sep 30 '19 at 16:34
Well, "is null" means "null is a singleton" and any two null values should be equal. – Alex K Mar 17 '21 at 11:22

score -2 · Answer 6 · answered Oct 24 '17 at 18:28

And just in case you're generating your DB Tables using EF Code First, edit your Migration Class' Up method like the following to enforce your UNIQUE KEY constraint to ignore NULL.

migrationBuilder.Sql(@"CREATE UNIQUE NONCLUSTERED INDEX[IX_Employees_TaskId] ON[dbo].[Employees]([TaskId] ASC)
                                WHERE [TaskId] IS NOT NULL"
                                );

And then you could test your Unique Constraint by logging into your DB through SQL Server Management Studio or something similar. Like in this case Employee Table happily accepts 2 NULL values in TaskId although its an UNIQUE column.