4

I read a 45-tips-database-performance-tips-for-developers document from a famous commercial vendor for SQL tools today and there was one tip that confuse me:

If possible, avoid NULL values in your database. If not, use the appropriate IS NULL and IS NOT NULL code.

I like having NULL values because to me it is a difference if a value was never set or it 0 or string empty. So databases have this for a porpuse.

So is this tip nonsense or should I take action to prevent having NULL values at all in my database tables? Does it effect performance a lot have a NULL value instead of a filled number or string value?

YvesR
  • 5,922
  • 6
  • 43
  • 70
  • 2
    Use NULL whenever appropriate. Never ever consider an 'alternative' to NULLs. My 2c. – Remus Rusanu Feb 14 '14 at 11:11
  • 2
    One guess would be that the advice started off along the lines of "mark all columns as `NOT NULL` unless you've really thought hard about it and want to accept `NULL`s in that column" – Damien_The_Unbeliever Feb 14 '14 at 11:26
  • 1
    it also speaks to normalization.. the lack of a value could be pretty directly tied to the lack of a row. – Randy Feb 14 '14 at 15:35
  • Perhaps this should be merged with https://stackoverflow.com/questions/4336687/how-can-i-avoid-nulls-in-my-database-while-also-representing-missing-data ? – ocharles Jun 11 '15 at 23:34

5 Answers5

9

Besides the reasons mentioned in other answers, we can look at NULLs from a different angle.

Regarding duplicate rows, Codd said

If something is true, saying it twice doesn’t make it any more true.

Similarly, you can say

If something is not known, saying it is unknown doesn't make it known.

Databases are used to record facts. The facts (truths) serve as axioms from which we can deduce other facts.
From this perspective, unknown things should not be recorded - they are not useful facts.
Anyway, anything that is not recorded is unknown. So why bother recording them?
Let alone their existence makes the deduction complicated.

dzhu
  • 793
  • 1
  • 9
  • 17
5

The NULL question is not simple... Every professional has a personal opinion about it.

Relational theory Two-Valued Logic (2VL: TRUE and FALSE) rejects NULL, and Chris Date is one of the most enemies of NULLs. But Ted Codd, instead, accepted Three-Valued Logic too (TRUE, FALSE and UNKNOWN).

Just a few things to note for Oracle:

  1. Single column B*Tree Indexes don't contain NULL entries. So the Optimizer can't use an Index if you code "WHERE XXX IS NULL".

  2. Oracle considers a NULL the same as an empty string, so:

    WHERE SOME_FIELD = NULL
    

    is the same as:

    WHERE SOME_FIELD = ''
    

Moreover, with NULLs you must pay attention in your queries, because every compare with NULL returns NULL. And, sometimes, NULLs are insidious. Think for a moment to a WHERE condition like the following:

WHERE SOME_FIELD NOT IN (SELECT C FROM SOME_TABLE)

If the subquery returns one or more NULLs, you get the empty recordset!

These are the very first few cases that I want to talk about. But we can speak about NULLs for a lot of time...

Corrado Piola
  • 859
  • 1
  • 14
  • 18
  • 2
    Codd did not accept 3VL, as the detractors say he did. By the evidence, (a) he accepted that UNKNOWN may have to be recorded, and (b) he gave a method, which is not 3VL, it is 2VL. Codd's Relational Theory rejects NULL in the data, but NULL can appear in ResultSets, so there is no getting around the fact that code must handle NULL, regardless of whether it is in the data or not. – PerformanceDBA Jun 07 '15 at 01:13
  • Codd certainly did propose 3VLs, with UNKNOWN. Standard SQL allows certain use of UNKNOWN & allows NULL in its place. – philipxy Nov 20 '19 at 23:53
3

It's usually good practice to avoid or minimise the use of nulls. Nulls cause some queries to return results that are "incorrect" (i.e. the results won't correspond with the intended meaning of the database). Unfortunately SQL and SQL-style databases can make nulls difficult, though not necessarily impossible, to avoid. It's a very real problem and even experts often have trouble spotting flaws in query logic caused by nulls.

Since there is nothing like nulls in the real world, using them means making some compromises in the way your database represents reality. In fact there is no single consistent "meaning" of nulls and little general agreement on what they are for. In practice, nulls get used to represent all sorts of different situations. If you do use them it's a good idea to document exactly what a null means for any given attribute.

Here's an excellent lecture about the "null problem" by Chris Date:

http://www.youtube.com/watch?v=kU-MXf2TsPE

nvogel
  • 24,981
  • 1
  • 44
  • 82
2

There are various downsides to NULLs that can make using them more difficult than actual values. for example:

  1. In some cases they are not indexed.
  2. They make join syntax more difficult.
  3. They need special treatment for comparisons.

For string columns it might be appropriate to use "N/A", or "N/K" as a special value that helps distinguish between different classes of what could otherwise be NULL, but that's tricky to do for numerics or dates -- special values are generally tricky to use, and it may be better to add an extra column (eg. for date_of_birth you might have a column that specifies "reason_for_no_date_of_birth", which can help the application be more useful.

For many cases where data values are genuinely unknown or not relevant they can be entirely appropriate of course -- date_of_death is a good example, or date_of_account_termination.

Sometimes even these examples can be rendered irrelevant by normalising events out to a different table, so you have a table for "ACCOUNT_DATES" with DATE_TYPES of "Open", "Close", etc.

David Aldridge
  • 51,479
  • 8
  • 68
  • 96
1

I think using NULL values in the database is feasible until your application has a proper logic to handle it, but according to this post there may be some problems as discussed here http://databases.aspfaq.com/general/why-should-i-avoid-nulls-in-my-database.html

51k
  • 1,381
  • 3
  • 12
  • 22