9

Database schema

I have this fields:

  • title (string)
  • subtitle
  • description (string)

Is better set the default value as an empty string '' or a NULL?

For better reading/writing and size-storage performance

sparkle
  • 7,530
  • 22
  • 69
  • 131
  • Depends on what your client code expects, I'd think? – hd1 Nov 24 '12 at 14:19
  • No matter. I will adapt client code to this – sparkle Nov 24 '12 at 14:26
  • I think you should ask yourself what is valid date in your database and how you wan to query it. For instance if for description an empty description is valid I would use an empty string. In general as said before NULL is used to indicate unknown data but if an empty string is a valid known data you should use an empty string. I use NULLs only if I must. – Sebastian Cabot Nov 24 '12 at 18:11

8 Answers8

21

The usual contract is:

  • NULL means "no information available".
  • '' means "there is information available. It's just empty."

Beyond this point there is much philosophical discussion since the invention of NULL in any language, not just SQL.

The only technical point here is: In PostgreSQL NULL can be stored more efficiently than a string of length zero. If that really matters in your case ... we cannot know.

A.H.
  • 63,967
  • 15
  • 92
  • 126
10

Expert Opinion: Avoid NULLs

Dr. Chris (C.J.) Date, who partnered with the creator/discoverer of the Relational Data Model (Dr. Codd) clearly says, "No, you should not use NULLs.". Read his book A Guide To The SQL Standard for much discussion.

He, along with other experts, argue that for a variety of both theoretical and practical issues, NULLs bring too much risk, confusion, and problems to make them worth while.

Solution:

  • Add a constraint on every column of NOT NULL.
  • Where it makes sense, add a default value to each column, DEFAULT. For a text type column, the default value may be an empty string (''). Or the default may be a certain wording you arbitrarily choose and use consistently such as 'EMPTY' or 'NOT-AVAILABLE' depending on your needs. On some columns you may not want a default, meaning you want to reject the record's insertion or update if the user/app did not provide a value.

Exception To The Rule

The above is a good rule, and I follow it habitually. However, every rule has exceptions. In very rare cases, I've made exceptions with allowing NULLs.

Example Exception: In Postgres when using the XML data type in table of millions of rows. I need to repeatedly search for rows with no value yet recorded (missing values). I cannot store an empty string as Postgres enforces a rule that only valid XML be stored, and an empty string is not a valid XML document. So I permit NULLs in that XML column.

You might think I could store a minimal XML document containing no data. But I do not know how to index on that efficiently, to differentiate between rows with recorded data and those without. I can create an index on NULL.

Basil Bourque
  • 303,325
  • 100
  • 852
  • 1,154
4

Most is already said about this, but I think there is one more thing to consider if you finally think it's 50/50 between using NULL or "" as "no value" (simply spoken).

In MySQL value NULL is not "caught" if you have a negative condition on a column. Eg.

where column != 'text'

will only return rows where "column" doesn't have the value "text" but won't find rows where "column" is NULL, if you want to find also these rows you must use:

where column != 'text' OR column IS NULL

I myself still prefer using NULL and changing empty strings to nil before save, I think it's better to know that "empty value" always is NULL in the database.

On the other hand, there can be some situations where you want to use the difference between "no value" (NULL) and "empty value" (""). But I have never had that situation in some application - yet.

244an
  • 1,579
  • 11
  • 15
2

You should always use NULL to denote that the column doesn't have a value as even an empty string is a value.

Antony
  • 87
  • 3
  • There is extensive debate on this. There is quite a large school that think that NULLs in a database are an indication to poor database design. I myself try to avoid NULLs when ever possible although I'm not fanatic about it – Sebastian Cabot Nov 24 '12 at 14:37
  • @SebastianCabot: how can a NULL be an indication of a bad design? There are valid situation where the data is simply unknown and in that case the only valid choice is to store `NULL` –  Nov 24 '12 at 14:38
  • @a_horse_with_no_name As I said there is a debate on this. I recommend reading "SQL and Relational Theory" by C.J Date. In a nutshell. Using NULLs to indicate no data means you designed a database that can contain no data. From the perspective of relational theory each possible value will be connected to the row using a foreign key - In this way you don't really have tables but values belonging to an entity. The row will be constructed using a join query so if there is no data no value will be stored in the database. NULLs exist because it's hard to implement this scheme efficiently. – Sebastian Cabot Nov 24 '12 at 15:03
  • @SebastianCabot: There may be academic debate but I haven't seen any practical debate that doesn't boil down to "I don't like NULL because I don't understand it" versus "learn how NULL works and use it get the job done". SQL and relational theory are related but different things, more so when day-to-day practical considerations are involved. In a practical sense, the choice tends to boil down to using NULL or sentinel values and sentinels are pure evil. AFAIK, Date washed his hands of the NULL strangeness. – mu is too short Nov 24 '12 at 16:36
  • @muistooshort That is why I wrote I'm not fanatic about it. However I do try to see if it's absolutely necessary to use NULLs before actually designing a table that accepts them. As always be it in coding or databases we sacrifice good design for performance. But you better be damn sure you know what you are doing and why. And in the case of NULL don't just use it because everyone else does or because you can. – Sebastian Cabot Nov 24 '12 at 17:42
  • @Sebastian, for more detail, look up the difference between 5NF and 6NF. I'm not of that same school as you. There are plenty of situations where I use NULL to indicate that the field does not pertain to the given by the row, or that no data was available at insert time. – Walter Mitty Nov 24 '12 at 18:41
  • 1
    @mu, One of Codd's rules for relational database systems is that there should be a "sytematic treatment of missing values". I'm with you. a "systematic treatment" doesn't mean avoid missing data at all costs. The real world can sometimes be so inconvenient. I will admit, however that the SQL Boolean value UNKNOWN sometimes trips me up, because I grew up with two valued logic. And NULL=NULL yeilds UNKNOWN. – Walter Mitty Nov 24 '12 at 18:45
  • @SebastianCabot: I'll agree that NULLable columns in a table smells a bit off and something that always requires at least a quick "are you sure?" question. I don't like cargo cult programming either (and I spend a lot of time in Rails which is rife with it). Anyway, Date washed his hands of NULL but, alas, I don't have my Date book anymore so I can't lookup the footnote I have in mind. – mu is too short Nov 25 '12 at 01:47
  • 2
    @SebastianCabot "Using NULLs to indicate no data means you designed a database that can contain no data." is a reductio ad absurdum. It is easy to create a database which doesn't allow NULLs that also contains no data. Anybody who has done a data cleansing exercise will be familiar with columns containing `' '` or `'.'`: the user had no value for that field but the app insisted that *something* be entered. Frankly I'd rather handle a NULL than a magic vlue or garbage. – APC Nov 25 '12 at 11:27
2

You need to decide whether values of 'null' and empty string mean in your application something different, or maybe both of them just mean 'no data'. If the latter is the case, then it is generally just a matter of preference, but you'd have to be consequent - try not to mix 'null' values and empty ones in given field.

Generally 'null' gives better notion of "no data", but compared to empty string it is somewhat more cumbersome to use in an application. But then using empty strings instead of nulls may be seen as premature optimalization, and will make it impossible at some time in the future to introduce functionality which needs to distinguish between nulls and empty strings.

On the other hand there are DBMSs which do not store nulls in string columns, just empty strings. I'd go with nulls, but with well established and documented contract (i.e. "This fields never contains null, empty title means no title", enforced with NOT NULL constraint on column) which is followed consistently you could take whichever approach you like.

If you are concerned with performance, you'd need to read documentation of DBMS you are using and make some tests on your own. If you expect that empty values are very frequent, you could check, whether 'sparse columns' would be of any help - some DBMSs introduce these as means of efficient storage of frequently occurring null values, but they usually have some drawbacks, like general (usually not big) performance penalty on retrieving non-null value, or something like this.

And, of course, you have to take into account what clients expect. But when you create database which is not yet accessed by clients, it's up to you to decide and document it appropriately.

APC
  • 144,005
  • 19
  • 170
  • 281
Maciek
  • 1,696
  • 12
  • 19
1

This is not strictly applicable to your case, but I'll mention it for completeness: NULL foreign keys are not enforced.

If you had a field foreign_id that is a foreign key referencing some other table, it would be enforced only when foreign_id contains a non-NULL value.

BTW, Oracle stores empty string as NULL. VARCHAR2 is guaranteed to continue to behave that way, while VARCHAR might (one day) be changed to conform to the SQL standard and make the distinction between empty string and NULL. Other DBMSes (that I know of) do make that distinction.

Community
  • 1
  • 1
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • thanks for pointing that out. That means on postgresql, you cannot use empty string ('') in a column that is used in a Foreign Key. see for example : `create table tab2(col1 varchar(10));` `alter table tab2 add constraint tab2_pk primary key(col1);` `create table tab1(colA varchar(10), col1 varchar(10));` `alter table tab1 add constraint fk_tab1_tab2 foreign key (col1) references tab2(col1);` `insert into tab1 values ('A','');` `ERROR: insert or update on table "tab1" violates foreign key constraint "fk_tab1_tab2"` `insert into tab1 values ('A',null);` `INSERT 0 1 (OK)` – R. Du Jan 10 '20 at 15:23
0

Use Null values only when the data is not known or not applicable..In all other cases use ""(empty value) as a special consideration is needed while writing queries for data involving NULL values which is often difficult..

stallion
  • 1,901
  • 9
  • 33
  • 52
0

It depends. Do you know the value is empty? Example: Person is known to have no middle initial.

Or do you just plain not know? Example: you received a form where the field "Middle Initial" has been left blank.

Walter Mitty
  • 18,205
  • 2
  • 28
  • 58