240

I have a form on a website which has a lot of different fields. Some of the fields are optional while some are mandatory. In my DB I have a table which holds all these values, is it better practice to insert a NULL value or an empty string into the DB columns where the user didn't put any data?

dreftymac
  • 31,404
  • 26
  • 119
  • 182
roflwaffle
  • 29,590
  • 21
  • 71
  • 94

6 Answers6

227

By using NULL you can distinguish between "put no data" and "put empty data".

Some more differences:

  • A LENGTH of NULL is NULL, a LENGTH of an empty string is 0.

  • NULLs are sorted before the empty strings.

  • COUNT(message) will count empty strings but not NULLs

  • You can search for an empty string using a bound variable but not for a NULL. This query:

    SELECT  *
    FROM    mytable 
    WHERE   mytext = ?
    

    will never match a NULL in mytext, whatever value you pass from the client. To match NULLs, you'll have to use other query:

    SELECT  *
    FROM    mytable 
    WHERE   mytext IS NULL
    
Quassnoi
  • 413,100
  • 91
  • 616
  • 614
  • 6
    but which one you think is faster? 0 or NULL or "" – Atul Dravid Oct 16 '10 at 23:48
  • 10
    in InnoDB NULL's take up less space – Timo Huovinen Jul 25 '13 at 10:41
  • 43
    I think this is an okay answer, but it also totally ignores the "best practice" element of the question and only focuses on tangentially relevant facts (NULL sort order and length? These don't matter). On most text data input types there **is not** a difference between "no response" and "empty response", so I think this is a great question that deserves a better answer. – Nick Aug 11 '14 at 22:41
  • 6
    NULLs also work great when UNIQUE field is set. For example if you have a field like Driving License to add person’s DL No. and the guy doesn’t have it. Since it’s a unique field the first person with no DL No. will get added but not the next as it will throw an error of unique constraint. So, NULL is better. – Saifur Rahman Mohsin Feb 22 '16 at 10:29
  • @SaifurRahmanMohsin: it's a textbook example of bad practice to set driving license number as unique. – Quassnoi Feb 22 '16 at 17:10
  • I just came up with it as an example. You can replace it with anything that requires unique. Like say IMEI code of mobile equipment or google place ID for places. – Saifur Rahman Mohsin Feb 22 '16 at 17:12
  • @Quassnoi why that? – cedbeu Jun 03 '16 at 11:05
  • @cedbeu: why what? – Quassnoi Jun 03 '16 at 11:19
  • 1
    @Quassnoi ah sorry... I meant, why is it a bad practice to set driving number licence as unique...? – cedbeu Jun 03 '16 at 11:20
  • 1
    @cedbeu: because it's not unique. And even if it is unique today, tomorrow your legislation body can make it not unique. – Quassnoi Jun 03 '16 at 11:48
  • There is a performance surplus to using NULL. If you have to add an extra WHERE clause to skip '' empty elements, of course it is slower. – twicejr Dec 06 '17 at 10:09
46

One thing to consider, if you ever plan on switching databases, is that Oracle does not support empty strings. They are converted to NULL automatically and you can't query for them using clauses like WHERE somefield = '' .

Matt Solnit
  • 32,152
  • 8
  • 53
  • 57
  • 12
    This sounded incredibly fishy to me, even on your link, so I tried it. Null field, set to '', oracle ignores it. Reports length as null rather than 0. That's just so wrong. There's got to be some way around this. Think I'll post this as another question. – Steve B. Aug 12 '09 at 19:10
  • 1
    `Steve B.`: see this question: http://stackoverflow.com/questions/1171196/difference-between-varchar-and-varchar2 – Quassnoi Aug 12 '09 at 19:16
  • Thanks for the reference, though I still don't understand the reasoning. Posted as http://stackoverflow.com/questions/1268177/oracle-not-distinguishing-between-nulls-and-empty-strings – Steve B. Aug 12 '09 at 19:24
  • Might be worth updating the answer to include info from the link posted by Quassnoi – SamuelKDavis Mar 05 '14 at 20:48
  • 1
    @Steve B: that was a design decision made by Oracle a _l_o_n_g_ time ago. They can't change that behavior without breaking a whole boatload of legacy Oracle applications that are designed around this behavior. So, Oracle is forever stuck with this design decision (that zero length strings are equivalent to NULL.) – spencer7593 May 06 '14 at 18:50
  • 8
    Peoplesoft (with Oracle DB) uses a single space to indicate an empty value. Incredibly stupid. They also use 0.00025 to indicate 0 for FTE since 0 is not allowed. Lovely choices were made in that product. – JP Duffy May 13 '16 at 13:31
10

One thing to keep in mind is that NULL might make your codepaths much more difficult. In Python for example most database adapters / ORMs map NULL to None.

So things like:

print "Hello, %(title)s %(firstname) %(lastname)!" % databaserow

might result in "Hello, None Joe Doe!" To avoid it you need something like this code:

if databaserow.title:
    print "Hello, %(title)s %(firstname) %(lastname)!" % databaserow
else:
    print "Hello, %(firstname) %(lastname)!" % databaserow

Which can make things much more complex.

Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
max
  • 29,122
  • 12
  • 52
  • 79
  • 25
    In my opinion abusing your database to "fix" errors in your code or the framework is a (very) bad coding practice. When there is no data you should just insert NULL and be consistent in using that. Otherwise you must use statements like: if(myString == null || myString = ""). When an object is not set or defined in your code you are also using NULL instead of some kind of "placeholder" (which an empty string is in my opinion). – Gertjan Aug 12 '09 at 19:39
  • 5
    Depends very much on your Language of choice. In Python "if not myString:" tests for None and "". Probably mainly a cultural issues. The Java Guys "bad practice" is the dynamic person's elegance. – max Oct 21 '10 at 08:54
10

Better to Insert NULL for consistency in your database in MySQL. Foreign keys can be stored as NULL but NOT as empty strings.

You will have issues with an empty string in the constraints. You may have to insert a fake record with a unique empty string to satisfy a Foreign Key constraint. Bad practice I guess.

See also: Can a foreign key be NULL and/or duplicate?

Community
  • 1
  • 1
Mike Casan Ballester
  • 1,690
  • 19
  • 33
  • The constraints issue has tripped me up in the past so that is why I "+1" this answer. – HPWD May 10 '18 at 01:08
  • But if you use NULL make very sure you never end up with any empty strings as well. Easy to do with many UI technologies. – Tuntable Nov 22 '19 at 09:51
6

I don't know what best practice would be here, but I would generally err in favor of the null unless you want null to mean something different from empty-string, and the user's input matches your empty-string definition.

Note that I'm saying YOU need to define how you want them to be different. Sometimes it makes sense to have them different, sometimes it doesn't. If not, just pick one and stick with it. Like I said, I tend to favor the NULL most of the time.

Oh, and bear in mind that if the column is null, the record is less likely to appear in practically any query that selects (has a where clause, in SQL terms) based off of that column, unless the selection is for a null column of course.

Platinum Azure
  • 45,269
  • 12
  • 110
  • 134
  • 1
    ...And now that I see the answer above me, I think it's safe to say that the usual differentiation you would care about is no data versus empty data. :-) – Platinum Azure Aug 12 '09 at 18:54
2

If you are using multiple columns in a unique index and at least one of these columns are mandatory (i.e. a required form field), if you set the other columns in the index to NULL you may end up with duplicated rows. That's because NULL values are ignored in unique columns. In this case, use empty strings in the other columns of the unique index to avoid duplicated rows.

COLUMNS IN A UNIQUE INDEX:
(event_type_id, event_title, date, location, url)

EXAMPLE 1:
(1, 'BBQ', '2018-07-27', null, null)
(1, 'BBQ', '2018-07-27', null, null) // allowed and duplicated.

EXAMPLE 2:
(1, 'BBQ', '2018-07-27', '', '')
(1, 'BBQ', '2018-07-27', '', '') // NOT allowed as it's duplicated.

Here are some codes:

CREATE TABLE `test` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `event_id` int(11) DEFAULT NULL,
  `event_title` varchar(50) DEFAULT NULL,
  `date` date DEFAULT NULL,
  `location` varchar(50) DEFAULT NULL,
  `url` varchar(200) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `event_id` (`event_id`,`event_title`,`date`,`location`,`url`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

Now insert this to see it will allow the duplicated rows:

INSERT INTO `test` (`id`, `event_id`, `event_title`, `date`, `location`, 
`url`) VALUES (NULL, '1', 'BBQ', '2018-07-27', NULL, NULL);

INSERT INTO `test` (`id`, `event_id`, `event_title`, `date`, `location`, 
`url`) VALUES (NULL, '1', 'BBQ', '2018-07-27', NULL, NULL);

Now insert this and check that it's not allowed:

INSERT INTO `test` (`id`, `event_id`, `event_title`, `date`, `location`, 
`url`) VALUES (NULL, '1', 'BBQ', '2018-07-28', '', '');

INSERT INTO `test` (`id`, `event_id`, `event_title`, `date`, `location`, 
`url`) VALUES (NULL, '1', 'BBQ', '2018-07-28', '', '');

So, there is no right or wrong here. It's up to you decide what works best with your business rules.