34

Why is negative id or zero considered a bad practice when inserting a primary key in a database table?

I think it could be useful in some cases, but people say that it is not recommended, despite the fact that they never say/know why.

So, I was wondering if is there, by definition, some restriction or if it shouldn't have any problem or if is it just a convention and if really there is some restriction about that, why isn't that feature blocked?

falsarella
  • 12,217
  • 9
  • 69
  • 115
  • Do you have anything to document this "bad practice"? I'd imagine negative numbers are more unpleasant to read, and slightly harder to type. I really can't see any technical reason to avoid them. In fact it's a great way to extend the range of a signed data type. – Yuck Jan 19 '12 at 14:31
  • 1
    On the side, this is useful when creating records on the client side that later needs to be inserted on the server: http://msdn.microsoft.com/en-us/library/ms971502.aspx – Niklas Jan 19 '12 at 14:36
  • I guess it is popular that negative numbers are 'considered' a 'bad practice', something 'ugly' to 'avoid'. Isn't it? But I really can't see why... Then, @Yuck, you think it's because of readability? – falsarella Jan 19 '12 at 14:40
  • 2
    Some discussion at dba stackexchange that may give some perspective. http://dba.stackexchange.com/questions/2895/what-are-negative-keys-used-for I'm not really sure of a good reason why not to use them, however. – RHamblin Jan 19 '12 at 14:42
  • Found something about .NET (it's not what I use, but others might use that information): http://stackoverflow.com/q/2473215/1064325 – falsarella Jan 23 '12 at 10:16

3 Answers3

37

To be clear, this question and answer are about using negative numbers for surrogate keys, not for natural keys.

As far as I know, there are three reasons for considering it to be a bad practice.

  1. It violates the principle of least surprise.
  2. Some people assume all ID numbers are non-negative.
  3. Some people use negative numbers to indicate errors.

The first one has some validity to it. You never see SQL examples or answers on SO that use negative ID numbers. (I'm going to change that, starting today.)

The second and third are corollaries to the first, in that programmers often assume surprise-free behavior. (That reminds me of discovering that VBA would let me multiply two dates, returning a number that would be expressed, I guess, in square dates.)

For number 2, application programmers might introduce subtle errors by not allowing room for the sign in UI code, which might make -123456 look like 123456.

The third has to do with writing code that returns id numbers. Code that returns a single id number might return -1 as an error code. But -1 is a valid ID number in most cases. (Most databases don't restrict id numbers to the range of non-negative integers.)

Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185
  • You consider these reasons the same for zero, or is there something different? – falsarella Jan 26 '12 at 10:44
  • 2
    Same for zero, I think. Some places use 0 as a kind of a non-null replacement for NULL. – Mike Sherrill 'Cat Recall' Jan 26 '12 at 11:43
  • 1
    No restriction then, thanks! The problem will be the implementation code. But... What about using -1 for admin users or some constants like this? How do you handle? – falsarella Jan 27 '12 at 12:59
  • In this context, id numbers should identify a person, and that's all. If you also want to know whether a person is an admin, store their id number in an "admins" table. – Mike Sherrill 'Cat Recall' Jan 27 '12 at 15:50
  • Sometimes it is comfortable to use zero-id record in a table to have other tables reference it as a default value. It simplifies controlling of error situations when a non-existent record from other table is referenced. Especially when foreign keys are not strictly checked, like in MySQL MyISAM engine. With InnoDb engine - also prevents from query blocking. – Zon Sep 09 '14 at 04:59
  • so it is a good idea to use negative id to indicate errors in an a reference table, for example `id: -1, message: error` `id: 1, message: OK`? – wdetac Feb 05 '18 at 04:55
  • 1
    @wdetac: The OP's question and my answer are about surrogate keys. I'll edit my answer to make that clearer. By definition, surrogate keys are meaningless. But using negative numbers to mean errors in a table of status codes is a completely different use case. In *that* context, negative numbers are natural keys, not surrogate keys; they aren't at all surprising. – Mike Sherrill 'Cat Recall' Feb 05 '18 at 11:54
12

The answer by @Mike Sherrill 'Cat Recall' is IMHO incorrect.

Negatives: The reason for not using negatives for IDs is that negative numbers are not portable. The binary representation of a decimal value depends on the underlying numerical architecture, and this effects the way a negative decimal value will be presented in non-negative, streamed format (e.g., hex, base36, etc.). Similarly one doesn't use floating point values as identifiers, even though within the constraints of a single architecture it is theoretically possible.

Zero: Zero can serve as an ID. It is not recommended though because it often denotes an empty field / NULL value.

avnr
  • 614
  • 5
  • 12
  • Nice point. +1. Do you have any sources about it? Never thought of representation architecture, but it would still have a portability problem for the rest of the field types, anyway... Hence, the principle of least surprise still seems a more relevant reason. – falsarella Dec 29 '15 at 18:27
  • 2
    Formally, an identifier is a token (there is a good overview in https://en.wikipedia.org/wiki/Identifier), so a negative number could be considered a token only if you consider it as a sequence of characters, which kind of defies its nature as "negative". So in the context of your question, where the ID is a *number* and not a string, this number *should not* serve as a ID when negative. This isn't just a recommendation as in the case "principle of least surprise", it is, IMHO a mistake to use a negative ID. Perhaps the more extreme example of using floats as IDs is more intuitive. – avnr Dec 29 '15 at 18:41
  • 2
    I think that there are several ways to backup and port a database. Binary backup/portability is an option, but we're not restricted to it. The real mistake would be relying on physical architecture to backup or port something. As I've mentioned, in case it happened, the portability would still be a problem for any other unportable columns. So resolving it for the PK actually wouldn't resolve anything. Your statement doesn't apply just for PKs, it applies to any number column, so it would be a mistake to use it anywhere if a negative is allowed, which is really non-sense. – falsarella Dec 29 '15 at 19:09
  • There is no real issue in backup and porting because in these cases the encoding is well under control. The problem begins when you use the ID in an interface. It is a very common schema, for example, for IDs to have a fixed width in characters in some base, as a way to validate the value as an ID. But you just can't do that with negative numbers unless the schema specifies the encoding for the negative numbers: LSB or. MSB, ones complement or twos complement, etc., which isn't practical. This isn't the same as a data field because a data field is just number, it is not a token. – avnr Dec 29 '15 at 19:24
  • Maybe this will be simpler: Lets say we have an item with ID "-10". If you interpret the ID as three bytes consisting of the ascii representation of hyphen-one-zero, then no problem, this is a valid ID. If however you want this ID to be treated as a number (as this point is at the core of your question), then its tokenization is non-ambiguous only as a signed decimal. You cannot tokenize it unambiguously in any other common representation unless you specify a specific number architecture, without losing the meta of its numerical value. – avnr Dec 29 '15 at 19:36
  • Considering that I'm just talking about using a long number as a primary key in some database, binary tokenization in that context seems very overzealous. In any case, thank you by your contribution! – falsarella Dec 29 '15 at 20:50
  • I believe that binary tokenization is a standard in IDs, e.g. OID, UUID, etc. I don't quite see why one would keep in a database an ID that cannot be *used* as an interoperable ID. After all with *n* bits you can represent 2^n values, no matter if your long is signed or unsigned. So why even consider using a signed integer for an ID? – avnr Dec 30 '15 at 13:32
-2

There are more than 51 million sites discussing this problem.

I agree with @Mike Sherrill and it is likely common that NULLs/Empty fields or Negative Ids create severe problems in determining the true values. It serves no informational purpose at all, and can only lead to incorrect answers and distrust in the database itself.

Allowing Zero values,Negative values into your columns introduces a whole new degree of uncertainty into your database. guesses must be made by the SQL programmer to counter for erroneous results of NULL values in a database.

vrn53593
  • 308
  • 1
  • 5