4

I have a nullable winnerID attribute which is constantly updated according to the bidding of an auction.

This attribute is set to save sometime in the end to calculate who won the auction. But this value is null initially.

Will this violate Third Normal Form? I knew Rule 4 of 1NF states that I cannot have nullable attribute, but the note says this is a controversial statement.

May you please clarify this for me? Thank you!

donkey
  • 4,285
  • 7
  • 42
  • 68
  • Does this answer your question? [What to do with null values when modeling and normalizing?](https://stackoverflow.com/questions/40733477/what-to-do-with-null-values-when-modeling-and-normalizing) – philipxy Jan 12 '22 at 05:55

1 Answers1

4

The way nulls are defined in SQL - as non-values which can't be compared with domain values - means they violate 1NF (and all higher normals forms). A relation (the mathematical structure represented by normalized tables) must have a single value for every column for every row. Nulls mean we have no value, and a nullable column means we have two relations in one table - a supertype relation which includes all the columns except the nullable one, and a subtype relation which has the same primary key and the previously nullable column, for which we can record only rows for which the attribute is known. The purpose of normalization is to factor a data set into elementary facts without losing information, so having two relations in a single table contradicts the objective and makes things like relational algebra/calculus more complicated.

The normal forms are formal logically defined structures, not industrial best practices which can be adapted to the situation, so I don't see much space for controversy. Whether we should use them and how they should be handled is a more interesting topic.

While nulls violate the normal forms, it doesn't mean you can't use nulls in your SQL database. They have risks as well as benefits. I use them too, but with consideration.

reaanb
  • 9,806
  • 2
  • 23
  • 37
  • 1
    When people say null is not a value, that's nonsense; it's just a value that is treated specially by the operators. Ie when they say SQL JOIN is join, that's also nonsense, because it treats null specially. When they say that null "means" something like "there is no value for there" or "no value is known for there", that's also nonsense, because what it means for null to appear in a row depends on a table's *predicate* (membership function for deciding if a row of values belongs in a table). There *are* uses of null by a predicate that can reasonably described so *in that table*. ... – philipxy Feb 01 '17 at 08:43
  • ... But to say it means that period, is nonsense, because for null-free tables and 3VL-free operators the output's predicate is a certain logical combination of the input predicate(s), which we rely on to to understand what rows a query asks for, but under nulls & 3VL that's not so. In *particular queries* for *particular inputs* it can be so. (I haven't explained *how we ought to normalize* under nulls & 3VL. I'm just explaining that certain writing about null is nonsense, hence unhelpful, and reasoning/explanation/justification based on it is nonsense, and unhelpful.) – philipxy Feb 01 '17 at 08:50
  • 1
    I find it simpler to think of nulls as metadata. To be a value means to be a member of a set over which certain operators are defined. The handling of nulls in operators like equality dramatically complicates logical models. Treating nulls as physical model artifacts rather than logical ones makes them easier to understand and work with. – reaanb Feb 01 '17 at 10:27
  • I would agree that one *could* take a "metadata" and/or "physical" approach, though definitely not that it would be simpler. I could accept "it's not a relation so all bets are off". But besides *not* giving a clear answer in "metatdata" or "physical" terms this answer would be much better if it didn't repeat a lot of culturally-accepted received-wisdom non-standard vagueness. PS The way nulls are defined in the SQL standard is in terms of a "special value", occurring in row & table values, all operated on by operators, when not in terms of non-standard vagueness in the style of this answer. – philipxy Feb 01 '17 at 11:19
  • Please provide an alternate answer so that we may all be educated. I'll even upvote it if it's comprehensible to average programmers. – reaanb Feb 01 '17 at 11:41
  • You recently called a question (http://stackoverflow.com/q/41966046/3404097, "Linked" above) a duplicate of the above with answer by you. (That led me here.) I called it a duplicate of http://stackoverflow.com/questions/40733477/what-to-do-with-null-values, with answer by me. I'm not altogether happy with it. Comments welcome! – philipxy Feb 01 '17 at 11:55
  • That's one of your more accessible answers, and I feel that we agree in spirit, if not in the exact way we express it. We both agree that nulls complicate queries, suggest splitting the table in order to distinctly represent two different relations, and advise that avoiding nulls isn't the top priority. – reaanb Feb 01 '17 at 12:32