First Normal Form Ambiguity?

Question

In the attached image, 1NF for the data is given as

Pet [ pet_id, pet_name, pet_type, pet_age, owner ]
Pet_Visit [ pet_id, visitdate, procedure_no, procedure_name ]

Where the table name precedes a list of fields with primary keys in bold.

However, I wonder if

Pet [ pet_id, pet_name, pet_type, pet_age, owner, visitdate, procedure_no, procedure_name ]

would also qualify, since it seems to meet the requirements of "Each record should be uniquely identified by a key" and "each cell can contain only one item"? Or perhaps adding a visit_id field but still keeping one table, as in

Pet [ pet_id, pet_name, pet_type, pet_age, owner, visit_id, visitdate, procedure_no, procedure_name ] ?

More generally, does 1NF usually require splitting a table with "repeating groups" into more than one table, or can we technically keep it all in one table if a composite key can be found/created?

I've seen many examples with apparently varying definitions. I'm not looking for advice on how I should approach learning this topic, just answers to the questions I have asked. Thanks in advance.

You can do either — most people don't worry too much about 1NF (or 2NF). The fun comes with 3NF, BCNF, 4NF and 5NF (and 6NF, and DKNF, and …). — Jonathan Leffler, Oct 01 '19 at 20:21
These questions--what are 1NF & normalization to 1NF--are duplicates many, many times over. Moreover there is no consensus, no 1 answer. Moreover most popular is a specturm of confused, fuzzy, parroted messes. This is explained in my answer & its links at the linked duplicate. A few sensible informed answers will mention Chris Date's eventual determination that 1NF meanings (other than "is a relation") are not relevant to higher NFs & just seek application-specific chopping up of too-large heterogeneously & homogeneously repetetive values or column sets into just-right sized ones. — philipxy, Oct 01 '19 at 22:17
Please [use text, not images/links, for text--including tables & ERDs.](https://meta.stackoverflow.com/q/285551/3404097) Paraphrase or quote from other text. Give just what you need & relate it to your problem. Use images only for what cannot be expressed as text or to augment text. Images cannot be searched for or cut & pasted. Include a legend/key & explanation with an image. Also, links die. Insert images/links using edit functions. Make your post self-contained. — philipxy, Oct 01 '19 at 22:32
@philipxy — It wouldn't be an altogether a bad idea to up-vote the x-ref'd question. While it's not great, it's not bad and did warrant an answer (a good answer — a much more comprehensible version of my comment here) and will probably become the master for this duplicate. Lemme know when you've read this and I'll remove it. An up-vote for this comment, for example. — Jonathan Leffler, Oct 01 '19 at 22:43

Renzo · Accepted Answer · 2019-10-02T06:14:46.380

You are correct thinking that keeping all the data together can still be qualified as 1NF. Splitting the data is not a necessary condition to have the 1NF.

In fact, this is exactly what differentiate generic “tables” (like those of Excel and similar programs), from “relations”, which is the basic data structure of relational model, structures managed by Database Management Systems, and for which the different normal forms (like 2NF, 3NF, Boyce-Codd Normal Form, etc.) have been “invented”: a relation is like a table, with three very important differences:

Each cell has one and only one value
The order of the rows is not important
The rows are all different

Conditions 2 and 3 means that we are in presence of a set, and note that it is not necessary to have a key (or candidate key, as it is usually called) formed by a strict subset of the attributes: simply a candidate key could be composed by all the attributes, also in this case it is true that all the rows are different.

So, passing from “tables” to “relations” means that all the above three conditions are true, and allow the building of the so-called “normalization theory”, which, even if “theory”, has important practical consequences on data management techniques.

Finally, note that for some people the 1NF can be properly defined only when we are already talking about relations, i.e. sets, so, it reduces to the first condition (each cell has one and only one value). So, when a relation is not in 1NF, one can talk of non-1NF relations, where each datum can have multiple values. This is not really important, since in the normalization theory the “important” normal forms are the Third, the Boyce-Codd’s, and higher ones.

First Normal Form Ambiguity?

1 Answers1