Exactly what is 2NF and 3NF?

Question

What's the main point of Normalization?

I mean if a normal form is not in 2NF, it is because of partial dependency i.e. a non key attribute is dependent on a part of a candidate key. So, let's say, for a relation R(A,B,C) with FDs:

AB->C, B->C

Clearly, AB is the candidate key and B->C is the partial dependency.

Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.

Now, if a relation is not in 3NF, it is because a non key attribute is dependent on another non key attribute i.e. to say if FDs for a relation R(A,B,C) are:

A->B,B->C

Clearly, A is the key and B->C shows transitive dependency, so not in 3NF.

Solution: Decompose the relation such that (B,C) forms a new relation with B as the key.

So, what's the exact difference? I mean, why such a marked distinction? Essentially in both of the cases the action is same. Decompose the relation using the dependency where the determinant (B here) is either PART of a key or not. Why have separate terms like partial dependency or transitive dependency? Why not just see, if there exists a dependency wherein a non prime attribute is determined by a something which is NOT a candidate key( no matter whether it is a partial key or another non prime attribute )

Why can't we implement a method like this:

1 NF -- having all elements in the atomic form
X NF -- if there's any dependency of the form non_key -> non_prime_attribute(s) , decompose the relation with one of the new relation having this particular "non_key" as the key with those non_prime_attributes.
BCNF : Where for all the dependencies of the form X->Y, X is a superkey?

Can we have such NF condition format? Does it combine all the conditions?

Hi. Your language was not clear before, and now you have added more that is really unclear. But as I said on a comment on my answer about your addition when you put it in a comment, my extended answer seems to address your addition. — philipxy, Mar 03 '18 at 03:56
Your question about why don't we study this way is, perhaps, the heart of the matter. I haven't studied this stuff for more than 30 years, and I still tend to respond as if the concepts were in exactly the same state now as they were back then. It's possible that newer texts or tutorials have come up with better ways of organizing and presenting the material. — Walter Mitty, Mar 03 '18 at 20:22
Meanwhile, when I first started posting on SO, many newbies had never even heard of normalization. Nowadays, a lot of participants breezily toss off, "normalized design" as though it were merely a synonym for "good design". — Walter Mitty, Mar 03 '18 at 20:25

philipxy · Answer 1 · 2018-03-03T03:49:59.760

So, what's the exact difference?

2NF is not 3NF & definitions of 2NF are not definitions of 3NF. There isn't any particular semantic or syntactic structural similarity that would leave some kind of "difference" other than that a 2NF relation can have the sort of problem FD (functional dependency) that violates 3NF that a 3NF relation doesn't have. You can find definitions all over the place. You almost give them correctly here yourself. But a NF (normal form) is a condition, not a process. What do you mean "actions are the same"? Being in 3NF implies being in 2NF, so naturally decomposing to 3NF also gives 2NF. But there are relations that are in 2NF but not in 3NF, and there may be decompositions for a relation to 2NF that don't get to 3NF. Those decompositions will involve in a removal of all problem partial FDs that does not result in the removal of all problem transitive FDs.

(Because 3NF is always achievable and there are no other disadvantages compared to 2NF, 2NF isn't even useful. It's just a condition that was discovered first that is not as strong as 3NF.)

(3NF is frequently defined in terms of 2NF plus no transitive dependencies of non-prime attributes on CKs, but actually no such FDs implies no partial FDs of non-prime attributes on CKs, hence 2NF, so the first condition is redundant.)

Why not just see, if there exists a dependency wherein a non prime attribute is determined by a something which is NOT a candidate key

Why should that condition be helpful? It is not a description of just getting rid of the problem FDs of 2NF & 3NF--that's what putting into 3NF does.

Getting rid of non-trivial FDs that are not determined by superkeys happens to give BCNF. It implies 2NF & 3NF. But it is different from both of them. A BCNF relation exhibits no FD-based update anomalies. It is always achievable. However 3NF is alway achievable while "preserving FDs", whereas BCNF is not. There are cases where in order for a FD that held in the original to be enforced in a view/query that gives it via constraints on its components we need an EQD (equality dependency) constraint. That says two column sets have the same set of subrow values, which is more expensive to enforce than a FD. Either you have BCNF & an EQD & fewer update anomalies or you have 3NF/EKNF & a FD & certain update anomalies.

The NF that really matters is 5NF, which implies BCNF, with no update anomalies & with other benefits. (We might then decide to denormalize for performance reasons.)

PS Normalization to a given NF does not necessarily involve normalization to lower NFs.

Re "why don't we study this way?" My answer *is* how normalization to higher NFs is taught. But there is *more* beyond 3NF. So I don't understand your sentence. And the rest of your comment is not clear. You could ask another question, but you need to use enough sentences to clearly say what you mean. But see my new answer section re BCNF. I appended it before I saw your comment but maybe it answers it. PS A database teaching problem is [there is no single notion of "1NF" & "atomic" is nonsense with nothing to do with normalization to higher NFs](https://stackoverflow.com/a/40640962/3404097). — philipxy, Mar 03 '18 at 03:46
The NFs were defined in terms of their predecessors. If a table is not in 2NF and it is desired to put it in 3NF, the first step is to put it in 2NF. It's taught that way as a consequence of the defnitions. If there were an easly teachable method for taking a table in 1NF and putting it in BCNF, that might be better than what is now taught. That still leaves 4NF and 5NF unaddressed. — Walter Mitty, Mar 03 '18 at 12:20
@WalterMitty It is not necessary to define NFs in terms of predecessors, there are many for 2NF & 3NF that don't, and the very BCNF definition here doesn't. First normalizing to a lower NF than one ultimately wants can prevent further normalization from finding good designs. There are algorithms for moving directly into 3NF (actually EKNF) or BCNF, as many textbooks present. [Why aren't all the normal forms used?](https://stackoverflow.com/a/34468720/3404097) [Skipping steps in Normalization?](https://stackoverflow.com/a/32740014/3404097) — philipxy, Mar 03 '18 at 13:41
1) I love the idea of the days of yore when 2NF had been discovered but 3NF had yet to be :) 2) Never heard of a practical reason why 3NF might be favoured over BCNF, is appreciated.. 3) You could mention that 5NF is always achievable. — onedaywhen, Mar 13 '18 at 14:29

Walter Mitty · Answer 2 · 2018-03-03T19:56:30.050

It almost sounds as though you want to know why they called these two normal forms by different names instead of inventing just one form that covers both cases. If that's not the case, please ignore this answer.

Part of the answer is that the forms weren't discovered at the same time. And part of the answer is that the problem with 1NF that gave rise to 2NF is not the same as the problem with 2NF that gave rise to 3NF, even though they both exhibit harmful redundancy.

What might satisfy you a little more is BCNF. BCNF was actually discovered later than 4NF, so that name was already in use. But BCNF has to be placed between 3NF and 4NF, because it is more restrictive than 3NF but less restrictive than 4NF. So it was discovered "out of sequence", so to speak.

In BCNF, every (non trivial) determinant is a candidate key. That seems to be what you are looking for. I conjecture that any relation that is in 1NF and where every determinant is a candidate key, could be shown to be in 2NF and 3NF. But the proof is beyond me.

Telling people wrong things impedes their getting things right. Also it's not trivial determinants, it's trivial FDs, my comment should have been, BCNF is iff every determinant of a *non-trivial* FD is a *superkey*. (According to your answer, for R on (a,b,c) with cover {a->bc}--which has CK set {a} & is in BCNF--it's not in BCNF since there are non-CK determinants since b->b holds (trivial FD) or since ab->c holds (non-CK superkey determinant).) — philipxy, Mar 03 '18 at 14:02

score 1 · Answer 3 · answered Mar 03 '18 at 07:57

2NF and 3NF are essentially historical concepts and your question is a reasonable one. There is no real reason to apply them in practical database design because better tools exist today.

When it comes to teaching there is possibly some justification for mentioning 2NF and 3NF. Doing so allows students to explore the concepts involved (as you have done) while also teaching them a bit about the origins and rationale of design theory. In school maths lessons I was taught long division and differentiation from first principles. No one uses those techniques in practice, they are just teaching aids.

You make a very good point about the difference between good teaching practices and good current practices. When I learned this stuff in the 1980s, normalization was still a novel concept. Where I worked, there were more databases using the CODASYL model than using the relational model. — Walter Mitty, Mar 03 '18 at 20:12

score 0 · Answer 4 · answered Mar 03 '18 at 01:44

Before checking for 2NF the relation should be in 1NF. In simple words 2NF have only full dependencies only, no partial dependencies in relation. Full dependency means if x gives y, then by removal of any element in x, then y is not having any relation. If by removal of x, you are having relation with y then it is partial dependency. For 3NF we have to check for the 2NF, in 3NF we should not have any transitive relations like if x gives z, then there is no relation like x gives y and y gives z.

Solution for 2NF create a table for the partial dependcies and add foreign key in new relation which is primary key on the previous relation.

Solution for 3NF create a relation for both x gives y and y gives z. Add keys to relations.

Exactly what is 2NF and 3NF?

4 Answers4