-2

I have been studying database design and programming for quite some time now, but I still can't get a grasp of understanding each individual normal form (1NF, 2NF, 3NF.)

Seeing as anytime the data is in Third Normal Form, it is already automatically in Second and First Normal Form, can the whole process actually be accomplished less tediously by fully normalizing the data from the start. I can accomplish this easily by arranging the data so that the columns in each table, other than the primary key, are dependent only on the whole primary key.

How important is it to understand each individual normal form if we can simply fully normalize the data less tediously by doing what I have described?

EDIT: What I'm ultimately asking is: Is it important to go through the steps of each normal form when normalizing data, or is it appropriate to just go to Third Normal Form seeing as the result is ultimately the same?

philipxy
  • 14,867
  • 6
  • 39
  • 83
Zampanò
  • 574
  • 3
  • 11
  • 33
  • 1
    I feel like this should be asked at [Database Administrators](https://dba.stackexchange.com/). – Mikey May 24 '18 at 01:27
  • 1
    The only NF that matters is 5NF. Anyway if you *wanted* a lower NF you should normalize to 5NF then denormalize. The main reason people settle for lower NFs is ignorance. There are certain costs & benefits, but people don't know or use them. Normalization to a given NF is not done by going through lower NFs, one uses an appropriate algorithm for the NF one wants. This is made clear by most textbooks, although some wrongly say to move through lower NFs, but putting into a lower NF can prevent good higher-NF versions of the original from turning up later. What is your textbook? – philipxy May 28 '18 at 05:27
  • My textbook is Oracle Database 12c The Complete Reference by Bob Bryla and Kevin Loney. – Zampanò May 28 '18 at 18:54
  • It is essential to understand Normalization when we are using Relational Databases. Databases are used to store data in tables, and usually, we normalise tables to create a relational schema. – vCillusion May 29 '18 at 08:42
  • Please clarify by editing your post, not by commenting. A manual for a product recording & manipulating designs is not an appropriate place to learn about designing. The book you mention is not an academic textbook. The 10g version's adressing of normalization is garbage. Dozens of published academic textbooks are free online in pdf. – philipxy May 31 '18 at 20:01

5 Answers5

4

I highly recommend understanding each normal form as this will help you determine or investigate any issues with a current database may have as sometimes you might not have the perfect scenario each time and understanding each normal form will help you to understand the current problems with an existing database design if there are any.

Going through step by step through the different normal forms will help you to figure out why we do this and this is to achieve the goals specified by E. F. Codd. The objectives of normalization were stated as follows: 1. To free the collection of relations from undesirable insertion, update and deletion dependencies. 2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs. 3. To make the relational model more informative to users. 4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.

Here is a image to help you understand the different normal forms better. P.S. BCNF is actually 3.5NF not 4NF

1NF, 2NF, 3NF, BCNF

Wilson M
  • 373
  • 2
  • 8
  • Normalization to a given NF is not done by going through lower NFs; one uses an appropriate algorithm for the NF one wants. Also the criteria in the image are wrong. Also if you got this from somewhere, give a reference so you are not plagiarizing & don't use that reference, because it is wrong. – philipxy Jun 06 '18 at 18:28
1

It's right that, when being in the 3. NF, you're also in the 2. and in the 1. NF. However, the only condition for the 3. NF is not only that all the data is only dependent on the whole candidate key. It also has the condition that it already is in the 2. NF, meaning that every property that is not the candidate key has to fully depend on the candidate key and that it is in the 1. NF, meaning that every column has to be atomar. So yes, it is important to understand every NF if you want to have a table in the 3. NF.

I'll try to explain the Normal Forms to you:

1. NF

The 1. NF states that every column has to be atomar. This means, there shouldn't be multiple items of data in one column. For example, the adress of someone shouldn't be stored in one column, but should be splitted in the country, the state, the street and so on. Each of these pieces of data should then be stored in their own column.

2. NF

The 2. NF states that every attribute, that is no part of the candidate key, has to be identificable only by the whole candidate key. That means for example that you shouldn't store books and printing labels in one table. Because then the name of the book would only be dependent on the id of the book, while the printing label's name would only be dependent of the id of the printing label and not of the whole candidate key.

3. NF

The 3. NF nearly states the same as the 2.: No column is allowed to be dependent on a non candidate key column. That means for example that you shouldn't store the IBAN of a book and an id of the book in the same table, with only the id being the candidate key, as you'd only need the IBAN to find the name to the book.


If this doesn't explain the matter well enough, there's a lot of information online regarding the normal forms (like Wikipedia).

MetaColon
  • 2,895
  • 3
  • 16
  • 38
  • @philipxy I'm sorry, this was a translation mistake I made. I corrected my post. And it is in fact worth noting that the definitions are strongly simplified and are only ment to make the matter understandable, not to be bullet-proof as the original definitions. – MetaColon Jun 02 '18 at 23:09
-1

its not the case that if its in 3 NF its in 1 NF nad 2 nd NF .it was like if its in 2nd NF it has to be in 1st NF beforehand .and same goes for 3NF .for normalising to 3NF it has to clear 1st and 2nd NF forms.

1st normal forms states that no multivalued attribute should be present. 2NF states that there should not be partial dependency on a non prime attribute . 3NF states that no transitive depedency should be there .

thank you

Nikhil S
  • 3,786
  • 4
  • 18
  • 32
  • This is unclear & wrong. Eg "its not the case that if its in 3 NF its in 1 NF nad 2 nd NF" is wrong & "if its in 2nd NF it has to be in 1st NF beforehand" is confused--being in 2NF is a *condition* (which implies being in 1NF) but certain *definitions* have multiple (simultaneous not ordered) parts and an *algorithm* might step through lower NFs (but the standard 3NF algorithm *doesn't*). Eg your definitions are wrong. Also, this doesn't answer the question. PS Whatever you are trying to say, please use enough clear sentences to say what you mean. – philipxy Jun 02 '18 at 23:18
-1

The only NF (normal form) that matters is 5NF.

A relation (value or variable) is in 5NF when for every way it can be losslessly decomposed the components can be joined back in some order where the common columns of each join are a superkey of the original. (Fagin's PJ/NF paper's membership algorithm.)

This allows a table to be the join of others with overlapping meanings but without update anomalies. (Although update anomalies cease at ETNF, between 4NF & 5NF.)

Anyway if you wanted a lower NF you should normalize to 5NF then denormalize. The main reason people settle for lower NFs is ignorance. There are certain costs & benefits, but people don't know or address them--code must restrict updates to account for the problematic update anomalies. Normalization to a given NF is not done by going through lower NFs; one uses an appropriate algorithm for the NF one wants. (This is made clear by most textbooks, although some wrongly say to move through lower NFs, but putting into a lower NF can prevent good higher-NF versions of the original from turning up later.)

PS There is no single notion 1NF and all it has in common with higher NFs is that both seek "better" designs.

halfer
  • 19,824
  • 17
  • 99
  • 186
philipxy
  • 14,867
  • 6
  • 39
  • 83
  • 1
    This doesn't really answer the question. – MetaColon Jun 03 '18 at 07:54
  • What question do you think is not answered? – philipxy Jun 04 '18 at 16:43
  • _What I'm ultimately asking is: Is it important to go through the steps of each normal form when normalizing data, or is it appropriate to just go to Third Normal Form seeing as the result is ultimately the same?_ – MetaColon Jun 04 '18 at 17:05
  • @MetaColon "Normalization to a given NF is not done by going through lower NFs; one uses an appropriate algorithm for the NF one wants." Also I then explain how the result is *not* "ultimately the same". – philipxy Jun 06 '18 at 18:24
-1

From what I recall of the process, it's a method that you follow to get to a state where the storage and search facilities of the database are fully optimised. Yes 3NF does encapsulate the rules below it, 1st and 2nd, but it is far easier to unpick the data if you start at the easier forms of normalization to see if your data is in an efficient format for storage in a RDBMS or SQL based database. Jumping in straight at a higher normal form makes the whole process for beginners harder and intimidating and to not analyse the data correctly. To be honest will make hard work when dealing with difficult data structures that are not just your usual invoice, invoice Lines, address stuff that you tend to deal with day in and day out. Going through the process of normalization, sometimes there is value in unpicking data structures that were not obvious from the start, which not only makes your data more efficient but also helps you reason over what you are trying to accomplish.

WeNeedAnswers
  • 4,620
  • 2
  • 32
  • 47