0

For an assignment I have created a database driven web application. I have to show my understanding of normalisation by showing my database in de-normalised form, and then normalising it gradually, explaining what was done at each stage.

The normalisation process at stages 1 to 3 (which is as far as we have to go) I have no trouble understanding.

My database contains 20+ tables and I don't know how I am supposed to represent this is 0NF. The main difficulty is due to the fact that, as I have understood, 0NF data is in a single table. In fact, I don't see any way around this because 0NF has no primary keys, and therefore there would be no way to reference data in other tables.

Am I right in thinking this? Or can I represent 0NF data in multiple tables, which would make this task a lot easier as I wouldn't have a 100+ column table.

philipxy
  • 14,867
  • 6
  • 39
  • 83
Armada
  • 718
  • 8
  • 19
  • Frammo - I'd recommend a Google on "Normalization examples". Many of the hits will give examples of 0NF, 1NF, etc. and they'll show you how to migrate them toward 3NF. This information is out there in very understandable form. – Ed Gibbs Apr 26 '13 at 14:17
  • I've looked - doubt I'll be able to find an example with 100+ columns – Armada Apr 26 '13 at 14:19
  • Denormailization often decreases the number of tables, but not always. In particular denormalization of an already partially denormalized structure often simply _fattens_ some tables while _thinning_ others. – Pieter Geerkens Apr 26 '13 at 14:27
  • Imagine a table that has a student's name, plus columns for each of their phone numbers (`phone1`, `phone2`, etc.), plus columns for each course they're taking along with grades and year and semester and professor and credits (`Course1Name`, `Course1Year`, `Course1Sem`, `Course1Grade`, `Course1Prof`, `Course1Credits`, `Course2Name`, etc.). You'll hit 100+ columns before you know it :) – Ed Gibbs Apr 26 '13 at 14:28

3 Answers3

0

0NF is a single table - like a spreadsheet of data. You wouldn't reference any other tables, you would simply repeat the data in the one table.

For example, imagine a messaging system:

Customer | Recipient | Message
Bob        John        Hello John
John       Bob         Hello
Bob        John        Have you got time to answer a question?
John       Bob         No way

We don't have a table containing the Person to link to, we repeat Bob or John in the customer column and in the recipient column.

Fenton
  • 241,084
  • 71
  • 387
  • 401
  • Would I be able to generate this table by FULL JOINING across all FK relationships in the DB? – Armada Apr 26 '13 at 14:29
  • You would be able to get the data in this format by joining and by not selecting the key columns. – Fenton Apr 26 '13 at 14:32
0

0NF data can occur in multiple tables, each of which may be 0NF, but one table for everything is the worst form.

This may very well be the case of an assignment where you first have to fuck up your spontaneous solution first, so you can show the process of how to make it better.

Adder
  • 5,708
  • 1
  • 28
  • 56
  • In this case how would I reference from one table to another? – Armada Apr 26 '13 at 14:22
  • In the single-table form of a 0NF you would just repeat the data, similar to what a JOIN over two tables would result in. For example, for every Customer named Bob, you would repeat the address data in an address column, and to make it worse you would format the address data as a single string instead of splitting data into multiple columns for often-used repeated data, like the City and Zip-Code. – Adder Apr 26 '13 at 14:25
  • I've just realised I could generate the 0NF table by 'FULL JOIN' ing all my actual tables, right? – Armada Apr 26 '13 at 14:27
  • Yes, and to make things worse you could consider inconsistencies in the data, like having Bob use two addresses instead of one, and then you could explain how the normalization process helps to enforce a business rule of one address per person, or the advantages of having two persons named Bob. – Adder Apr 26 '13 at 14:31
0

You mean "unnormalized" not "de-normalized". The latter is when normalized base tables are replaced by others whose values are always the join of the orignals. You need to find out from whoever gave you the asignment whether unnormalized form here means your first design attempt or specifically a "universal relation" that is an appropriate join of all those. That would be de-normalizing.

Every base table and query result holds the rows that make some predicate (statement parameterized by columns) into a true proposition (statement).

  • SELECT * FROM EMP "employee [E] is named [N] and has dependent [D]"
  • SELECT * FROM DEP "employee [E] works for department [D]"
  • query SELECT E, N FROM EMP
    for some D, "employee [E] is named [N] and has dependent [D]"
    ("employee [E] is named [N] and has some dependent")

An SQL FROM makes a temporary table that you can think of as having columns T.C for each column C of each table T. For inner JOINs (ie INNER, CROSS and plain) this temporary table is a cross join. It's predicate is the AND of the predicates of the joined tables. ON and WHERE conditions are also ANDed into the predicate. The SELECT clause renames the temporary columns so there are no "."s. (Athough SQL does that implicity if there's no ambiguity.)

  • query SELECT EMP.E AS E, N, DEP.D AS D FROM EMP JOIN DEP
        "for some EMP.D, employee [EMP.E] is named [EMP.N] and has dependent [EMP.D]"
    AND "employee [DEP.E] works for department [DEP.D]"
    (ie "employee [E] is named [N] and has some dependent and works for department [D]")

Note that it doesn't matter what constraints hold. (Including UNIQUE, PRIMARY KEY,FOREIGN KEY & CHECK). Constraints just tell you that tables are limited in the values they will ever hold. In fact the constraints are determined by the predicates and the situations that can arise.

If you know that it's always the case that T1.C = T2.C for some column C of tables T1 & T2 then you only have to SELECT one of them, AS C. If every column C is always equal in every table then NATURAL JOIN does the appropriate = and the AS without having to mention any columns.

(More re predicates & SQL.)

PS The single-base version of a database is not a base whose value is the FULL ( OUTER ) JOIN of separate bases. First, normalization does not deal with NULLs, so you would have to remove them from any OUTER JOIN result, more or less giving you your tables back. Second, FULL JOIN is in general not associative, ie (T1 FULL JOIN T2) FULL JOIN T3 <> T1 FULL JOIN (T2 JOIN T3), so there is no such thing as "the FULL JOIN` of more than two tables". Third, even with just two tables their FULL JOIN does not in general allow you to reconstruct their values.

PPS There is no "0th Normal Form". There are different uses of "1st Normal Form". Sometimes it just means being a relation, and sometimes it means being a relation with no relation-valued attributes, and it is also frequently used in various other confused/nonsensible ways that are really about aspects of good design.

philipxy
  • 14,867
  • 6
  • 39
  • 83