The predicate is:
[p1]
Person named (FirstName) (LastName) is assigned ticket number (Ticket).
Constraints:
(c1.1)
Person is identified by first name and last name.
(c1.2)
Ticket is identified by ticket number.
(c1.3)
For each person, that person may be assigned more than one ticket.
(c1.4)
For each ticket, that ticket may be assigned to more than one person.
PersonTicket {FirstName, LastName, Ticket}
KEY {FirstName, LastName, Ticket}
NF Reasoning:
- The table is all-key, hence it is in the BCNF.
- It is in the BCNF, it is not possible to reduce redundancy by decomposition, hence it is in the 5th NF.
- It is in the 5th NF and all key, hence it is in the 6th NF.
Note that the reasoning about NF does not require any data, it must be valid for an empty relational variable (table) and any possible valid values that the table may hold.
Also note that the predicate [p1]
is a simple predicate, it generalises simple (elementary) facts, which is a telltale sign of the 6th NF. For example, the fact "Person named Oli Seitz is assigned ticket number 1.
"
can not be split into two sentences without losing information.
EDIT
It has been a long discussion (see comments), so I have to expand the answer a bit.
My basic premise is that the table {Ticket, Vname, Nname}
can not be decomposed into projections, or technically speaking that join dependency:
JD *{{Ticket, Vname}, {Vname, Nname}, {Ticket, Nname}}
does not hold. (note: Vname = first name, Nname =last name).
So let's take a closer look at this. What would it mean for the JD to hold? The JD can be verbalized as :
IF a person with a first name is assigned a ticket number AND a
person with a last name is assigned that ticket number AND there
exist a person with that first name and that last name THEN that
person with that first name and that last name is assigned that ticket
number.
As you first read this, it may feel right, -- for a minute or two -- but is plain wrong.
Take a look at the example:
Original | Three Projections
|
(T, First, Last ) | (T, First) (First, Last ) (T, Last )
-----------------------------------------------------------------
(1, 'Tom', '.....') | (1, 'Tom') ('Tom', '.....') (1, '.....')
(1, '...', 'Jones') | (1, '...') ('...', 'Jones') (1, 'Jones')
(2, 'Tom', 'Jones') | (2, 'Tom') ('Tom', 'Jones') (2, 'Jones')
Note that join of tuples
(1, 'Tom) ('Tom', 'Jones') (1, 'Jones')
produces an extra tuple
(1, 'Tom', 'Jones')
which is not in the original; hence the JD does not hold.
To put it loosely, a person identifier (First, Last)
is split. So, person's first name and last name are acting independently of each other; a severe logical error.
It seems that the sample data in the original question was carefully crafted to appear the JD holds;
the original data-sample will pass this test. However, insert just one tuple (row) in the original table:
(1, 'Tim', 'Seitz')
Now test again, this one generates two extra tuples when joining projections:
(1, 'Tim', 'Hofmann')
(2, 'Tim', 'Seitz')
In the reasoning section I have used terms like: simple predicate, elementary fact, information, identifier, and reducing redundancy. In the comments below these have been labelled as: fuzzy non-technical, unsound, wrong, vague, irrelevant, and nonsensical.
It is interesting that these fuzzy, vague, nonsensical, etc. terms lead to the correct result, as opposed to a "precise, highly technical, (algorithm assisted ?)" method which can result in severe logical errors.
So, for those who would like to take a closer look at these -- IMHO extremely useful and practical -- terms, I can suggest a few references.
[1]
About simple predicates and 6th NF:
C. J. Date; Database Design and Relational Theory;
Part III, Chapter 13, Sixth Normal Form
[2]
About elementary facts:
Terry Halpin, Tony Morgan; Information Modeling and Relational Databases;
Chapter 3, Conceptual Modeling: First Steps
Terry Halpin, What is an Elementary Fact?, paper 1993.
[3]
About 5th NF and reducing redundancy:
"To say that relvar R is in 5th NF is to say further non loss
decomposition of R into projections may be possible, but it won't
eliminate any redundancies."
"5NF guarantees freedom from redundancies that can be removed via
projection."
C. J. Date; Master Class (video): Database Design and Relational Theory;
Normalization JDs and 5NF (formal)-Part 2 of 2
[4]
About predicates and propositions:
Part I, Chapter 2, Predicates and Propositions
Part V, Chapter 15, Database Design is Predicate Design
in : C. J. Date; Database Design and Relational Theory.
"Conceptually, the database is a set of sentences expressing
propositions taken to be true of the UoD."
Terry Halpin, Tony Morgan; Information Modeling and Relational Databases;
Chapter 2, Information Levels and Frameworks
EDIT 2
And finally, here is some code (PostgreSQL) to test.
CREATE TABLE tbl_0 (
Ticket integer NOT NULL
, Vname text NOT NULL
, Nname text NOT NULL
, CONSTRAINT pk_tbl PRIMARY KEY (Ticket, Vname, Nname)
);
INSERT INTO tbl_0 (Ticket, Vname, Nname)
VALUES
(1, 'Oli', 'Seitz')
, (1, 'Andi', 'Hofmann')
, (2, 'Oli', 'Seitz')
, (2, 'Oli', 'Schmidt')
, (2, 'Tim', 'Schmidt')
, (3, 'Tim', 'Hofmann')
;
The following query generates three projections, joins them together and shows the difference between the original table and joined projections. In other words, does join dependency
JD *{{Ticket, Vname}, {Vname, Nname}, {Ticket, Nname}}
hold?
WITH
p_1 AS ( -- projection {Ticket, Vname}
SELECT DISTINCT Ticket, Vname FROM tbl_0
),
p_2 AS ( -- projection {Vname, Nname}
SELECT DISTINCT Vname, Nname FROM tbl_0
),
p_3 AS ( -- projection {Ticket, Nname}
SELECT DISTINCT Ticket, Nname FROM tbl_0
),
j_pro AS ( -- join projections
SELECT Ticket, Vname, Nname
FROM p_1
JOIN p_2 USING (Vname)
JOIN p_3 USING (Ticket, Nname)
),
d_0 AS ( -- tbl_0 - j_pro
SELECT Ticket, Vname, Nname FROM tbl_0
EXCEPT
SELECT Ticket, Vname, Nname FROM j_pro
),
d_1 AS ( -- j_pro - tbl_0
SELECT Ticket, Vname, Nname FROM j_pro
EXCEPT
SELECT Ticket, Vname, Nname FROM tbl_0
)
-- diff = (tbl_0 - j_pro) union (j_pro - tbl_0)
SELECT Ticket, Vname, Nname FROM d_0
UNION
SELECT Ticket, Vname, Nname FROM d_1
ORDER BY Ticket, Vname, Nname
;
In the first run, all looks ok -- appears that the JD holds, no difference:
+--------+-------+---------+
| ticket | vname | nname |
+--------+-------+---------+
However it's a trick question, the example was carefully crafted to produce this result. All it takes is adding one more row to reveal the problem:
INSERT INTO tbl_0 (Ticket, Vname, Nname)
VALUES (1, 'Tim', 'Seitz');
And now we get two extra tuples (run the query again).
+--------+-------+---------+
| ticket | vname | nname |
+--------+-------+---------+
| 1 | Tim | Hofmann |
| 2 | Tim | Seitz |
+--------+-------+---------+
To make it clear here is the table and the join of the three projections.
-- Added one more row (1, Tim, Seitz)
+--------+-------+---------+
| ticket | vname | nname |
+--------+-------+---------+
| 1 | Andi | Hofmann |
| 1 | Oli | Seitz |
| 1 | Tim | Seitz |
| 2 | Oli | Schmidt |
| 2 | Oli | Seitz |
| 2 | Tim | Schmidt |
| 3 | Tim | Hofmann |
+--------+-------+---------+
-- JOIN {{Ticket, Vname}, {Vname, Nname}, {Ticket, Nname}}
-- generates two extra rows
-- (1, Tim, Hoffman) and (2, Tim, Seitz)
+--------+-------+---------+
| ticket | vname | nname |
+--------+-------+---------+
| 1 | Andi | Hofmann |
| 1 | Oli | Seitz |
| 1 | Tim | Hofmann |
| 1 | Tim | Seitz |
| 2 | Oli | Schmidt |
| 2 | Oli | Seitz |
| 2 | Tim | Schmidt |
| 2 | Tim | Seitz |
| 3 | Tim | Hofmann |
+--------+-------+---------+
Bottom line, the JD does not hold, the original table {Ticket, Vname, Nname}
can not be decomposed into projections and is in 6th NF as stated in the first part of this -- too long by now -- answer.
Final Thoughts
And finally, what does it take to determine NF of a relational variable (table in 1NF)? The usual answer is: heading, functional dependencies, and join dependencies. But heading is a set of free variables from the matching predicate. Dependencies are simply formalized notation of constraints. So what does it take? Predicate and the matching set of constraints, that is all. How much data, how many rows? None, not a single tuple (row) is needed. It is good to have few samples to make sure we understand constraints properly, but not necessary. Focusing on sample data is a huge mistake. Relvar (table) is in a specific NF for any set of valid tuples (rows), empty set included. It should not change NF as data keeps streaming in, if implemented properly. The NF is determined on logical design level, way before SQL CREATE TABLE
is executed.