Which table design is prefered when considering database normalization rules:
Consider the following designs for a node table that describes an ordered tree. The key is a composite key.
- ck: composite key
- parent: parent node
- next: the next node, where the
next.parent = parent.
(this defines a forward linked list) - sk: surrogate key
design 1:
node(ck_x, ck_y, parent_ck_x, parent_ck_y, next_ck_x, next_ck_y)
design 2:
node(sk, ck_x, ck_y, parent_ck_x, parent_ck_y, next_ck_x, next_ck_y)
design 3:
node(sk, ck_x, ck_y, parent_sk, next_sk)
The first design has 6 columns. The second design adds a surragete key and has 7 columsn. The third design has 5 colums it uses the surrogate to save a column.
Are there any normalization rules (or other database design rules) that prefer one design over the others?
update
Alternative designs: subtype the node table, isParent flag, nested sets. These designs have a larger read/write complexity.
design 4:
This design splits the table into 3 tables. The parent and next table contain a mutually exclusive subset of the keys from the node table. It uses
2+4=6
columns for each node.node(ck_x, ck_y)
parent(ck_x, ck_y, parent_ck_x, parent_ck_y)
next(ck_x, ck_y, next_ck_x, next_ck_y)
design 5:
This design uses a isParent flag to indicate that the next item is the parent. It uses
4+1=5
columns, and 1 column is only a bit. Which is less space than 5 columns as used in design 3)node(ck_x, ck_y, next_ck_x, next_ck_y, isParent)
design 6:
This design uses a nested set to create an ordered tree. The composite key is no longer used to define the parents or the order of the children. It uses
2+2=4
columns. But the lower and upper bound column should be both using thesizeof(ck_x)+sizeof(ck_y)
which is equal the space of 6 columns used in design 1.node(ck_x, ck_y, lowerBound, upperBound)
update
design 7:
this uses a index for the position of the node.
node(ck_x, ck_y, parent_ck_x, parent_ck_y, index)
Notes
Using the previous node i.s.o. the next node reduces creation and addition to a single insert compared to insert and update.
Normalization is not about the number of columns or tables.