I'm a bit confused about a certain piece of database normalization and thought I'd ask StackOverflow:
Imagine you have the following relations that relate products to colors. Notice that Product 1 and Product 2 both use the same set of colors (Blue and Green).
Product_Color Color
+-------------+-------------+ +-------------+-------------+
| Product* | Color* | | ColorId* | Name |
+-------------+-------------+ +-------------+-------------+
| 1 | 1 | | 1 | Blue |
| 1 | 2 | | 2 | Green |
| 2 | 1 | +-------------+-------------+
| 2 | 2 |
+-------------+-------------+
If I create two new relations, ColorSet and ColorSet_Color, I can display the same information by joining the 4 relations together.
Product_ColorSet: ColorSet_Color:
+-------------+-------------+ +-------------+-------------+
| Product* | ColorSetId* | | ColorSetId* | ColorId* |
+---------------------------+ +-------------+-------------+
| 1 | 1 | | 1 | 1 |
| 2 | 1 | | 1 | 2 |
+-------------+-------------+ +---------- --+-------------+
ColorSet: Color:
+-------------+ +-------------+-------------+
| ColorSetId* | | ColorId* | Name |
+-------------+ +-------------+-------------+
| 1 | | 1 | Blue |
| 2 | | 2 | Green |
+-------------+ +----------[--+-------------+
At this point if I had a large Product_Color table, with a reasonable degree of shared groups of colors, I would stand to gain considerably from a space perspective.
What is the technical name for this operation in the context of database normalization? I'm clearly removing redundant information even though the entity I've created doesn't actually exist, it's rather more just random chance that there is a lot of overlap. What specifically am I changing by doing this?
Furthermore, it seems like I could arbitrarily do this to most entities. What puzzles me is that Product_Color and Color are already in 6th normal form when we started the exercise (right?).