
I'm looking for a schema to handle optional fields. A record with the field present must be distinct from the record without the field.

Let's take an ingredient as an example.
An ingredient has a name (title), category name and an optional variety name.

+---------------+-------+----------+  
| ID_Ingredient | Title | Category |  
+---------------+-------+----------+  

An example is an apple. For some recipes, any apple will suffice. But there are different varieties of apple, such as "Granny Smith" and "Red Delicious". Some apple pie recipes call for a blend of Granny Smith and McIntosh varieties.

In order to handle this dilemma, the variety is optional.

My present schema is to have two tables: Optional_Variety and Variety:

+------------+--------------+  
| ID_Variety | Variety Name |  
+------------+--------------+  

+---------------+------------+
| ID_Ingredient | ID_Variety | // Optional_Variety table  
+---------------+------------+  

If an ingredient has a variety, the variety can be found by searching the Optional_Variety table.
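For illustration, the lookup described above could be sketched as follows. This is only a minimal sketch under stated assumptions: it uses Python's built-in `sqlite3` module in place of MySQL, the column `Variety_Name` (with an underscore) and the helper `variety_of` are hypothetical names, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ingredient (
        ID_Ingredient INTEGER PRIMARY KEY,
        Title         TEXT NOT NULL,
        Category      TEXT NOT NULL
    );
    CREATE TABLE Variety (
        ID_Variety   INTEGER PRIMARY KEY,
        Variety_Name TEXT NOT NULL
    );
    -- At most one variety per ingredient, so ID_Ingredient is the key here.
    CREATE TABLE Optional_Variety (
        ID_Ingredient INTEGER PRIMARY KEY REFERENCES Ingredient(ID_Ingredient),
        ID_Variety    INTEGER NOT NULL REFERENCES Variety(ID_Variety)
    );
    -- Hypothetical sample data: a generic apple and a Granny Smith apple.
    INSERT INTO Ingredient VALUES (1, 'Apple', 'Fruit'), (2, 'Apple', 'Fruit');
    INSERT INTO Variety VALUES (1, 'Granny Smith');
    INSERT INTO Optional_Variety VALUES (2, 1);
""")

def variety_of(conn, ingredient_id):
    """Return the variety name for an ingredient, or None if it has none."""
    row = conn.execute(
        """
        SELECT v.Variety_Name
        FROM Ingredient AS i
        LEFT JOIN Optional_Variety AS ov ON ov.ID_Ingredient = i.ID_Ingredient
        LEFT JOIN Variety AS v ON v.ID_Variety = ov.ID_Variety
        WHERE i.ID_Ingredient = ?
        """,
        (ingredient_id,),
    ).fetchone()
    return row[0] if row else None
```

The `LEFT JOIN` keeps ingredients that have no row in `Optional_Variety`, so a missing variety simply comes back as `None`.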

My dilemma is that if an ingredient has a variety, then it should have a unique ID_Ingredient identifier. After all, a Granny Smith apple is different from a Red Delicious apple (in both look and taste), so they should be unique ingredients.

My other solution is to have the variety field in the Ingredient record:

+---------------+-------+----------+---------+  
| ID_Ingredient | Title | Category | Variety |  
+---------------+-------+----------+---------+  

My issue with this schema is that many ingredients don't have a variety, and I'd be wasting database space by having empty fields.

In my actual implementation, there are many optional fields for ingredients, so the amount of wasted database space for empty fields would multiply.
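As a rough illustration of the single-table option, here is a minimal sketch in which `Variety` is a nullable column and every row, with or without a variety, gets its own `ID_Ingredient`. It assumes Python's built-in `sqlite3` module in place of MySQL, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Ingredient (
        ID_Ingredient INTEGER PRIMARY KEY,  -- assigned automatically per row
        Title         TEXT NOT NULL,
        Category      TEXT NOT NULL,
        Variety       TEXT                  -- NULL means "any variety"
    )
""")

# Hypothetical sample data: one generic apple and two specific varieties.
rows = [
    ("Apple", "Fruit", None),
    ("Apple", "Fruit", "Granny Smith"),
    ("Apple", "Fruit", "Red Delicious"),
]
conn.executemany(
    "INSERT INTO Ingredient (Title, Category, Variety) VALUES (?, ?, ?)", rows
)

# Every apple row has its own identifier, whether or not a variety is set.
apples = conn.execute(
    "SELECT ID_Ingredient, Variety FROM Ingredient "
    "WHERE Title = 'Apple' ORDER BY ID_Ingredient"
).fetchall()

# The generic apple is found with IS NULL, not with "= NULL".
generic = conn.execute(
    "SELECT ID_Ingredient FROM Ingredient "
    "WHERE Title = 'Apple' AND Variety IS NULL"
).fetchall()
```

One caveat with this layout: a `UNIQUE` constraint over `(Title, Variety)` would not stop a second generic apple from being inserted, because SQL treats `NULL` values as distinct from one another in unique indexes.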

The question is:
What is the schema for handling records with an optional field, such that when the field is present, it is different (a.k.a. having a unique identifier) than the record without the field?

Platform: MySQL on Windows 7, 64-bit; MySQL on Windows Vista, 32-bit.
I'm using C++ ODB with Object Relational Mapping to interface with the database.

Thomas Matthews

1 Answer


You would go with three tables:

base_ingredient (id_base_ingredient, ingredient_name, ....)

variety (id_variety, id_ingredient, variety_name)

ingredient (id_ingredient - PK , id_base_ingredient - FK NOT NULL, id_variety NULL)

and you wouldn't need a different id_ingredient for each variety, because you would be storing the ingredient
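A minimal sketch of how these three tables might fit together, using Python's built-in `sqlite3` module in place of MySQL. One loudly stated assumption: the `id_ingredient` column listed for `variety` is read here as a reference to the base ingredient and is renamed `id_base_ingredient`; that is an interpretation, not something the table list above confirms. The sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
conn.executescript("""
    CREATE TABLE base_ingredient (
        id_base_ingredient INTEGER PRIMARY KEY,
        ingredient_name    TEXT NOT NULL
    );
    -- Assumption: a variety belongs to a base ingredient, not to an ingredient.
    CREATE TABLE variety (
        id_variety         INTEGER PRIMARY KEY,
        id_base_ingredient INTEGER NOT NULL
            REFERENCES base_ingredient(id_base_ingredient),
        variety_name       TEXT NOT NULL
    );
    CREATE TABLE ingredient (
        id_ingredient      INTEGER PRIMARY KEY,
        id_base_ingredient INTEGER NOT NULL
            REFERENCES base_ingredient(id_base_ingredient),
        id_variety         INTEGER REFERENCES variety(id_variety)  -- optional
    );
""")

# Insert order follows the foreign keys: base ingredient, variety, ingredient.
apple = conn.execute(
    "INSERT INTO base_ingredient (ingredient_name) VALUES ('Apple')"
).lastrowid
granny = conn.execute(
    "INSERT INTO variety (id_base_ingredient, variety_name) "
    "VALUES (?, 'Granny Smith')",
    (apple,),
).lastrowid
any_apple = conn.execute(
    "INSERT INTO ingredient (id_base_ingredient, id_variety) VALUES (?, NULL)",
    (apple,),
).lastrowid
granny_apple = conn.execute(
    "INSERT INTO ingredient (id_base_ingredient, id_variety) VALUES (?, ?)",
    (apple, granny),
).lastrowid

# Resolve each ingredient row to its name and optional variety.
labels = conn.execute("""
    SELECT i.id_ingredient, b.ingredient_name, v.variety_name
    FROM ingredient AS i
    JOIN base_ingredient AS b ON b.id_base_ingredient = i.id_base_ingredient
    LEFT JOIN variety AS v ON v.id_variety = i.id_variety
    ORDER BY i.id_ingredient
""").fetchall()
```

Under that reading, the generic apple and the Granny Smith apple are separate rows in `ingredient` with separate `id_ingredient` values, and nothing has to be saved before the row it points to exists.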

One other question: how many ingredients and varieties do you expect to store? Oracle's MySQL manual says storing NULLs is pretty cheap (http://dev.mysql.com/doc/refman/5.5/en/storage-requirements.html):

In addition, while a NULL itself does not require any storage space, NDBCLUSTER reserves 4 bytes per row if the table definition contains any columns defined as NULL, up to 32 NULL columns. (If a MySQL Cluster table is defined with more than 32 NULL columns up to 64 NULL columns, then 8 bytes per row is reserved.)

Bogdan
  • The quote refers to the NDB storage engine, used by MySQL Cluster. [This reference](http://stackoverflow.com/a/230923/1446005) might be a better choice. Regardless of the storage requirement of a `NULL` value, a nullable column is precisely the way to model an "optional field". Storage requirements should not dictate the schema design. – RandomSeed Jan 16 '15 at 09:45
  • There appears to be a circular reference. In order to store the `variety`, the `id_ingredient` must be known. In order to determine the value for `id_ingredient`, the `ingredient` must be saved, but it has an `id_variety` field which cannot be resolved until the `variety` record is submitted. – Thomas Matthews Jan 19 '15 at 21:04