2

Using the following as an example I am trying to figure out the best way set up a MySql database.

enter image description here

FoodGroupTypeCode: Grain, Vegetable, Fruit, Dairy, Protein

What is the best Data Type to use to represent this attribute?

  1. CHAR(1): G, V, F, D, P
  2. TINYINT: Using number codes (i.e. Grain = 1, Vege = 2)
  3. ENUM
  4. VARCHAR(9): Using full names
  5. Other (explain)

Option 4 is really not a consideration of mine, unless someone can really validate it. I have read a lot of "opinions" on the subject but looking for more concrete reasons for picking one over the other. I appreciate any input that someone can give to that degree rather then something such as "I like ENUM because it is fast."

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
swisscheese
  • 1,765
  • 4
  • 23
  • 25

5 Answers5

2

CHAR(1): G, V, F, D, P
Performant, storage efficient and results in readable code. I favour this solution when I know that the values are stable, and when I need to attach behaviour to the codes. (if code = 'G' then specific behaviour).

TINYINT: Using number codes (i.e. Grain = 1, Vege = 2)
Performant, storage efficient, but results in less readable code. This solution is also immune to confusion if values change (G was initially Grain, but now it is changed into Brain). This is most common.

ENUM
I don't see a single benefit of using enum other than a tiny performance benefit. Of course, the small performance gain will never be enough if you are truly having a performance problem.

VARCHAR(9): Using full names
Results in readable code. Certain queries will have less joins which in those specific cases provides performance benefits. Not storage efficient, and may give you problems if you want to change the values. If you have lots of tables with lots of rows and they are all referencing this table, think carefully before you take the risk.

Ronnis
  • 12,593
  • 2
  • 32
  • 52
  • @Ronnis Thanks for walking through each. Currently I am using CHAR(1,2,3) for these scenarios but have many tables that mirror the format of the tables in my question. It is easy for me to keep a cheat sheet for what everything stands for but wondering if I was taking the best approach even though the definitions do not exist within the database itself. Do you see any scaling problems with using CHAR? Scaling in relation to either the number of definitions for a given attribute or the number of tables using this format in a database? – swisscheese Mar 18 '11 at 19:46
  • You might as well add the definitions as reference tables, because sooner or later you will need populate a Combobox with those values :) More importantly though, it helps you enforce the integrity of the data. I don't see a scaling issue at all with using CHARs, as long as you don't go crazy with CHAR(100) or something. Keep them short but meaningful. – Ronnis Mar 18 '11 at 19:53
  • @Ronnis Thanks for the help. When you use Lookup / Reference tables is there a dedicated table for each set of definitions? Or can one table handle all of the definitions for the entire DB? – swisscheese Mar 20 '11 at 05:19
  • One for each. Remember to use pk/fk constraints on them. – Ronnis Mar 20 '11 at 09:03
  • @Ronnis I think I understood what a lookup table was without realizing it is being called a lookup table. With further reading I believe I have already modeled that. Right? Isn't the FoodGroupType my lookup table? The FoodGroupTypeCode is G, V, F, D, P and the description is Grain, Vegetable, Fruit, Dairy, Protein. A row will exist for each. – swisscheese Mar 20 '11 at 09:55
1

I would recommend ENUM, because it would constrain your possible selections to Grain, Vegetable, Fruit, Dairy, Protein.

However, in the database I usually do this as a VARCHAR(2) with a Check Constraint (can only be Grain, Vegetable, Fruit, Dairy, or Protein), then represent it in my app as an ENUM or List.

dseibert
  • 1,319
  • 9
  • 19
  • but then you lose the description – Jacob Schoen Mar 18 '11 at 17:52
  • Although I want to keep all the constraints within the DB as much as possible there are multiple ways to skin the cat to do so. Here are some agruements against ENUM that I came across [SO ENUM Question](http://stackoverflow.com/questions/4297149/enumyes-no-vs-tinyint-which-one-to-use), but I want a direct comparison to the example in my problem. – swisscheese Mar 18 '11 at 18:06
1

The choice of appropriate datatype entirely depends on how you'll be dealing with the table data in your application (if any), if you're having this table just for storing the records than VARCHAR would be a better option, since it'll make your data more verbose, but if there's a processing to be done on the table data like, filter results on certain criteria, long names could be confusing when it comes to comparing them in the application logic, and in such situations, CHAR or TINYINT would be a better option. While the performance impact based on the datatypes used should be a second priority.

Kushal
  • 3,112
  • 10
  • 50
  • 79
0

I'd use CREATE TABLE FoodGroupType (id int unsigned not null auto_increment PRIMARY KEY, code VARCHAR(20), description VARCHAR(100). So Food table will look

CREATE TABLE Food (id int unsigned not null auto_increment PRIMARY KEY, 
food_group_type_id int unsigned not null, name VARCHAR(50), description VARCHAR(100),
FOREIGN KEY(food_group_type_id) REFERENCES FoodGroupType(id) ON DELETE ... ON UPDATE ...
a1ex07
  • 36,826
  • 12
  • 90
  • 103
0

I personally would not go with Char(1) or tinyint because they are not descriptive. Looking at the data it is not as clear as it could be. Varchar and enum would be the two I would consider. I like the idea of enum though, as that is really what it is. You are saying all foods have to be one of these food types.

Plus I hear enum is fast!

Jacob Schoen
  • 14,034
  • 15
  • 82
  • 102
  • 1
    How much of a worry is it that the descriptive nature is held in the database rather then the relation that are in options 1 and 2? – swisscheese Mar 18 '11 at 18:10