With a reference table, should I use the ID or the text field?

Question

I have a reference table item_type with a field id and a field name. In my application, items should be treated differently depending on their item_type.

In my application code, should my conditional logic check against the id or the name? Or is there another best practice for this?

Edit: id will be the primary key, but name will still be unique; these will be options in a drop-down list, and will determine what type of item the user is adding. Later on, these will determine how the item is treated.

If id and name are both unique, you can use either, but typically it's good to use whichever is the primary key. However, this depends on your need. Can you post your table and the query you're having trouble with? You'd likely get a better answer with that. — AdamMc331, Oct 27 '14 at 19:39

score 1 · Answer 1 · edited May 23 '17 at 12:20

This might be a tricky situation.

The fact that you are asking this question indicates that your design does not specify which keys exist for this particular table. I would call that an incomplete design; please revisit this part and explicitly specify all keys that exist for all your relations.
If you have both id and name (name is not a good choice for a column name, by the way, as it is an SQL keyword), I assume that id was added artificially, i.e. this column is not necessary to describe your data. This is a so-called surrogate key. You should be careful with those: they do not save you from duplicates! Imagine a case where you have:
```
 id |   name
----+--------
  1 | type_a
  2 | type_b
  3 | type_a
```
Although id is the primary key here and all its values are unique, you still have data duplication.
Therefore you must create two keys here: a primary key on the id column and a unique key on the name column. This is not a bad situation per se, but make sure you have both keys.
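
A minimal sketch of the two-key setup, using Python's built-in sqlite3 module purely for illustration (the answer does not name a database, and the helper `try_insert` is a hypothetical name). The unique constraint on name is what rejects a second 'type_a' even though its id is new:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE item_type (
        id   INTEGER PRIMARY KEY,   -- surrogate key
        name TEXT NOT NULL UNIQUE   -- natural key, guards against duplicates
    )
    """
)


def try_insert(type_id, name):
    """Hypothetical helper: return True if the row was accepted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO item_type (id, name) VALUES (?, ?)",
                (type_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, "type_a"),
    try_insert(2, "type_b"),
    try_insert(3, "type_a"),  # new id, duplicate name
]
```

Without the UNIQUE clause, the third insert would succeed and reproduce the duplicated table shown above.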

Personally, I use the following rules:

If the table is a dictionary with a small number of values (up to 10), I:
- use only one column in this table, making it varchar (or rather text);
- give this column the same name as the table;
- make this column the primary key.
This keeps the table clean and small. I also prefer dedicated tables over ENUMs.
If I know that the number of entries will grow, or that I will need to add more columns along the way, I:
- create a dedicated numeric column, named <table_name>_id (e.g. customer_id), and make it the primary key;
- use this PK elsewhere in the data model;
- create a unique constraint on the real data to avoid data duplication (this is mandatory).

EDIT: I don't see the need for id as a surrogate key at all for such a tiny table. I doubt the values in this table will change frequently, if they change at all. The cost of introducing an artificial key is that you must join to this table in order to check for entries of a certain type, whereas a natural text key lets you avoid the join and use queries like:

SELECT * FROM item WHERE item_type='type_a';
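
To make the trade-off concrete, here is a rough sketch of both designs side by side, again using Python's sqlite3 as a stand-in database. The `label` column and the `item2` / `item_type2` table names are assumptions added only so both designs can live in one script:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Design 1: the text value itself is the key.
    CREATE TABLE item_type (item_type TEXT PRIMARY KEY);
    CREATE TABLE item (
        id        INTEGER PRIMARY KEY,
        label     TEXT NOT NULL,
        item_type TEXT NOT NULL REFERENCES item_type (item_type)
    );
    INSERT INTO item_type VALUES ('type_a'), ('type_b');
    INSERT INTO item (label, item_type)
    VALUES ('first', 'type_a'), ('second', 'type_b');

    -- Design 2: surrogate id plus a unique name.
    CREATE TABLE item_type2 (
        item_type_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE
    );
    CREATE TABLE item2 (
        id           INTEGER PRIMARY KEY,
        label        TEXT NOT NULL,
        item_type_id INTEGER NOT NULL REFERENCES item_type2 (item_type_id)
    );
    INSERT INTO item_type2 VALUES (1, 'type_a'), (2, 'type_b');
    INSERT INTO item2 (label, item_type_id) VALUES ('first', 1), ('second', 2);
    """
)

# Natural key: filter directly on the referencing column, no join needed.
natural = conn.execute(
    "SELECT label FROM item WHERE item_type = 'type_a'"
).fetchall()

# Surrogate key: the readable value lives in the other table, so join to it.
surrogate = conn.execute(
    """
    SELECT i.label
    FROM item2 AS i
    JOIN item_type2 AS t ON t.item_type_id = i.item_type_id
    WHERE t.name = 'type_a'
    """
).fetchall()
```

Both queries return the same rows; the difference is only in how much of the schema each query has to touch.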

I recommend taking a look at this question: Is there a REAL performance difference between INT and VARCHAR primary keys?

In the end, you should know your design and do performance testing on it. That will give you real knowledge about what is best for you.

score 0 · Answer 2 · answered Oct 27 '14 at 20:11

If you have tables like this:

TABLE ITEMS:
id
type_id
name
etc_etc

ITEM_TYPES
id
name

Go by id. ALWAYS use an int column, preferably an indexed column (like your primary auto-incrementing ID column), unless you have a good reason not to. It's a computer; computers do numbers, not words :)

If this isn't enough (it kinda should be):
Say you stored the type as a string instead of the ID. You have a few places you know of (and some you forgot) that reference the 'Cyan' items. One year later, you decide 'Teal' would have been better, so you update the table. Now you have to go through all the code to change every 'Cyan' to 'Teal'. In a big environment this will fail.
The likelihood of an ID changing is a lot lower.
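
A small sketch of that rename scenario, using Python's sqlite3 module as an assumed stand-in for the real database. The sample rows and the helper `items_of_type` are made up for illustration; the table and column names follow the layout above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item_types (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE items (
        id      INTEGER PRIMARY KEY,
        type_id INTEGER NOT NULL REFERENCES item_types (id),
        name    TEXT NOT NULL
    );
    INSERT INTO item_types (id, name) VALUES (1, 'Cyan'), (2, 'Red');
    INSERT INTO items (type_id, name) VALUES (1, 'mug'), (2, 'cap'), (1, 'pen');
    """
)


def items_of_type(type_id):
    """Hypothetical helper: names of all items referencing the given type id."""
    rows = conn.execute(
        "SELECT name FROM items WHERE type_id = ? ORDER BY id", (type_id,)
    )
    return [name for (name,) in rows]


before = items_of_type(1)
# Rename the type; rows in items still point at id 1.
conn.execute("UPDATE item_types SET name = 'Teal' WHERE id = 1")
after = items_of_type(1)
```

The lookup by id returns the same items before and after the rename, whereas any code comparing against the string 'Cyan' would have stopped matching.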

Thorsten Kettner · Accepted Answer · 2014-10-28T10:29:01.937

There are two different ways to design tables:

Use the natural keys, i.e. an employee number, a country code, etc.

   create table country (code char(2), name varchar(100), ...);
   create table employee (empno number(5), name varchar(100), ...);
   create table orders (orderno number(5), country_code char(2), ...);

Use technical IDs:

   create table country (id number(9), code char(2), name varchar(100), ...);
   create table employee (id number(9), empno number(5), name varchar(100), ...);
   create table orders (id number(9), orderno number(5), id_country number(9), ...);

In any case you would not use the name in your program. This is just data you show to the user. You don't use it to access records.

As to the technical ID: These are only meant for references inside the database. You would only use them in your program when what you do is about joins. For instance: Let the user select a country from a list, then use its ID to access the orders placed in that country.

When your program itself needs to know about particular entries, you should use neither the name nor the technical ID. For instance, when you want to treat Great Britain differently from other countries because it is your home country, use its code 'GB'. Of course you can have your program select the ID for country 'GB' and compare your orders with that ID. Just never have something like `select ... from orders where id_country = 187` in your app.
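
As a sketch of that last point, the program below knows only the code 'GB' and asks the database for the matching ID at run time. It uses Python's sqlite3 module as an assumed stand-in database; the sample rows, the constant `HOME_COUNTRY_CODE`, and the helper `country_id` are hypothetical:

```python
import sqlite3

HOME_COUNTRY_CODE = "GB"  # the only country-specific constant the program knows

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE country (
        id   INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id         INTEGER PRIMARY KEY,
        orderno    INTEGER NOT NULL UNIQUE,
        id_country INTEGER NOT NULL REFERENCES country (id)
    );
    INSERT INTO country (id, code, name)
    VALUES (187, 'GB', 'Great Britain'), (42, 'FR', 'France');
    INSERT INTO orders (orderno, id_country)
    VALUES (1001, 187), (1002, 42), (1003, 187);
    """
)


def country_id(code):
    """Resolve a country code to its technical ID."""
    row = conn.execute(
        "SELECT id FROM country WHERE code = ?", (code,)
    ).fetchone()
    if row is None:
        raise LookupError(f"unknown country code: {code}")
    return row[0]


home_id = country_id(HOME_COUNTRY_CODE)
home_orders = [
    orderno
    for (orderno,) in conn.execute(
        "SELECT orderno FROM orders WHERE id_country = ? ORDER BY orderno",
        (home_id,),
    )
]
```

The number 187 appears only in the sample data, never in the application logic, so the same program keeps working against a database where 'GB' happens to have a different ID.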

As to your table: countries as in my example already have a code; you can use an ISO code. Your item types probably don't. So you invent a code. This can be a code you even show to users, so they may get used to them after some time and start talking of RCs rather than of racing cars as before. Or you keep the codes from the users and only use them inside programs, so they never see the code 'RC', but for all your programs racing cars are RCs.

So you either have

create table item_type (code char(2), name varchar(100), ...);

or

create table item_type (id number(9), code char(2), name varchar(100), ...);

and use the code field in your app regardless.

Just an additional remark: When using natural keys that are also used by your users, you would usually choose short codes such as 'RC', because these serve as references (foreign keys) and are easy to type in. When using IDs and only using the codes internally, you could also use long codes, such as 'RACING_CARS', for readability in your programs.
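
Putting it together for the item_type case, here is a rough sketch of application logic that branches on the code while other tables reference the ID. It uses Python's sqlite3 module as an assumed database; the `item` table, the 'FAMILY_CARS' type, the handling strings, and the helper `handling_for` are invented for illustration:

```python
import sqlite3

RACING_CARS = "RACING_CARS"  # long internal code, never shown to users

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item_type (
        id   INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE,   -- constant used by programs
        name TEXT NOT NULL           -- display text, free to change
    );
    CREATE TABLE item (
        id           INTEGER PRIMARY KEY,
        label        TEXT NOT NULL,
        id_item_type INTEGER NOT NULL REFERENCES item_type (id)
    );
    INSERT INTO item_type (id, code, name)
    VALUES (7, 'RACING_CARS', 'Racing cars'), (8, 'FAMILY_CARS', 'Family cars');
    INSERT INTO item (label, id_item_type) VALUES ('Model X1', 7), ('Model F2', 8);
    """
)


def handling_for(item_id):
    """Decide how to treat an item based on its type code, not its ID or name."""
    row = conn.execute(
        """
        SELECT t.code
        FROM item AS i
        JOIN item_type AS t ON t.id = i.id_item_type
        WHERE i.id = ?
        """,
        (item_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"unknown item: {item_id}")
    if row[0] == RACING_CARS:
        return "needs track inspection"
    return "standard handling"


# Changing the display name does not touch the code the program checks.
conn.execute("UPDATE item_type SET name = 'Race cars' WHERE code = ?", (RACING_CARS,))
handling = [handling_for(1), handling_for(2)]
```

The display name can be edited or translated and the ID can differ between databases; only the code has to stay stable for the program to keep behaving the same way.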

So I might have: `id` = 1, `code` = 'monthly', `description` = 'Monthly'; `id` = 2, `code` = 'month-to-date', `description` = 'Month-to-Date', etc, and check against the `code` property? — Brandon, Oct 28 '14 at 21:17
Yes. The code is a short constant you use in your programs. The description can be changed anytime (and maybe you even switch to multiple languages someday). The ID is used in other tables as a reference. — Thorsten Kettner, Oct 28 '14 at 22:31
