You're right to question this requirement, since apparently your colleague didn't explain clearly why it was so important.
The fact is there is no rule in database normalization that requires that we use a pseudokey, that is an integer like your example of Type_id
.
It's a good idea to create a lookup table of User_type
so you have a concise list of the user types allowed by your system. If you add a new type, you only need to INSERT a row to that lookup table, and then the Users
table that references the lookup table will be allowed to use the new type.
But there's no reason dictated by rules of normalization that the reference has to be to an integer. You can make the primary key of that lookup table a string.
CREATE TABLE User_type (
Type VARCHAR(10) NOT NULL PRIMARY KEY
);
CREATE TABLE Users (
...
Type VARCHAR(10) NOT NULL,
FOREIGN KEY (Type) REFERENCES User_type(Type)
);
So why did your "someone" tell you that you have to use an integer Type_id
?
Well, there could be reasons, but they're not due to rules of normalization.
- INTEGER may be smaller than the string stored in VARCHAR(10), so the data will take up less space. In MySQL, an INTEGER is four bytes. A VARCHAR is variable-length, depending on the string you store. For example "Admin" takes five bytes, plus one byte to encode the length of the string. So it's at least 50% larger than an INTEGER, and longer strings take even more space. But this is probably not worth optimizing, unless you store hundreds of millions of users.
- Comparing integer to integer is slightly faster than comparing strings (especially if the string has a character set other than binary), but this is premature optimization. Are you going to run these comparisons so frequently that you need to shave off a few nanoseconds? Maybe, but you haven't said so.