2

In my application I have a user which has a profile and an address. The relationship between those tables are:

  • users: id, type, registered, email...
  • profiles: user_id, address_id, first_name, last_name, gender, status, etc..
  • addresses: id, city, street, house_number, apartment

Those tables have some Enum fields on them, but I think it might not be efficient at-all since I'm going to do some intensive user's lookup based on their address and profile so I thought maybe I should use reference tables instead? (I also gain the indexing with an integer which is better).

For example, In profiles I have a status enum field which gets the following values for now:

  • single
  • married
  • widowed
  • divorced

so I thought about maybe having a statuses table and a foreign key on profiles - status_id.

Another dilemma about this is should I have a reference table for a gender as-well? Currently I only accept male and female values in my enum field for gender, but maybe in the future we will want to add a transgender or anything else. I will also do an intensive user lookup based on gender of-course. Should I also extract it into a reference table?

kfirba
  • 5,231
  • 14
  • 41
  • 70
  • These are all good ideas, but be wary of overnormalising at the outset. We accept that we will have to tweak our schemas, queries and code a little bit as we go along, but the tweaks you're suggesting aren't obvious performance enhancers - although the additional functionality may prove necessary. – Strawberry Nov 26 '15 at 05:07
  • @strawberry what are the downsides of overnormalizing my database? My select queries will get slower because I will need to join more tables to the query? – kfirba Nov 26 '15 at 05:34
  • Queries get more cumbersome and maintenance gets more complicated. Personally, I hate enums, so would always normalise them out, but that's an aesthetic or emotional choice rather than a logical one. – Strawberry Nov 26 '15 at 05:41
  • @strawberry haha emotional choice. Love it. I'm also more inclined to use reference tables. I've also found a really nice answer about normalizing and denormalizing databases. The answer was - joins and smart. Properly index the fields your are joining on and you will see wonders: http://stackoverflow.com/questions/173726/when-and-why-are-database-joins-expensive – kfirba Nov 26 '15 at 05:50
  • @kfirba What did you end up doing? I've been wondering the same thing for scenarios like `gender` and `status`. Should those get their own table or just be an enum as a part of `profiles`? – atkayla Aug 24 '17 at 22:02

1 Answers1

0

Enums are internally stored as numbers. The data like gender or status in profile table doesn't get modified very often. So I personally would prefer enums. This would avoid the referencing overhead.

However, it has its own disadvantages.

Please refer to http://chateau-logic.com/content/why-we-should-not-use-enums-databases to know why not to use enums. If you are using multiple languages in your application then enums are a definite NO.

punita.gosar
  • 400
  • 3
  • 8
  • Thanks for the reply! We are indeed aiming to address multiple languages in the future. Is that enough to justify a reference table and not an enum? – kfirba Nov 26 '15 at 05:30
  • Also, when I will look for a user who is single and is a male and lives in city X it will have to run string comparison with a full table scan with enums right? If I use the reference tables, the fields will be indexed and it may prevent a full table scan if I'm looking only into indexed fields. (I may be wrong here, correct me if I am) – kfirba Nov 26 '15 at 05:32
  • As I said earlier, if you are using multiple languages, you should go for reference tables, as you might have 1 row for a specific value with different languages as column data. As for field indexing, you can index the enum fields. – punita.gosar Dec 22 '15 at 10:31