3

I have a table which represent an object that has multiple "status" (approved, expired, trending, real). All these status are independent, so i can't have like a single status field because queries can be made so you can ask:

select * from Objects WHERE approved is true, trending is true, expired is false;

I'm not sure if there is anyway to normalize such table. Also, from what i understand, using indexes for those fields would be basically useless. Is that right for this case also?

Thanks!

acadavid
  • 506
  • 6
  • 14
  • Please search for related questions. This has been asked before. – S.Lott Apr 20 '11 at 01:28
  • 1
    If each status is independent why do you think that the table is not normalized? Is there something else you're worried about such as storage size? If you're using SQL Server then your boolean fields get bit-packed into the smallest amount of storage necessary. Not sure about other RDBMSs. – Paul Sasik Apr 20 '11 at 01:34
  • Yea, i think you are right, if they are all independent then they are normalized. I'm just worried about the performance when doing queries with all those fields. @S.Lott I couldn't find a topic similar to this :( – acadavid Apr 20 '11 at 01:40
  • Don't sweat the number of fields in the query, the query optimizer symbolizes it all and the quantity has almost no effect on performance. – Matthew Frederick Apr 20 '11 at 01:52
  • 2
    And please don't try to optimize space by using a bitmap for your flags; that won't speed things up but it will probably make your database hate you and you want to be friends with your database. – mu is too short Apr 20 '11 at 02:02
  • Possible duplicate of [git: squash/fixup earlier commit](https://stackoverflow.com/questions/204461/git-squash-fixup-earlier-commit) – philipxy May 08 '18 at 23:40

1 Answers1

6

The argument against lots of boolean status columns isn't due to space or even normalization per se. If each column relates entirely to the key and does not consist of an amalgam of values, it is normalized enough (3rd normal). Rather, the issue is with the boolean nature of the columns and their lack of expansion. For example, you showed us a column called approved. What about the date it was approved and by whom? What happens if the approval pipeline expands to include more states than just approved/not approved such as initiated, pending, approved and rejected? What about the description for the approval status? Today it might Y/N but tomorrow your customers might want Approved/Rejected or perhaps Qui/Non.

I'm not suggesting you turn your design on its head; boolean columns have their place. However, they should be used after careful consideration about expansion and extensibility.

Thomas
  • 63,911
  • 12
  • 95
  • 141
  • 1
    True/False always becomes True/False/Null. And then we have problems identifying which null -- not applicable, not available, etc., etc. – S.Lott Apr 20 '11 at 10:51