1

Lets say there are warehouses each storing items of a specific type.

So there are tables with fields

  • Warehouse - ID,Name,Type
  • Item - ID,Name,Type
  • WarehouseItem - Warehouse, Item
  • Type - ID, Name

The question is - given that a Warehouse only holds Items with of specific Type, what database normalization rule is this breaking?

Is this database normalized?

(The problem's example is made up, but I basically have this problem in real life.)

  • 1
    in this case, more than breaking one of the rules for normalisation, i'd say your schema is implementing a different database than your requirements. (if a warehouse will only contain items of the same type, the type goes in relation with the warehouse). – guido Jan 25 '16 at 15:05
  • Not exactly, I think. A warehouse doesn't contain all the items of a given type. You are probably correct that there should be a Type table and a Warehouse.Type field. But the Warehouse still contains only some of those items. Would not mark as resolved yet. I will edit to clarify. – Joonatan Talviste Jan 25 '16 at 15:11
  • I did not say to make to make the warehouse.type field unique, only that if the relation exists, it must be materialized. Multiple warehouses would still be able to have the same item type (but only one type each). – guido Jan 25 '16 at 15:36
  • I will add the relation, but I guess the question still remains. Is it normalized? I will edit again. Please add an answer if you have one. (Why do you comment instead?) – Joonatan Talviste Jan 25 '16 at 15:48

1 Answers1

1

I'm making some assumptions from just looking at your metadata without any data examples, but on first glance it appears that your schema for the most part is normalized. Technically speaking your table is 3NF (which should be your target) if it meets all of the following standards:

  • It is also 1NF - Each entry only contains atomic data (or a single piece of info)
  • It is also 2NF - No candidate key dependency meaning that when you have have a composite primary key (a key made up of more than one column) that all data is dependent on the entire key
  • It is 3NF - No transitive dependency meaning all data is only dependent on the primary key and not some other column in the table

Note that there are also higher normalized forms but they are mostly academic as you begin experiencing performance degradation the more you normalize

Given this definition:

  • Warehouse appears 3NF assuming that each warehouse can only have one Type. If not then you would be failing the transitive dependency and would need to move Type information to a new table.
  • Item too appears 3NF assuming only one Type can be assigned
  • Type appears to contain redundant data and should be removed unless of course you have a many-to-many relationship between Type and Warehouse and/or Item. In that case, you would want to introduce a bridge-entity (aka composite entry) between Type and Warehouse or Item to create two 1-to-many relationships.
  • Lastly, if I'm reading this correctly, WarehouseItem appears to be a bridge-entity between Warehouse and Item to break up the many-to-many relationship between them. If this is correct, you should be able to argue that this table is 3NF assuming the combination of Warehouse and Item represent a composite key.

So assuming I interpreted your schema correctly, once you eliminate the redundant Type table, then yes I would say this setup technically meets 3NF. Note that your requirement that

given that a Warehouse only holds Items with of specific Type

may require you introduce a new type field which will mean you need to reevaluate your normalization of that table. If you have two distinct types (a WarehouseType and an ItemType) then you may need to keep that Type table after all and turn it into a mapping table between those two new fields. But I'd need to see data examples to better evaluate.

  • The problem is that the relations don't prevent a Warehouse from having Items of multiple Types. I thought this was a sign that it's not in some normal form. But you seem to imply that the problem can't be solved with normalization. Marked as correct answer. – Joonatan Talviste Jan 25 '16 at 18:05
  • That should be relatively easily fixed by adding a constraint to your WarehouseItem table to ensure a match must exist. MySQL handles constraints a little differently then other databases but you can find more info here: http://dev.mysql.com/doc/refman/5.7/en/constraints.html –  Jan 25 '16 at 18:28
  • 1
    "Note that there are also higher normalized forms but they are mostly academic as you begin experiencing performance degradation the more you normalize" - This is untrue and dangerous to be telling people. I've often seen that normalized schemas perform *better* than denormalized ones in certain query engines and patterns. https://www.simple-talk.com/blogs/2008/07/21/the-myth-of-over-normalization/ – N West Jan 25 '16 at 19:02
  • @NWest Interesting article... I'd be interested to see data quantifying his proposal. Still, "untrue and dangerous" is a stretch... Mr. Davis is bucking industry and academic precedent. If he is right then more power to him... but, pragmatically, I think you are going to have a hard time finding customers who will justify the development cost for increased normalization on a database. 3NF is still a worthy endeavor and sadly a rarity in the real world. –  Jan 25 '16 at 19:25
  • @NWest Something else worth mentioning... it's not just database performance you have to take into account but also the ability of a developer or BI user to interact with the tables. There is an inverse relationship of ease-of-use with normalization so in saving 20 seconds querying time you might be adding hours to future development proposals. Still your point is well taken... I'll have to further research myself. –  Jan 25 '16 at 19:30
  • Example here. http://stackoverflow.com/questions/4394183/should-olap-databases-be-denormalized-for-read-performance/4731664#4731664 – N West Jan 26 '16 at 00:38