12

I have a database set up with many tables and it all looks good apart from one bit...

Inventory Table <*-----1> Storage Table <1-----1> Van Table
                              ^
                              1
                              |-------1> Warehouse Table

The Storage table is used since the Van and Warehouse table are similar but how do I create a relationship between Storage and Warehouse/Van tables? It would make sense they need to be 1 to 1 as a Storage object can only be 1 Storage place and type. I did have the Van/Warehouse table link to the StorageId primary key and then add a constraint to make sure the Van and Warehouse tables dont have the same StorageId, but this seems like it could be done a better way.

I can see several ways of doing this but they all seem wrong, so any help would be good!

Luckyl337
  • 315
  • 3
  • 12
  • Please evaluate thoroughly if you really need a one to one relationship. Most of the time when you think you need one, you really don't. – Daniel Hilgarth Feb 07 '12 at 10:00
  • Use EER. The key here is to identify the relationship type `is a`, `has`/`belongs to`, or they are the subtypes of the same supertype. Then reduce your entities to tables. – Oybek Feb 07 '12 at 10:01
  • Ahh ok, well then Storage Table is a superclass and Van/Warehouse Tables are subtypes, but now what? lol – Luckyl337 Feb 07 '12 at 10:09
  • If you want to make your subtypes mutually exclusive and if they belong to different tables, I guess, there is no a straightforward way. – Oybek Feb 07 '12 at 10:20

3 Answers3

30

You are using the inheritance (also known in entity-relationship modeling as "subclass" or "category"). In general, there are 3 ways to represent it in the database:

  1. "All classes in one table": Have just one table "covering" the parent and all child classes (i.e. with all parent and child columns), with a CHECK constraint to ensure the right subset of fields is non-NULL (i.e. two different children do not "mix").
  2. "Concrete class per table": Have a different table for each child, but no parent table. This requires parent's relationships (in your case Inventory <- Storage) to be repeated in all children.
  3. "Class per table": Having a parent table and a separate table for each child, which is what you are trying to do. This is cleanest, but can cost some performance (mostly when modifying data, not so much when querying because you can join directly from child and skip the parent).

I usually prefer the 3rd approach, but enforce both the presence and the exclusivity of a child at the application level. Enforcing both at the database level is a bit cumbersome, but can be done if the DBMS supports deferred constraints. For example:

enter image description here

CHECK (
    (
        (VAN_ID IS NOT NULL AND VAN_ID = STORAGE_ID)
        AND WAREHOUSE_ID IS NULL
    )
    OR (
        VAN_ID IS NULL
        AND (WAREHOUSE_ID IS NOT NULL AND WAREHOUSE_ID = STORAGE_ID)
    )
)

This will enforce both the exclusivity (due to the CHECK) and the presence (due to the combination of CHECK and FK1/FK2) of the child.

Unfortunately, MS SQL Server does not support deferred constraints, but you may be able to "hide" the whole operation behind stored procedures and forbid clients from modifying the tables directly.


Just the exclusivity can be enforced without deferred constraints:

enter image description here

The STORAGE_TYPE is a type discriminator, usually an integer to save space (in the example above, 0 and 1 are "known" to your application and interpreted accordingly).

The VAN.STORAGE_TYPE and WAREHOUSE.STORAGE_TYPE can be computed (aka. "calculated") columns to save storage and avoid the need for the CHECKs.

--- EDIT ---

Computed columns would work under SQL Server like this:

CREATE TABLE STORAGE (
    STORAGE_ID int PRIMARY KEY,
    STORAGE_TYPE tinyint NOT NULL,
    UNIQUE (STORAGE_ID, STORAGE_TYPE)
);

CREATE TABLE VAN (
    STORAGE_ID int PRIMARY KEY,
    STORAGE_TYPE AS CAST(0 as tinyint) PERSISTED,
    FOREIGN KEY (STORAGE_ID, STORAGE_TYPE) REFERENCES STORAGE(STORAGE_ID, STORAGE_TYPE)
);

CREATE TABLE WAREHOUSE (
    STORAGE_ID int PRIMARY KEY,
    STORAGE_TYPE AS CAST(1 as tinyint) PERSISTED,
    FOREIGN KEY (STORAGE_ID, STORAGE_TYPE) REFERENCES STORAGE(STORAGE_ID, STORAGE_TYPE)
);

-- We can make a new van.
INSERT INTO STORAGE VALUES (100, 0);
INSERT INTO VAN VALUES (100);

-- But we cannot make it a warehouse too.
INSERT INTO WAREHOUSE VALUES (100);
-- Msg 547, Level 16, State 0, Line 24
-- The INSERT statement conflicted with the FOREIGN KEY constraint "FK__WAREHOUSE__695C9DA1". The conflict occurred in database "master", table "dbo.STORAGE".

Unfortunately, SQL Server requires for a computed column which is used in a foreign key to be PERSISTED. Other databases may not have this limitation (e.g. Oracle's virtual columns), which can save some storage space.

Community
  • 1
  • 1
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • Hi @Branko, thanks for the great answer. I was, however, confused by the last few sentences of your response. Would you mind elaborating on: (1) why, in your last diagram, the VAN and WAREHOUSE tables have a STORAGE_TYPE column (I imagine, all with the same values: 0 for VAN and 1 for WAREHOUSE), (2) what exactly is supposed to go in the content of the computed/calculated columns (I'm confused as to what's being calculated here)? Much obliged :) – youngrrrr Mar 08 '16 at 06:11
  • 2
    @youngrrrr VAN and WAREHOUSE have a STORAGE_TYPE so they can declare a CHECK on it. That way, if a STORAGE row is a parent of a VAN row, it _cannot_ also be a parent of a WAREHOUSE row (because STORAGE.STORAGE_TYPE is 0, failing the CHECK in WAREHOUSE). The opposite is also true: if STORAGE is WAREHOUSE, the STORAGE TYPE is 1, which means the CHECK in VAN would fail if somebody attempted to insert a corresponding row there. The computed column is simply to save space - since it always has the same value within given table, there is no need to physically repeat that value in every row. – Branko Dimitrijevic Mar 08 '16 at 13:04
  • 1
    @youngrrrr Please note that there is a FK on VAN(STORAGE_ID, STORAGE) referencing STORAGE(STORAGE_ID, STORAGE_TYPE). And same on WAREHOUSE. That way, paren row (in STORAGE) and child row (in VAN or WAREHOUSE, but not both) always have the same STORAGE_TYPE. – Branko Dimitrijevic Mar 08 '16 at 13:08
  • Hey @Branko, thanks for taking the time to reply to my questions. One thing I'm still unclear about is this: How does using a computed column save space? It still seems we are physically repeating the STORAGE_TYPE in every row. Forgive me; I don't have a lot of familiarity with computed columns and using Google for an explanation did not help. Thanks again for your responses!!! – youngrrrr Mar 08 '16 at 18:37
  • 1
    @youngrrrr No problem! To answer your question: Computed columns are calculated on-the-fly, instead of being physically stored in the database (unless PERSISTED). Not being stored in the database = saved space. Whether you can actually define a FK on a non-persisted computed column is DBMS-specific. Unfortunately, SQL Server requires persisted computed columns for FKs, so my conjecture about "saving space" is actually wrong under SQL Server (see the edit). OTOH, I believe Oracle's equivalent of computed columns (so-called virtual columns), can be used on FKs, without being persisted. – Branko Dimitrijevic Mar 08 '16 at 19:44
1

As you say, there are many solutions. I would recommend starting with the simplest solution, then optimising later if performance or storage become problems. The simplest solution (but not optimal in terms of storage) would be to have a Storage table that has a column for storage type (indicating whether the row represents a van or a warehouse), plus columns for Van attributes as well as Warehouse attributes. In a row that represents a Van, the columns for the Warehouse attributes will all be null. In a row that represents a Warehouse, the columns for the Van attributes will all be null.

That way, you cut down on the number of tables, and keep your queries nice and simple. Be prepared to revisit your decision if storage becomes tight.

Dawood ibn Kareem
  • 77,785
  • 15
  • 98
  • 110
1

Somehow seems to me that inventory-items may change locations, so I would go with something like this.

enter image description here

Damir Sudarevic
  • 21,891
  • 3
  • 47
  • 71