How can I enforce second-degree relationships without composite keys?

Question

Consider this database design for a multi-tenancy line-of-business web application:

A Tenant is a tenant of the web-application, a Tenant has many Shops and many Customers (Customer records are not shared between Tenants, so it's valid for multiple Customer records to refer to the same real-life human), and each Shop has many Jobs. A Job is also associated with each Customer.

There exists a problem in that there doesn't seem to be a trivial constraint solution to prevent the case where a Job's CustomerId is changed to a Customer that does not belong to the parent Tenant, thus creating invalid data.

Here is the present schema:

CREATE TABLE Tenants (
    TenantId bigint IDENTITY(1,1) PRIMARY KEY
    ...
)

CREATE TABLE Shops (
    TenantId bigint FOREIGN KEY( Tenants.TenantId ),
    ShopId   bigint IDENTITY(1,1) PRIMAREY KEY,
    ...
)

CREATE TABLE Customers (
    TenantId   bigint FOREIGN KEY( Tenants.TenantId ),
    CustomerId bigint IDENTITY(1,1) PRIMARY KEY
    ...
)

CREATE TABLE Jobs (
    ShopId bigint FOREIGN KEY( Shops.ShopId )
    JobId bigint IDENTITY(1,1) PRIMARY KEY,
    CustomerId bigint FOREIGN KEY( Customers.CustomerId )
)

Currently the only solution I can think of is to change the design to use Composite Keys that always include the parent Tenant.TenantId, which are then shared accordingly:

CREATE TABLE Shops (
    TenantId bigint,
    ShopId   bigint IDENTITY(1,1),
    ...

    PRIMARY KEY( TenantId, ShopId )
    FOREIGN KEY( TenantId REFERENCES Tenants (TenantId) )
)

CREATE TABLE Customers (
    TenantId   bigint,
    CustomerId bigint IDENTITY(1,1)
    ...

    PRIMARY KEY( TenantId, CustomerId )
    FOREIGN KEY( TenantId REFERENCES Tenants (TenantId) )
)

CREATE TABLE Jobs (
    TenantId bigint
    ShopId bigint
    JobId bigint IDENTITY(1,1),
    CustomerId bigint

    PRIMARY KEY( TenantId, ShopId, JobId )

    FOREIGN KEY( TenantId REFERENCES Tenants ( TenantId ) )
    FOREIGN KEY( TenantId, ShopId REFERENCES Shops( TenantId, ShopID ) )
    FOREIGN KEY( TenantId, CustomerId REFERENCES Customers( TenantId, CustomerId ) )
)

...seems like a bit of a hack though, with lots of redundant data - especially as IDENTITY is used anyway. Is there any way the RDBMS can test JOINs for consistency whenever data is mutated?

@philipxy MSSQL Server in most cases, sometimes PostgreSQL and MySQL. — Dai, Jul 13 '16 at 02:31

reaanb · Answer 1 · 2016-06-10T22:39:27.790

Composite foreign key constraints are perfectly valid and useful, but you don't need composite primary keys to use them! You just need composite indices in the referenced tables. The redundant TenantId in Jobs won't create a risk of update anomalies thanks to the FK constraints.

For example:

CREATE TABLE Shops (
    ShopId   bigint IDENTITY(1,1),
    TenantId bigint,
    PRIMARY KEY (ShopId),
    UNIQUE KEY (TenantId, ShopId),
    FOREIGN KEY (TenantId) REFERENCES Tenants (TenantId)
)

CREATE TABLE Customers (
    CustomerId bigint IDENTITY(1,1),
    TenantId   bigint,
    PRIMARY KEY (CustomerId),
    UNIQUE KEY (TenantId, CustomerId),
    FOREIGN KEY (TenantId) REFERENCES Tenants (TenantId)
)

CREATE TABLE Jobs (
    JobId      bigint IDENTITY(1,1),
    TenantId   bigint,
    ShopId     bigint,
    CustomerId bigint,
    PRIMARY KEY (JobId),
    FOREIGN KEY (TenantId, ShopId) REFERENCES Shops (TenantId, ShopID),
    FOREIGN KEY (TenantId, CustomerId) REFERENCES Customers (TenantId, CustomerId)
)

If you're concerned about storage space, I suggest you calculate the actual cost of that space based on realistic volumes of data and benchmark the performance differences between FK constraints vs triggers vs check constraints involving a subquery. Don't just assume an extra attribute will be inefficient.

Can you explain "Composite Indices"? How can multiple columns be included in an index if it isn't present in the parent table? — Dai, Jun 10 '16 at 20:11
I meant to separate the primary key from the foreign key reference. I hope the example clarifies. — reaanb, Jun 10 '16 at 22:18
Also see http://stackoverflow.com/questions/2292662/how-important-is-the-order-of-columns-in-indexes in regards to composite index optimization — reaanb, Jun 10 '16 at 22:46

Zohar Peled · Answer 2 · 2016-06-10T12:39:44.890

Assuming your rdbms supports check constraints then you can use a check constraint on your jobs table, to check that the customer id refers to the same tannent id as the shop id.
This way you are left with a single column primary key on each table.
Based on the create table syntax I'm guessing you are using sql-server so your check constraint would be something like this:

ALTER TABLE Jobs
    ADD CONSTRAINT chk_jobs_customer_shop   
    CHECK dbo.fnCheckCustomerAndShopRelationship(customerId, shopId) = 1

and of course, you will need to create the UDF first:

CREATE FUNCTION dbo.fnCheckCustomerAndShopRelationship 
(
    @customerId int,
    @shopId int
)
RETURNS int   
AS
BEGIN

    IF EXISTS
    (
        SELECT 1
        FROM Customers c
        INNER JOIN Shops s ON c.TenantId = s.TenantId
    )
        RETURN 1
    ELSE
        RETURN 0
END;
GO

score 0 · Answer 3 · answered Jun 10 '16 at 10:15

Your 2nd design is the straightforward declarative design for typical SQL DBMSs.

Although Standard SQL (and the relational model) allows arbitrary declarative constraints (CHECK and CREATE ASSERTION), typical SQL DBMSs unfortunately only allow declaration of superkeys (PRIMARY KEY & UNIQUE NOT NULL), foreign superkeys (FOREIGN KEY) and limited CHECKing.

The typical SQL solution for arbitrary constraint enforcement is to define triggers that evaluate expressions as necessary on INSERT, UPDATE and DELETE and functions in CHECKs. Unfortunately DBMSs don't typically evaluate constraint enforcement code in a properly atomic/serialized manner.

Your characterization of your constraint as "testing JOINs for consistency" reflects a good understanding in that in general for integrity and clarity we would like to assert arbitrary constraints on arbitrary expressions.

Any reasonable implementation of arbitrary constraints has to take advantage of "testing whenever data is mutated" to avoid reevaluating entire expressions when a perhaps a much less costly test can be done re just what has changed. This code is just what you have to unfortunately write by hand with triggers. It's just not a priority of most vendors. See Applied Mathematics for Database Professionals by Lex deHaan & Toon Koppelaars) for a good presentation of these issues and solutions.

How can I enforce second-degree relationships without composite keys?

3 Answers3

Linked