How to create an alternate key from multiple nullable columns

Question

I have a database with three tables, call them Sale, Store and Hub. Each instance of Store references an instance of Hub, although there may be instances of Hub not referenced in the Store table. In most cases, an instance of Sale will reference an instance of Store (and therefore can be mapped to an instance of Hub.) However in some cases a Sale will reference the Hub table directly.

This relationship models the following: stores are stocked by hubs (stores and hubs are both locations) and while most sales occur at stores, they can also occur at the hubs.

In a simplified version where the Sale-Hub relationship doesn't exist, I can easily make StoreID part of the primary key for the Sale table, which is what I'd like to do here, except that StoreID will occasionally be null (when the sale occurs at a hub and not at a store.)

Therefore a sample from the Sale table might look like this:

SaleID    SaleDate    StoreID    HubID    Quantity
     1  2012-05-16          1     NULL         100
     2  2014-09-27          2     NULL          99
     3  2018-02-01       NULL        9          30
     4  2019-11-12         17     NULL         207

The Store and Hub tables will include location info along with some store and hub specific fields.

The issue is how to handle creating a good alternate key for the Sale table (SaleID is my surrogate primary key.) In theory a sale should be determined by date and location. However I have two location fields and they are both nullable. I'm wondering if I can handle this using unique indexes? Or can I use check constraints somehow? Or an intermediate location table?

Also, as a follow up, suppose I also had a SuperHub table (imagine Hub is to SuperHub what Store is to Hub.) If I wanted to add a SuperHubID field to my Sale table, how would this affect things? With 3+ nullable fields don't check constraints/unique indexes start getting out of hand?

Here is potentially what an ERD for this could look like:

Edit: To clarify, I'm trying to figure out the best way to include location info as part of the alternate key for the Sale table, given that "location info" is split among several different tables (the primary key for the Sale table is just a surrogate key SaleID.) Trying to include all the location tables in the alternate key leads to Nulls appearing in the key (see sample from Sale table above.) What is the best way to deal with this issue?

Edit 2: Ok, here is some more clarification. I'm including table definitions, etc.

--Table Defs
CREATE TABLE Sale (
SaleID int NOT NULL IDENTITY(1, 1),
SaleDate date NOT NULL,
StoreID int NULL,
HubID int NULL,
Quantity int NOT NULL);
-- StoreID and HubID are nullable since a sale can occur at one or the other
-- Is there a better way to organize this?
-- @PeterHe mentioned Category/Subcategory model

CREATE TABLE Store (
StoreID int NOT NULL IDENTITY(1, 1),
StoreNumber int NOT NULL,
Customer nvarchar(20) NOT NULL,
Address nvarchar(50) NOT NULL,
HubID int NOT NULL);

CREATE TABLE Hub (
HubID int NOT NULL IDENTITY(1, 1),
HubNumber int NOT NULL,
Customer nvarchar(20) NOT NULL,
Address nvarchar(50) NOT NULL);

--PKs
ALTER TABLE Sale 
    ADD CONSTRAINT PK_Sale PRIMARY KEY CLUSTERED (SaleID);

ALTER TABLE Store 
    ADD CONSTRAINT PK_Store PRIMARY KEY CLUSTERED (StoreID);

ALTER TABLE Hub  
    ADD CONSTRAINT PK_Hub PRIMARY KEY CLUSTERED (HubID);

--FKs
ALTER TABLE Sale 
    ADD CONSTRAINT FK_Sale_Store 
        FOREIGN KEY (StoreID) REFERENCES Store (StoreID)
            ON UPDATE NO ACTION
            ON DELETE NO ACTION;

ALTER TABLE Sale 
    ADD CONSTRAINT FK_Sale_Hub 
        FOREIGN KEY (HubID) REFERENCES Hub (HubID)
            ON UPDATE NO ACTION
            ON DELETE NO ACTION;

ALTER TABLE Store 
    ADD CONSTRAINT FK_Store_Hub 
        FOREIGN KEY (HubID) REFERENCES Hub (HubID)
            ON UPDATE NO ACTION
            ON DELETE NO ACTION;

--AKs
ALTER TABLE Store
    ADD CONSTRAINT AK_Store UNIQUE (StoreNumber, Customer);

ALTER TABLE Hub
    ADD CONSTRAINT AK_Hub UNIQUE (HubNumber, Customer);

ALTER TABLE Sale
    ADD CONSTRAINT AK_Sale UNIQUE (SaleDate, StoreID, HubID); 
    --issue here is StoreID and HubID are nullable since sale could occur at either

Not really sure what the question is here. It seems like you want to figure out what the primary key should be for Sale. But then you also talk about check constraints. Why not make StoreID, HubID and SuperHubID as foreign keys? Then if only one of them is allowed use a check constraint for that. — Sean Lange, Oct 28 '19 at 16:53
@Sean Sorry, yes they are foreign keys. But I need one of them to be present so I can distinguish sales by location. Ideally there would be only one location table and then there could be a not null LocationID field in the Sale table. Unfortunately there are multiple types of locations and I'd like to separate them into different tables (especially since a store should reference a hub.) — ahh_real_numbers, Oct 28 '19 at 17:21
This is the same unclear question as [your recent other](https://stackoverflow.com/q/58564857/3404097) only more complicated. Please don't repost, edit (when not invalidating resonable answers). Please act here on all its comments. Give exact table meanings & all constraints. Please use DDL not English or an ERD for what DDL can say. Clarify via edits. Etc. Again: *Forget FKs* & give clear table meanings & preferably simple ones & hence preferably null-free ones that you can then either combine to a PK- & FK-friendly design or this one, which isn't. That's how a good answer must start. — philipxy, Oct 28 '19 at 23:27
Beware: In standard SQL 2 rows are UNIQUE/non-duplicates when/iff for some column they're not SQL-=. But in SQL Server "only one null value is allowed per column" for UNIQUE. SQL PK means UNIQUE NOT NULL, whatever UNIQUE means. But "alternate key" has no specific meaning. Presumably you know a SQL FK constraint is satisfied for a referencing subrow with NULL. But also a referenced subrow with NULL cannot be matched to a FK. So when NULLs appear please instead of UNIQUE/AK/FK say what exact constraints/conditions hold. PS Please don't insert EDITs/UPDATEs, rewrite to a best presentation. — philipxy, Oct 29 '19 at 00:28
Does this answer your question? [How can you represent inheritance in a database?](https://stackoverflow.com/questions/3579079/how-can-you-represent-inheritance-in-a-database) — philipxy, Oct 30 '19 at 00:34
This [link](http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server) that @Andrew provided helped, thanks. — ahh_real_numbers, Oct 30 '19 at 19:53

score 1 · Answer 1 · answered Oct 28 '19 at 17:44

1

It better to define an entity to includes all the organizations that can make a sale, e.g. sale_org ( sale_org_id int not null, sale_org_type tinyint nor null, -- 1, store; 2, hub, 3, super hub address ... ) You can still have Store and Hub etc. tables (without address) to define the relations. Or just one table for the hierarchy sale_org_hierarchy to define the relation: sale_org_id and parent_sale_org_id

Sale table only reference store_hierarchy table.

answered Oct 28 '19 at 17:44

PeterHe

2,766
1
8
7

If you can not change the data model, you can create a unique index on (store_id, hub_id,date) using filtered index: WHERE store_id is not null and hub_id is not null – PeterHe Oct 28 '19 at 17:57
@ PeterHe The above makes sense, but if I have a SaleOrg table that contains address info am I not storing the same info twice in separate places? Since these tables don't talk to each other isn't there a risk that they won't be the same in these tables? And as far as creating an index, what about when StoreID is null and HubID is not null or vice versa? Would I need separate indexes for these cases? – ahh_real_numbers Oct 28 '19 at 18:29
This is the category/subCategory model. You put all the common columns of all subcategory on the category table, and remove them from the subcategory tables. In your case the SaleOrg is the category table, and from your diagram, It seems to me each subcategory (store/hub/superhub) haas an address. – PeterHe Oct 28 '19 at 18:34
so just add address to the category table only – PeterHe Oct 28 '19 at 18:35
Ok, but just to be clear, is there any relation between the SaleOrg table and the Store, Hub, SuperHub tables? – ahh_real_numbers Oct 28 '19 at 18:38
1

Yes, the subcategories are subset of the parent table with extra dedicated columns, but the ID is from the parent table. E.g. the store table will have stiore_id and hub_id columns, and store_id references the SaleOrg_Id in the SaleOrg table – PeterHe Oct 28 '19 at 20:04
I see. Ok that makes sense. So essentially I'm grouping what is common among the three location tables into one SaleOrg table. The fields that are specific to a location type remain in that location-specific table, along with a reference to the correct row in the SaleOrg table. Then in this simplified example, would the SaleOrg table have a primary key (LocationID, LocationTypeID)? I'm thinking LocationTypeID indicates which subcategory table is referenced and LocationID indicates the correct row. Should subcategory tables have a LocationTypeID field? – ahh_real_numbers Oct 28 '19 at 20:23
SaleOrg table have a primary key on LocationID only. Subcategory tables do not need LocationTypeID because it is specific. – PeterHe Oct 29 '19 at 13:09
Ok last thing, how do I know if a SaleOrg is a store or a hub? I guess to get all store information I would do something like `SELECT SaleOrg.*, Store.* FROM SaleOrg JOIN Store ON SaleOrg.LocationID = Store.LocationID JOIN Hub ON SaleOrg.LocationID = Hub.LocationID WHERE Store.LocationID IS NOT NULL`? – ahh_real_numbers Oct 29 '19 at 15:11
LocationTypeID needs to be in the SaleIOrg table – PeterHe Oct 29 '19 at 15:55

How to create an alternate key from multiple nullable columns

1 Answers1