I have two tables, SystemVariables and VariableOptions. SystemVariables should be self-explanatory, and VariableOptions contains all of the possible choices for all of the variables.

VariableOptions has a foreign key, variable_id, which states which variable it is an option for. SystemVariables has a foreign key, choice_id, which states which option is the currently selected one.

I've gotten around the circular relationship using use_alter on choice_id, and post_update on SystemVariables' choice relationship. However, I would like to add an extra database constraint that will ensure that choice_id is valid (i.e. it's referring to an option that is referring back to it).

The logic I need, assuming that sysVar represents a row in the SystemVariables table, is basically:

VariableOptions[sysVar.choice_id].variable_id == sysVar.id

But I don't know how to construct this kind of constraint using SQL, declarative, or any other method. If necessary I could just validate this at the application level, but I'd like to have it at the database level if possible. I'm using Postgres 9.1.

Is this possible?
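
For comparison, the application-level fallback mentioned above could look roughly like this. It is only a sketch: plain dicts stand in for the rows, and the names `choice_is_valid`, `sys_var`, and `variable_options` are hypothetical, not part of SQLAlchemy or any other library.

```python
def choice_is_valid(sys_var, variable_options):
    """Return True if sys_var's chosen option exists and points back to sys_var.

    sys_var is a dict with "id" and "choice_id" keys; variable_options maps
    option ids to dicts with a "variable_id" key (both are stand-ins for rows).
    """
    choice_id = sys_var["choice_id"]
    if choice_id is None:
        # Assumption: a variable with no selected option counts as valid.
        return True
    option = variable_options.get(choice_id)
    # Mirrors: VariableOptions[sysVar.choice_id].variable_id == sysVar.id
    return option is not None and option["variable_id"] == sys_var["id"]
```

A check like this only guards the code paths that call it, which is why a database-level constraint is preferable.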

Cam Jackson
  • I think you should remove the `[python]` tag and add the `[database-design]` one. – ypercubeᵀᴹ Dec 06 '11 at 23:13
  • Once I get this working with SQLAlchemy, I'm going to post the code as an answer, so I'll leave the `[python]` tag for other SQLAlchemy users. I'll get rid of `[declarative]` instead. – Cam Jackson Dec 06 '11 at 23:24
  • For future readers: See [Erwin's answer](http://stackoverflow.com/a/8395021/665488) for the SQL solution, and [my answer](http://stackoverflow.com/a/8408659/665488) for the same thing accomplished with SQLAlchemy. – Cam Jackson Dec 07 '11 at 00:01

3 Answers

You can implement that without dirty tricks. Just extend the foreign key referencing the chosen option to include variable_id in addition to choice_id.

Here is a working demo that you can easily play with:

CREATE TABLE systemvariables (
  variable_id int PRIMARY KEY
, choice_id   int
, variable    text
);
   
INSERT INTO systemvariables(variable_id, variable) VALUES
  (1, 'var1')
, (2, 'var2')
, (3, 'var3')
;

CREATE TABLE variableoptions (
  option_id   int PRIMARY KEY
, variable_id int REFERENCES systemvariables ON UPDATE CASCADE ON DELETE CASCADE
, option      text
, UNIQUE (option_id, variable_id)  -- needed for the FK
);

ALTER TABLE systemvariables
  ADD CONSTRAINT systemvariables_choice_id_fk
  FOREIGN KEY (choice_id, variable_id) REFERENCES variableoptions(option_id, variable_id);

INSERT INTO variableoptions (option_id, option, variable_id) VALUES
  (1, 'var1_op1', 1)
, (2, 'var1_op2', 1)
, (3, 'var1_op3', 1)
, (4, 'var2_op1', 2)
, (5, 'var2_op2', 2)
, (6, 'var3_op1', 3)
;

Choosing an associated option is allowed:

UPDATE systemvariables SET choice_id = 2 WHERE variable_id = 1;
UPDATE systemvariables SET choice_id = 5 WHERE variable_id = 2;
UPDATE systemvariables SET choice_id = 6 WHERE variable_id = 3;

But there is no getting out of line:

UPDATE systemvariables SET choice_id = 7 WHERE variable_id = 3;  -- fails: there is no option 7
UPDATE systemvariables SET choice_id = 4 WHERE variable_id = 1;  -- fails: option 4 belongs to variable 2
ERROR:  insert or update on table "systemvariables" violates foreign key constraint "systemvariables_choice_id_fk"
DETAIL: Key (choice_id,variable_id)=(4,1) is not present in table "variableoptions".

Exactly what you wanted.

All key columns NOT NULL

I think I found a better solution in this later answer:

Addressing @ypercube's question in the comments: to avoid entries with an unknown association, make all key columns NOT NULL, including foreign keys.

The circular dependency would normally make that impossible. It's the classic chicken-and-egg problem: one of the two has to exist first to spawn the other. But nature found a way around it, and so did Postgres: deferrable foreign key constraints.

CREATE TABLE systemvariables (
  variable_id int PRIMARY KEY
, variable    text
, choice_id   int NOT NULL
);

CREATE TABLE variableoptions (
  option_id   int PRIMARY KEY
, option      text
, variable_id int NOT NULL REFERENCES systemvariables
     ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
, UNIQUE (option_id, variable_id) -- needed for the foreign key
);

ALTER TABLE systemvariables
ADD CONSTRAINT systemvariables_choice_id_fk FOREIGN KEY (choice_id, variable_id)
   REFERENCES variableoptions(option_id, variable_id) DEFERRABLE INITIALLY DEFERRED; -- no CASCADING here!

New variables and associated options have to be inserted in the same transaction:

BEGIN;

INSERT INTO systemvariables (variable_id, variable, choice_id)
VALUES
  (1, 'var1', 2)
, (2, 'var2', 5)
, (3, 'var3', 6);

INSERT INTO variableoptions (option_id, option, variable_id)
VALUES
  (1, 'var1_op1', 1)
, (2, 'var1_op2', 1)
, (3, 'var1_op3', 1)
, (4, 'var2_op1', 2)
, (5, 'var2_op2', 2)
, (6, 'var3_op1', 3);

END;

A NOT NULL constraint cannot be deferred; it is always enforced immediately. The foreign key constraint can be deferred because we defined it that way. It is checked at the end of the transaction, which avoids the chicken-and-egg problem.
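
To experiment with the deferred check without a Postgres server, here is a rough sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the schema above verbatim: SQLite also supports DEFERRABLE INITIALLY DEFERRED, but it has no ALTER TABLE ... ADD CONSTRAINT, so both foreign keys are declared inline, the CASCADE clauses are left out, and the text column is called `label`. The helper names `open_db` and `insert_pair` are hypothetical.

```python
import sqlite3

# Both foreign keys are deferred, so they are only checked at COMMIT.
SCHEMA = """
CREATE TABLE systemvariables (
  variable_id INTEGER PRIMARY KEY,
  variable    TEXT,
  choice_id   INTEGER NOT NULL,
  FOREIGN KEY (choice_id, variable_id)
    REFERENCES variableoptions (option_id, variable_id)
    DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE variableoptions (
  option_id   INTEGER PRIMARY KEY,
  label       TEXT,
  variable_id INTEGER NOT NULL
    REFERENCES systemvariables (variable_id)
    DEFERRABLE INITIALLY DEFERRED,
  UNIQUE (option_id, variable_id)  -- needed for the composite foreign key
);
"""


def open_db():
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def insert_pair(conn, variable_id, choice_id, option_id, option_variable_id):
    """Insert one variable, then one option, in a single transaction.

    Returns True if the transaction commits, False if a constraint rejects it.
    """
    conn.execute("BEGIN")
    try:
        # The variable goes in first, although its chosen option may not exist yet.
        conn.execute(
            "INSERT INTO systemvariables VALUES (?, ?, ?)",
            (variable_id, "var", choice_id),
        )
        conn.execute(
            "INSERT INTO variableoptions VALUES (?, ?, ?)",
            (option_id, "opt", option_variable_id),
        )
        conn.execute("COMMIT")  # deferred foreign keys are checked here
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False
```

With this sketch, `insert_pair(conn, 1, 10, 10, 1)` should commit, because option 10 belongs to variable 1 by the time the transaction ends. A variable that picks an option belonging to a different variable should be rejected at commit.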

In this edited scenario, both foreign keys are deferred, so you can enter variables and options in any order.
You can even make it work with plain non-deferrable FK constraints if you enter the related rows in both tables in one statement using CTEs, as detailed in the linked answer.

You may have noticed that the foreign key on systemvariables (systemvariables_choice_id_fk) has no CASCADE modifier. (It wouldn't make sense to allow changes to variableoptions.variable_id to cascade back.)

On the other hand, the foreign key on variableoptions.variable_id has a CASCADE modifier and is defined DEFERRABLE nonetheless. This carries some limitations. The manual:

Referential actions other than the NO ACTION check cannot be deferred, even if the constraint is declared deferrable.

NO ACTION is the default.

So, referential integrity checks on INSERT are deferred, but the declared cascading actions on DELETE and UPDATE are not. The following is not permitted in PostgreSQL 9.0 or later because constraints are enforced after each statement:

UPDATE option SET var_id = 4 WHERE var_id = 5;
DELETE FROM var WHERE var_id = 5;


Erwin Brandstetter
  • Oh cool, that will work great! Now I just have to go look up compound foreign keys in SQLAlchemy and I'm set :) Thanks! – Cam Jackson Dec 06 '11 at 05:15
  • @Erwin: Can all ids be defined as `NOT NULL` in this scenario? – ypercubeᵀᴹ Dec 06 '11 at 22:28
  • @ypercube: Excellent question. You can easily define `variableoptions.variable_id` as `NOT NULL`. That forces you to always enter variables *before* you can enter associated options. You *can* also define `systemvariables.choice_id` as `NOT NULL`, but this requires additional measures. See my amended answer for that. – Erwin Brandstetter Dec 06 '11 at 22:57
  • Nice! Deferrable constraints are a (twisted) but valid solution, indeed. – ypercubeᵀᴹ Dec 06 '11 at 23:21
  • Lots of useful information here :) I've successfully implemented your SQL using the declarative method of SQLAlchemy's ORM. I'll add the code as (yet another) answer to this question in a moment. – Cam Jackson Dec 06 '11 at 23:49
  • What an answer! I had to cross the picket line for this! Do you think you could answer this? http://dba.stackexchange.com/questions/58949/circular-foreign-key-deferrable-cascade-behavior –  Feb 14 '14 at 17:55

EDIT: The 0.7.4 release of SQLAlchemy (released the same day I started asking about this issue, 7/12/'11!), contains a new autoincrement value for primary keys that are also part of foreign keys, ignore_fk. The documentation has also been expanded to include a good example of what I was originally trying to accomplish.

All is now explained well here.

If you want to see the code I came up with before the above release, check the revision history of this answer.

Cam Jackson

I really do not like circular references. There is usually a way to avoid them. Here is an approach:

SystemVariables 
---------------
  variable_id 
  PRIMARY KEY (variable_id)


VariableOptions 
---------------
  option_id 
  variable_id 
  PRIMARY KEY (option_id)
  UNIQUE KEY (variable_id, option_id) 
  FOREIGN KEY (variable_id) 
    REFERENCES SystemVariables(variable_id)


CurrentOptions
--------------
  variable_id 
  option_id 
  PRIMARY KEY (variable_id)
  FOREIGN KEY (variable_id, option_id)
    REFERENCES VariableOptions(variable_id, option_id)
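
A rough way to try this three-table layout is Python's built-in sqlite3 module. This is only a sketch under assumptions: the column types and NOT NULL markers are filled in for illustration, SQLite stands in for Postgres, and the helper names `build_demo` and `set_choice` are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE SystemVariables (
  variable_id INTEGER PRIMARY KEY
);
CREATE TABLE VariableOptions (
  option_id   INTEGER PRIMARY KEY,
  variable_id INTEGER NOT NULL REFERENCES SystemVariables (variable_id),
  UNIQUE (variable_id, option_id)
);
CREATE TABLE CurrentOptions (
  variable_id INTEGER PRIMARY KEY,   -- at most one current option per variable
  option_id   INTEGER NOT NULL,
  FOREIGN KEY (variable_id, option_id)
    REFERENCES VariableOptions (variable_id, option_id)
);
"""


def build_demo():
    # isolation_level=None: every statement is committed on its own.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO SystemVariables VALUES (?)", [(1,), (2,)])
    # (option_id, variable_id): options 1 and 2 belong to variable 1, option 3 to variable 2.
    conn.executemany("INSERT INTO VariableOptions VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
    return conn


def set_choice(conn, variable_id, option_id):
    """Select option_id for variable_id; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT OR REPLACE INTO CurrentOptions VALUES (?, ?)",
            (variable_id, option_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

No deferred constraints are needed here because nothing references back to CurrentOptions, so rows can be inserted in plain dependency order.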
ypercubeᵀᴹ
  • The circular reference doesn't really bother me, and I'd rather have the relationship contained nicely in just two tables. Either way, the compound foreign key is the important part, so yours and Erwin's answers are both valid :) – Cam Jackson Dec 06 '11 at 22:51
  • If you are happy with having some of your columns - like the `systemvariables.choice_id` - defined as `NULL`, then the two approaches are almost the same. – ypercubeᵀᴹ Dec 06 '11 at 23:09
  • What if it's easier to delete off of a child, and the information is only written through a transaction, thus preventing writes where some data is missing or null? http://dba.stackexchange.com/questions/58949/circular-foreign-key-deferrable-cascade-behavior –  Feb 14 '14 at 22:36