10

Let's say I have a table that represents a super class, students. And then I have N tables that represent subclasses of that object (athletes, musicians, etc). How can I express a constraint such that a student must be modeled in one (not more, not less) subclass?

Clarifications regarding comments:

  • This is being maintained manually, not through an ORM package.
  • The project this relates to sits atop SQL Server (but it would be nice to see a generic solution)
  • This may not have been the best example. There are a couple scenarios we can consider regarding subclassing, and I just happened to invent this student/athlete example.

A) In true object-oriented fashion, it's possible that the superclass can exist by itself and need not be modeled in any subclasses.

B) In real life, any object or student can have multiple roles.

C) The particular scenario I was trying to illustrate was requiring that every object be implemented in exactly one subclass. Think of the superclass as an abstract implementation, or just commonalities factored out of otherwise disparate object classes/instances.

Thanks to all for your input, especially Bill.

ConcernedOfTunbridgeWells
  • 64,444
  • 15
  • 143
  • 197
Mark Canlas
  • 9,385
  • 5
  • 41
  • 63
  • Just for a clarification: Are you manually managing this or using an ORM solution like hibernate? – Uri Dec 01 '08 at 23:38
  • What database are you using? If you are using PostgreSQL, it has table inheritance available. – Elijah Dec 01 '08 at 23:50
  • I'm curious, Why can't astudent be both an athlete and a nusician? – Charles Bretana Dec 02 '08 at 00:18
  • @Elijah - Really? Must check that out - thanks! @Charles - Good point - this does look like a bad example – Draemon Dec 02 '08 at 04:06
  • Beware of DB-specific features (like table inheritance from Postgres). It's WAY better to stay with basic relational features found in every RDBMS. Portability and code readability are a plus. – F.D.Castel Dec 24 '08 at 02:56

7 Answers7

4

Each Student record will have a SubClass column (assume for the sake of argument it's a CHAR(1)). {A = Athlete, M=musician...}

Now create your Athlete and Musician tables. They should also have a SubClass column, but there should be a check constraint hard-coding the value for the type of table they represent. For example, you should put a default of 'A' and a CHECK constraint of 'A' for the SubClass column on the Athlete table.

Link your Musician and Athlete tables to the Student table using a COMPOSITE foreign key of StudentID AND Subclass. And you're done! Go enjoy a nice cup of coffee.

CREATE TABLE Student (
    StudentID INT NOT NULL IDENTITY PRIMARY KEY,
    SubClass CHAR(1) NOT NULL,
    Name VARCHAR(200) NOT NULL,
    CONSTRAINT UQ_Student UNIQUE (StudentID, SubClass)
);

CREATE TABLE Athlete (
    StudentID INT NOT NULL PRIMARY KEY,
    SubClass CHAR(1) NOT NULL,
    Sport VARCHAR(200) NOT NULL,
    CONSTRAINT CHK_Jock CHECK (SubClass = 'A'),
    CONSTRAINT FK_Student_Athlete FOREIGN KEY (StudentID, Subclass) REFERENCES Student(StudentID, Subclass)
);

CREATE TABLE Musician (
    StudentID INT NOT NULL PRIMARY KEY,
    SubClass CHAR(1) NOT NULL,
    Instrument VARCHAR(200) NOT NULL,
    CONSTRAINT CHK_Band_Nerd CHECK (SubClass = 'M'),
    CONSTRAINT FK_Student_Musician FOREIGN KEY (StudentID, Subclass) REFERENCES Student(StudentID, Subclass)
);
Dave Markle
  • 95,573
  • 20
  • 147
  • 170
2

Here are a couple of possibilities. One is a CHECK in each table that the student_id does not appear in any of the other sister subtype tables. This is probably expensive and every time you need a new subtype, you need to modify the constraint in all the existing tables.

CREATE TABLE athletes (
  student_id INT NOT NULL PRIMARY KEY,
  FOREIGN KEY (student_id) REFERENCES students(student_id),
  CHECK (student_id NOT IN (SELECT student_id FROM musicians 
                      UNION SELECT student_id FROM slackers 
                      UNION ...)) 
);

edit: @JackPDouglas correctly points out that the above form of CHECK constraint is not supported by Microsoft SQL Server. Nor, in fact, is it valid per the SQL-99 standard to reference another table (see http://kb.askmonty.org/v/constraint_type-check-constraint).

SQL-99 defines a metadata object for multi-table constraints. This is called an ASSERTION, however I don't know any RDBMS that implements assertions.

Probably a better way is to make the primary key in the students table a compound primary key, the second column denotes a subtype. Then restrict that column in each child table to a single value corresponding to the subtype represented by the table. edit: no need to make the PK a compound key in child tables.

CREATE TABLE athletes (
  student_id INT NOT NULL PRIMARY KEY,
  student_type CHAR(4) NOT NULL CHECK (student_type = 'ATHL'),
  FOREIGN KEY (student_id, student_type) REFERENCES students(student_id, student_type)
);

Of course student_type could just as easily be an integer, I'm just showing it as a char for illustration purposes.

If you don't have support for CHECK constraints (e.g. MySQL), then you can do something similar in a trigger.

I read your followup about making sure a row exists in some subclass table for every row in the superclass table. I don't think there's a practical way to do this with SQL metadata and constraints. The only option I can suggest to meet this requirement is to use Single-Table Inheritance. Otherwise you need to rely on application code to enforce it.

edit: JackPDouglas also suggests using a design based on Class Table Inheritance. See his example or my examples of the similar technique here or here or here.

Community
  • 1
  • 1
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • 3
    -1 for misleading information about check constraints. From the [SQL Server 2008R2 docs](http://msdn.microsoft.com/en-us/library/ms174979(v=SQL.105).aspx): "The search condition must evaluate to a Boolean expression and cannot reference another table." –  Apr 04 '11 at 05:49
  • I get no credit for recommending something different subsequently? Well, anyway thanks for explaining your downvote, most people don't do that. – Bill Karwin Apr 04 '11 at 07:00
  • If your initial solution *had* worked, it would have been an attractive option. I think Jack was initially attracted to your first solution, only to be disappointed to have wasted his time since it appears to be unsupported. +1 for editing your answer :-) – Simen S Apr 04 '11 at 07:29
  • IMO misleading answers outweigh good ones. Nothing personal, I just want SO users to get good information - I've removed the downvote because you've edited your answer. I'd even upvote except I don't think your last paragraph is right ([example](http://stackoverflow.com/questions/5522320/what-is-the-best-way-to-enforce-a-subset-relationship-with-integrity-constraint/5522344#5522344)) –  Apr 04 '11 at 07:54
  • Thanks for removing the downvote. I read your answer and upvoted it. I've also included several links to answers I've given about the same solution. – Bill Karwin Apr 04 '11 at 14:48
1

If you are interested in data modeling, in addition to object modeling, I suggest you look up "relational modeling generalization specialization" on the web.

There used to be some good resources out there that explains this kind of pattern quite well.

I hope those resources are still there.

Here's a simplified view of what I hope you'll find.

Before you begin designing a database, it's useful to come up with a conceptual data model that connects the values stored in the database back to the subject matter. Making a conceptual data model is really data analysis, not database design. Sometimes it's difficult to keep analysis and design separate.

One way of modeling data at the conceptual level is the Entity-Relationship (ER) model. There are well known patterns for modeling the specialization-generalization situation. Converting those ER patterns to SQL tables (called logical design) is pretty straightforward, although you do have to make some design choices.

The case you gave of a student having possibly several roles like musician probably doesn't illustrate the case you are interested in, if I read you right. You seem to be interested in the case where the subclasses are mutually exclusive. Perhaps the case where a vehicle might be an auto, a truck, or a motorcycle might be easier to discuss.

One difference you are likely to encounter is that the general table for the superclass doesn't really need the type code column. The type of a single superclass instance can be derived by the presence or absence of foreign keys in the various subclass tables. Whether it's smarter to include or omit the type code depends on how you intend to use the data.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Walter Mitty
  • 18,205
  • 2
  • 28
  • 58
0

I would add a Check Constraint possibly.
Create the ForeignKeys as Nullable. Add a Check to make sure they aren't both null and to make sure they aren't both set. CONSTRAINT [CK_HasOneForiegnKey] CHECK ((FK_First!= NULL OR FK_Second != NULL) AND NOT (FK_First != NULL AND FK_Second != NULL)).

I am not sure but I believe this would allow you to set only one key at a time.

Michael Puckett II
  • 6,586
  • 5
  • 26
  • 46
0

interesting problem. Of course the FK constraints are there for the subtables so there has to be a student for those.

The main problem is trying to check as it is inserted. The student has to be inserted first so that you don't violate a FK constraint in a subtable so a trigger that does a check wouldn't work.

You could write an app that checks now and then if you are really concerned about this. I think the biggest fear though would be deletions. Someone could delete a subtable entry but not the student. You could have triggers to check when items are deleted from the subtables since that is probably the biggest problem.

I have a db with a table per subclass hierarchy like this as well. I use Hibernate and its mapped properly so it deletes everything automatically. If doing this by 'hand' then I would make sure to always delete the parent with proper cascades hehe :)

Arthur Thomas
  • 5,088
  • 1
  • 25
  • 31
0

Thanks, Bill. You got me thinking...

The superclass table has a subclass code column. Each of the subclass tables has a foreign key constraint, as well as one that dictates that the id exist with a subset of the superclass table (where code = athlete).

The only missing part here is that it's possible to model a superclass without a subclass. Even if you make the code column mandatory, it could just be an empty join. That can be fixed by adding a constraint that the superclass's ids exist in a union of the ids in the subclass tables. Insertion gets a little hairy with these two constraints if constraints are enforced in the middle of transactions. That or just don't worry about unsubclassed objects.

Edit: Bleh, such a good sounding idea... But impeded by the fact that subqueries that refer to other tables aren't supported. At least not in SQL Server.

Mark Canlas
  • 9,385
  • 5
  • 41
  • 63
0

That can be fixed by adding a constraint that the superclass's ids exist in a union of the ids in the subclass tables.

Depending on how much intelligence you want to put into your schema (and how much MS SQL Server lets you put there), you wouldn't actually need to do a union of the subclass tables, since you know that, if the id exists in any subclass table, it must exist in the same subclass as the one identified by the subclass code column.

Dave Sherohman
  • 45,363
  • 14
  • 64
  • 102
  • (a) you can't query another table in a constraint in MS SQL Server, and (b) you can't query a table whose name depends on a data value. – Bill Karwin Dec 02 '08 at 14:09