Database Design - How much logic via foreign keys?

Question

This is a theoretical question about database design. I could "abuse" relations as conditions in the database or I could use conventional if-conditions in the source-code. I'll try to explain my question with a example: Let's pretend we have two acting characters (pseudocode):

TABLE Human(
    int humanId PK,
    string name
);

and:

TABLE Animal(
    int animalId PK,
    string species,
    string name
);

Also, there is a set of "actions". For every action has to be defined if it is available for the given character.

TABLE Action (
    int actionId PK,
    string title
);

Example tuples would be:

Human ( 78asd7, "Bob");
Animal ( fgh7df, "Dog", "Pluto");

Action ( d6guvs, "Eat");
Action ( hj5j6f, "Check_Self_Awareness");

How would I implement that both, a human and a dog, can "Eat", but just a human can "Check_Self_Awareness"?

I see two possibilties:

A) Solve it via foreign keys: n:m tables HumanActions and AnimalActions i.e.

TABLE HumanActions (
    int humanId FK,
    int actionId FK
);

HumanActions ( 78asd7 , d6guvs);
HumanActions ( 78asd7 , hj5j6f);

TABLE AnimalActions (
    int animalId FK,
    int actionId FK
);

AnimalActions ( fgh7df , d6guvs);

B) Solve it in code

if (type of DB_Tupel == "Animal" && action.title == "Eat") {
    // continue
}
if (type of DB_Tupel == "Animal" && action.title == "Check_Self_Awareness") {
    // error: animals don't have self awareness (except for apes)
}

What would you recommend?

if the domain has FK relations , enforce them in the database. — Mitch Wheat, May 20 '15 at 14:19
A human is an animal, too. (that will solve your "smart union" pseudo-problem) — wildplasser, May 20 '15 at 14:19

score 1 · Answer 1 · edited Oct 26 '22 at 17:52

Architecture & Standards

Well, the great Dr E F Codd wrote the Relational Model in 1970; we have had genuine SQL platforms since 1984; Open Architecture since 1987, as a Standard since 1990. So the answer is not merely my recommendation, not merely best practice, but demanded by standard. Ie. demanded by organisations that expect to exist for more than a few years.

The principle is to define the data, as data, including all the rules that govern the data, in the database. That is required for integrity and consistency. It is not an "abuse". The database is a single recoverable unit. The rules and the transactions travel with the database.

How much logic via foreign keys?

It isn't "logic". It is data definition, rules that govern the data, its integrity. And there are many, many, more methods than simply via Foreign Keys.

Non-architecture & Sub-standard

"Data rules implemented in code" is not only sub-standard, it produces a filing system instead of a database, that has no integrity, that can only be accessed via the app. That is not "closed architecture", that is non-architecture. And forty five years behind the times. Most users want to access the database without being limited to the app, eg. via any of the hundreds of report tools.

But the post-Codd theoreticians, and the OO/ORM proponents, write books on how to implement primitive, pre-relational, pre-1970's ISAM Record Filing Systems, plus massive stacks of objects (towers, really) that have none of the integrity, speed, or power of post-Codd Relational Databases, and to label them "relational". That is all that they know, that is all that they can teach.

The consequence is, since the RFS can't do anything (can't do this; can't do that; can't support hierarchies; etc), and since the RFS is labelled "relational database", they think Relational Databases can't do this, that, and the other thing.

So if you want a Relational Database, circa 1970 in defined terms, circa 1984 in implementation terms, then implement the rules about the data, in the data.

HumanAction is a child of Human
AnimalAction is a child of Animal

Record ID

The ID fields will keep you nicely glued to pre-1970's ISAM files, containing records, not tables with rows. The IDs are physical record pointers, not logical Keys. They do not have the qualities of a Key, and they cannot provide the integrity that Relational Keys provide.

It is not open to argument because an ID field does not qualify as a Key per the definition in the Relational Model. Using the SQL keyword PRIMARY KEY does not magically bestow the qualities of a Key upon the field.
If one were to state the difference between Relational and pre-relational DBMS, in one sentence, it would be that:

In pre-relational DBMS relationships were established using physical record IDs, but in Relational DBMS relationships are established using logical Relational Keys.

Therefore use use of ID fields, as the method of forming relationships, is unarguably pre-relational, physical, non-logical, primitive.

Further, each ID field (columns imply tables, and they are files, not tables) establishes an Access Path Dependence, which is specifically prohibited in the Relational Model. This guarantees more joins, not less joins as per the mythology.

In order to attain a Relational Database, in order to obtain Relational Integrity, power, and speed, you need Relational Keys and Relational Normalisation. Get rid of the ID fields, and choose natural Keys. Then, in the child tables, the Keys will be compounded. That is standard fare for an RDB, all the SQL platform handle it. Get used to it.

Relating Data

What you are calling "abuse", due to the books you have read, is normal, ordinary Referential Integrity. And that is the simplest, most basic level. Relational Integrity is much more than that, but it starts with that first, basic level.

So your tables will look like this. The purpose is to demonstrate Relational Keys. I need at least three levels to demonstrate Relational Integrity (that RFS cannot provide).

Human Animal Activity Data Model

What we are saying here is:

Not only HumanActivity is a child of Human, but also
HumanActivity exists only in the context of Human
(it does not exist independently, as it would, in a RFS)

I realise your tables are an example, in a real situation, they would be Normalised further. Eg. Activity would be a child of Species, rather than of an instance of Species.

That has various architectural benefits, such as your objects are now much simpler, you do not have to mess around with "inheritance" or "persistence" of data. The corollary is it is not possible to correct or fix data integrity issues in the object layers, they can only be defined to completion in the database itself.

Model

That is an IDEF1X model. IDEF1X is the Standard for modelling Relational Databases, since 1993. Please be advised that every little tick; notch; and mark; the crows foot; the solid vs dashed lines; the square vs round corners; means something very specific and important. Refer to the IDEF1X Notation. If you do not understand the Notation, you will not be able to understand or work the model.

UML, separate to it not being a standard, separate to it having a one single symbol and a squillion ever-changing notations, separate to it being a de facto free-for-all, cannot define data or the complexities of relationships in data, like the Standard for modelling Relational data can. It simply does not have the richness. The converse is, you simply do not know what you are missing.

"demanded by standard". The nice thing about standards is that there are so many to choose from. — Erwin Smout, May 26 '15 at 14:00

score 0 · Accepted Answer · answered May 21 '15 at 11:00

Ignoring your specific example, which I don't think is the point of your question, you are asking about whether to use table-driven logic or hard coded logic.

There is no simple answer to this question. One is not always better than the other and each one has advantages that make it good for a particular case.

Table-driven logic is useful when the structure of your rules will be relatively simple and consistent, but the particulars of your rules aren't known in advance or could change frequently.

In your example, new actions may become of interest on a regular basis. What do you want to do when that happens? Do you want to insert a new row in your ACTION table or do you want to go and touch all your IF/ELSE statements?

Hard coded logic is useful when the structure of your rules is static, or especially when the rules are very complex. Table driving arithmetic order of operations wouldn't make a lot of sense because they won't change and they don't suit themselves well to being table-driven.

Each approach is a technique. You goal should be reliable, clear, supportable code. Pick the approach that best helps you reach that goal in each situation.

Knowing the terms "table-driven"- and "hard coded"-logic would have made my question surely a lot simpler and shorter (unfortunately, they are not part of what Germans learn at school ;-)) Thanks for them and for your explanations! — Cologne_Muc, May 21 '15 at 17:12
@Cologne_Muc. If you understand the *Relational Model*, and that it is based on **First Order Logic**, then you may understand that all data can be, and should be defined in terms of data, in terms of FOL Predicates. Since the db is a collection of Facts, there does not exist a Fact that cannot be declared in terms of FOL, in data. "Hard-coded logic" is a violation of the *RM*, as well as **Open Architecture** and other standards. — PerformanceDBA, May 22 '15 at 01:09