16

I'm trying to understand a concept rather than fixing a piece of code that won't work.

I'll take a general example of a form (parent table) and a form field (child table). Logically, this would be an identifying relationship, since a form field cannot exist without a form.

form and form_field tables

This would make me think that in order to translate the logical relationship into the technical relationship, a simple NOT NULL for the form_id field in the form_field table would suffice. (See the left part of above screenshot.)

However, when I add an identifying relationship using MySQL Workbench, form_id is not only NOT NULL but also part of the primary key. (See the right part of above screenshot.) And when I add a non-identifying relationship, NOT NULL is still applied so logically it would actually be an identifying relationship as well.

I guess this confuses me a little, as well as the fact that until now I always simply used the id field as primary key.

So I understand the logical concept of identifying vs. non-identifying relationships, but I don't understand the technical part.

Why is it, as this answer states, 'the "right" way to make the foreign key part of the child's primary key'?

What is the benefit of these composite primary keys?

pnuts
  • 58,317
  • 11
  • 87
  • 139
Nic Wortel
  • 11,155
  • 6
  • 60
  • 79

2 Answers2

7

Logically, this would be an identifying relationship, since a form field cannot exist without a form.

No, identifying relationship is about identification, not existence.

Any X:Y relationship where X >= 1 guarantees existence of the left side, whether identifying or not. In your case, a 1:N relationship guarantees existence of form for any given form_field. You could make it identifying or non-identifying and it would still guarantee the same.

Remarks:

  • You would model an identifying relationship by making form_field.form_id part of a key. For example form_field PK could look like: {form_id, label}, which BTW would be quite beneficial for proper clustering of your data (InnoDB tables are always clustered).
  • Just making a PK: {id, form_id} would be incorrect, since this superkey is not a candidate key (i.e. it is not minimal - we could remove form_id from it and still retain the uniqueness).
  • You would model a 0..1:N relationship by making the form_field.form_id NULL-able (but then you wouldn't be able to make it identifying as well - see below).

There are two definitions of the "identifying relationship":

  • Strict definition: A relationship that migrates parent key into child primary key1.
  • Loose definition: A relationship that migrates parent key into child key.

In other words, the loose definition allows migration into alternate key as well (and not just primary).

Most tools2 seem to use the strict definition though, so if you mark the relationship as identifying, that will automatically make the migrated attributes part of the child PK, and none of the PK attributes can be NULL.


1 Which is then either completely comprised from migrated attributes, or is a combination of migrated attributes and some additional attributes.

2 ERwin and Visio do. I haven't used MySQL Workbench for modeling yet, but your description seems to suggest it behaves the same.

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • So, if I understand you correctly: - The idea is that the primary key consists of two or more columns. - One of them is the foreign key to the parent table - The other is a column that is unique within all records that share the foreign key to the parent table, but (possibly) not within all records of the table - All of this is simply an alternative to an auto-incremented ID column. – Nic Wortel Nov 08 '12 at 19:42
  • @NicNLD I think you got it, except if natural key exists, it's not strictly an "alternative" to the surrogate key - **if** a label needs to be unique per form, then you'd have to make a key `{form_id, label}` **in any case**. The only question is whether you'll **also** make a surrogate key `id` (there are pros and cons for that, as you can imagine). – Branko Dimitrijevic Nov 08 '12 at 20:03
  • OK, I think I'm beginning to understand this. I'd have to make the key `{form_id, label}` to ensure there are not two duplicate labels in one form, right? But what still confuses me is that [this answer](http://stackoverflow.com/a/762994/1001110) has 164 upvotes but is telling something completely different. (although it doesn't explain **why** the PK of the parent should be part of the child's PK) – Nic Wortel Nov 08 '12 at 22:00
  • 1
    @NicNLD _"I'd have to make the key {form_id, label} to ensure there are not two duplicate labels in one form, right?"_ Correct. The sentence from the linked answer: _"An identifying relationship is when the existence of a row in a child table depends on a row in a parent table."_ is incorrect. The same guarantee can be achieved with a non-identifying relationship (as I already mentioned). – Branko Dimitrijevic Nov 08 '12 at 22:25
  • 2
    @NicNLD Let me quote [ERwin Methods Guide](http://tangra.si.umich.edu/~radev/654/resources/ERMGALL.PDF): _"In relational terms, a child entity that depends on the foreign key attribute for uniqueness is called a dependent entity"_ ...and... _"If you want the foreign key to migrate to the key area of the child entity (and create a dependent entity as a result), you can create an identifying relationship between the parent and child entities."_ So this determines **where** you migrate the key, not whether parent is "1" or "0..1" (although it could _imply_ "1", as I discussed). – Branko Dimitrijevic Nov 08 '12 at 22:27
  • 1
    @NicNLD Let me put it in a different way: can you guarantee uniqueness with child fields only, or you have to **combine** them with fields migrated from the parent? If you have to combine them, this is a dependent (aka. weak) entity and an identifying relationship. This is about **identification**-dependence, not (directly) existence-dependence. – Branko Dimitrijevic Nov 08 '12 at 22:42
  • @Branko Does that mean that we can say 1:0..1 (or 1:1 if used for some reason) relationships are always identifying? Given that the foreign key in the child table can alone form the primary key and hence an identifying relationship. – Nikunj Madhogaria Jun 27 '15 at 08:45
  • @theScorpion If you use the loose definition, then yes for 1:1. But not for 1:0..1 because NULL doesn't identify the child (well, depending [how your DBMS treats NULLs](http://dba.stackexchange.com/q/80514/5199) it technically could, but that's probably not a good idea). – Branko Dimitrijevic Jun 29 '15 at 04:07
  • @theScorpion BTW, FK doesn't have to contain the migrated key _alone_ for the relationship to be identifying. – Branko Dimitrijevic Jun 29 '15 at 04:13
  • @Branko Thanks for responding to this old post. I don't understand why can't we consider 1:0..1 relationship as identifying. I think you're considering a NULL-able FK on the parent table instead. If you're adding FK to the dependent (child) table, FK doesn't require to be NULL-able. Check this [comment](http://stackoverflow.com/questions/7644156/implementing-one-to-zero-or-one-relation-in-sql-server#comment-19414943). – Nikunj Madhogaria Jun 29 '15 at 07:50
  • @Branko Also check this [answer](http://stackoverflow.com/questions/1216805/trouble-deciding-on-identifying-or-non-identifying-relationship?lq=1#answer-1216854), which says that **one-to-** relationships are *indentifying* and **many-to-many** are *non-identifying*. – Nikunj Madhogaria Jun 29 '15 at 08:00
  • @theScorpion Hmm... I think I misunderstood your question and was thinking about a different pattern (indeed I was thinking about NULL-able FK on the parent ;) ). Yes I think you are right, 1:0..1 would indeed be identifying. It would be implemented as a child key containing FK, which fits the definition. The linked answer is incorrect - there can be one-to-many relationship that is non-identifying. Maybe the answerer meant to write **-to-one**? The sentence: "If the child identifies its parent, it is an identifying relationship" is also incorrect - it's about identifying child, not parent. – Branko Dimitrijevic Jun 29 '15 at 09:58
  • @theScorpion Just to clarify the last sentence... Parent always has to be identifiable from child anyway, otherwise DBMS could not unambiguously enforce referential integrity. That's why parent's key is _always_ migrated o the child. The only question is whether it also becomes part of the child's key or is left out of it. – Branko Dimitrijevic Jun 29 '15 at 10:03
  • @Branko I think that the answerer meant: "If the child identifies its parent just by using its PK then the relationship is identifying". Also, I think that we can call one to many relationships as identifying, when PK of the child table is composed of {FK, unique_child_attribute} [like {form_id, label} in your answer]. But, if we don't have any such unique_child_attribute (like label), then it becomes non-identifying. – Nikunj Madhogaria Jun 29 '15 at 10:12
  • @Branko and in such scenario we're bound to create a new PK (which cannot be formed by FK) and thus it becomes non-identifying. – Nikunj Madhogaria Jun 29 '15 at 10:19
  • @theScorpion _""If the child identifies its parent just by using its PK then the relationship is identifying""_ - No, the whole concept is about identifying child, not parent, as I already mentioned. Child __always__ has parent's key, but not always within child's key. _"But, if we don't have any such unique_child_attribute (like label), then it becomes non-identifying."_ - No, we can have just child key comprised from parent's key (and nothing else) and it would still be considered identifying. That's why 1:1 and 1:0..1 above would be identifying. – Branko Dimitrijevic Jun 29 '15 at 12:09
  • @Branko What I meant by the first statement was: "if a child can identify its parent by its PK only i.e. when FK is a part of its PK then we can call it identifying by definition since FK has migrated into child's PK". The second statement was a reference to one to many relationship. – Nikunj Madhogaria Jun 29 '15 at 12:22
2

An identifying relationship is supposed to be one where the primary key includes foreign key attributes. That's why when you designate a relationship as identifying the posted foreign key is deemed to be part of the primary key.

The difference between an "identifying" relationship and a non-identifying one is purely informational or diagrammatic if the same key constraints and nullability constraints apply in each case. The concept is analogous to and a consequence of designating a "primary" key. If a table has more than one candidate key then all other things being equal it doesn't matter from a logical perspective which key is designated the primary one - the form, function and (presumably) the business meaning of the table is the same.

In your example however, the keys in the two tables are NOT the same. In the first case ID is unique in the form_field table while in the second case it apparently isn't. I expect that's not what you intended.

nvogel
  • 24,981
  • 1
  • 44
  • 82
  • What do you mean with the second last sentence: the keys in the two tables are NOT the same. In the first case ID is unique in the form_field table while in the second case it apparently isn't? All id fields have an auto-increment, so they should be unique... but perhaps I didn't understand you correctly. – Nic Wortel Nov 08 '12 at 19:53
  • 1
    @NicNLD You _could_ make `id` "locally unique" in this scenario. If you chose not to however, then `{id, form_id}` is not a "good" key (it's not minimal). sqlvogel probably assumed you did the right thing (and made it "locally unique"), which would indeed mean "the keys in the two tables are NOT the same". – Branko Dimitrijevic Nov 08 '12 at 20:09
  • What Branko said. If (id) is supposed to be a candidate key then by definition (id, form_id) is not. You need to decide what the keys should be. Autoincrement by itself doesn't always make a key because an autoincrementing column isn't necessarily unique). – nvogel Nov 08 '12 at 20:19
  • @BrankoDimitrijevic can you explain "locally unique", as far as I understand it means that `form_field.id` isn't simply a copy of `form.id`, but that doesn't make sense because that would mean that `form_field.id` and `form_field.form_id` are exactly the same for every row. – Nic Wortel Nov 08 '12 at 21:49
  • @sqlvogel so essentially you are saying "it's redundant (and thereby not advisable) to add another column to the PK if the first column is already unique"? And, an autoincrement column isn't necessarily unique because of what? Because someone could change the value? – Nic Wortel Nov 08 '12 at 21:51
  • 1
    @NicNLD With "locally unique" I simply meant "unique for the same `form_id`", i.e. you could have the same `id` for a **different** `form_id`. Remember, when you make a composite key, a combination of all fields is unique, not any (proper) subset of fields. If any subset of fields could still guarantee uniqueness, then *it* would be the key. That's the difference between "superkey" and "candidate key". Any set of fields that guarantees uniqueness is a superkey, but only some of the superkeys are also candidate keys - you should only create PRIMARY KEY and UNIQUE constraints on candidate keys. – Branko Dimitrijevic Nov 08 '12 at 22:33
  • 1
    Thanks for the explanation! I'm starting to get a grip on it. In other words, a candidate key is the smallest possible superkey, without losing uniqueness. (this being a redundant description because without uniqueness it wouldn't be a superkey at all) – Nic Wortel Nov 08 '12 at 22:41