51

I thought a foreign key meant that a single row must reference a single row, but I'm looking at some tables where this is definitely not the case. Table1 has column1 with a foreign key constraint on column2 in table2, BUT there are many records in table2 with the same value in column2. There's also non-unique index on column2. What does this mean? Does a foreign key constraint simply mean that at least one record must exist with the right values in the right columns? I thought it meant there must be exactly one such record (not sure how nulls fit in to the picture, but I'm less concerned about that at the moment).

update: Apparently, this behavior is specific to MySQL, which is what I was using, but I didn't mention it in my original question.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
allyourcode
  • 21,871
  • 18
  • 78
  • 106

7 Answers7

56

From MySQL documentation:

InnoDB allows a foreign key constraint to reference a non-unique key. This is an InnoDB extension to standard SQL.

However, there is a pratical reason to avoid foreign keys on non-unique columns of referenced table. That is, what should be the semantic of "ON DELETE CASCADE" in that case?

The documentation further advises:

The handling of foreign key references to nonunique keys or keys that contain NULL values is not well defined (...) You are advised to use foreign keys that reference only UNIQUE (including PRIMARY) and NOT NULL keys.

RandomSeed
  • 29,301
  • 6
  • 52
  • 87
Hobbes
  • 978
  • 2
  • 9
  • 15
  • Thanks for citing MySQL's documentation. I didn't specifically mention MySQL in my question, but that is what I was using. I will update my question to reflect this. – allyourcode Feb 07 '10 at 05:45
  • 2
    MySQL's version 5.1 documentation goes further: "The handling of foreign key references to nonunique keys [sic] or keys that contain NULL values is not well defined.... You are advised to use foreign keys that reference only UNIQUE and NOT NULL keys." – Mike Sherrill 'Cat Recall' Jul 16 '11 at 02:42
  • This answer would be even better with a link to the cited documentation. – magnattic Mar 19 '14 at 18:00
  • 1
    Cascade on delete is a problem, since cascading effect takes place as first child is deleted, and not last. – djmj Mar 27 '14 at 02:53
  • the first link seems to be broken. Check here: https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-foreign-keys.html – Abhishek Ghosh Aug 27 '21 at 07:38
7

Your analysis is correct; the keys don't have to be unique, and constraints will act on the set of matching rows. Not usually a useful behavior, but situations can come up where it's what you want.

chaos
  • 122,029
  • 33
  • 303
  • 309
  • At first, I was a little confused about what "your analysis" was supposed to refer to. For other readers, I think you meant it's possible for a foreign key to refer to more than one row, as long as there's _at least_ one row. – allyourcode Feb 26 '09 at 01:52
  • 3
    Your understanding of foreign keys is rightfully messed up, because this is a bogus answer. Even if you find a database where you can do this (I did not test MySQL), it would very, _very_ bad database design to implement such a foreign key relationship. – cdonner Feb 26 '09 at 02:43
  • 3
    @cdonner: Well, I would've thought that being *correct* would've been some defense against my answer being 'bogus'. What am I supposed to do, lie tell the guy you can't do it since I consider it a generally bad idea (which I do)? And yes, you can do it in mysql. – chaos Feb 26 '09 at 03:58
  • @allyourcode: Really, you shouldn't throw out your former understanding of foreign keys. It's correct for 99% of non-pathological cases. – chaos Feb 26 '09 at 03:59
  • @chaos: I stand corrected - at least somewhat. I know little about MySQL. Can you give one example where that would be a useful practice? – cdonner Feb 28 '09 at 14:44
4

When this happens, it usually means that two foreign keys are being linked to each other. Often the table that would contain the key as a primary key isn't even in the schema.

Example: Two tables, COLLEGES and STUDENTS, both contain a column called ZIPCODE.

If we do a quick check on

SELECT * FROM COLLEGES JOIN STUDENTS ON COLLEGES.ZIPCODE = STUDENTS.ZIPCODE

We might discover that the relationship is many to many. If our schema had a table called ZIPCODES, with primary key ZIPCODE, it would be obvious what's really going on.

But our schema has no such table. Just because our schema has no such table doesn't mean that such data doesn't exist, however. somewhere, out in USPO land, there is just such a table. And both COLLEGES.ZIPCODE and STUDENTS.ZIPCODE are references to that table, even if we don't acknowledge it.

This has more to do with the philosophy of data than the practice of building databases, but it neatly illustrates something fundamental: the data has characteristics that we discover, and not only characteristics that we invent. Of course, what we discover could be what somebody else invented. That's certainly the case with ZIPCODE.

Fahim Parkar
  • 30,974
  • 45
  • 160
  • 276
Walter Mitty
  • 18,205
  • 2
  • 28
  • 58
3

Yes, you can create foreign keys to basically any column(s) in any table. Most times you'll create them to the primary key, though.

If you do use foreign keys that don't point to a primary key, you might also want to create a (non-unique) index to the column(s) being referenced for the sake of performance.

Depends on the RDBMS you're using. I think some do this for you implicitly, or use some other tricks. RTM.

Evan
  • 18,183
  • 8
  • 41
  • 48
  • MySQL (InnoDB) requires a (left-hand) index on the referencing and referenced columns. JADP. – chaos Feb 26 '09 at 03:46
1

You shouldn't reference a non-unique key as foreign key.

What you can do instead (without delete cascade danger) is add a check constraint (at least in MS-SQL). That's not exactly the same as a foreign key, but at least it will prevent the insertion of invalid/orphaned/dead data.

See Foreign Key to non-primary key for reference. (You'll have to port the MS-SQL code to MySQL syntax.)

According to Mysql CHECK Constraint, MySQL doesn't support CHECK constraints. You can define them in your DDL query for compatibility reasons, but they are just ignored.

But as mentioned there, you can create a BEFORE INSERT and BEFORE UPDATE trigger, which will throw an error when the requirements of the data are not met, which is basically the same thing, except that it's an even bigger mess.

I thought a foreign key meant that a single row must reference a single row, but I'm looking at some tables where this is definitely not the case.

In any sane RDBMS, this is true. The fact that this is possible in MySQL is just one more reason why MySQL is an insane RDBMS. It may be fast, but sacrificing referential integrity and data quality on the altar of speed is not what makes a quality RDBMS. If it's not ACID-compliant, it's not really a RDBMS.

Stefan Steiger
  • 78,642
  • 66
  • 377
  • 442
  • InnoDB is [very close to being ACID-compliant](http://dev.mysql.com/doc/refman/5.7/en/mysql-acid.html). IMHO, close enough for most applications. And it offers performance boosts with MyISAM when you don't really need an ACID compliant database. It is a quality rdbms, but it takes a good DBA to know when it's the best choice. – GabrielF Feb 25 '15 at 18:23
  • @GabrielF: Agreed that it can be a good choice under some circumstances. Also agreed that it needs a good DBA. But most people do not really understand the finer points of morality in SQL. Also, many companies don't have a DBA, let alone a good one.In which case it is a terrible terrible idea. – Stefan Steiger Mar 10 '20 at 12:06
1

PostgreSQL also refuses this (anyway, even if it is possible, it does not mean it is a good idea):

essais=> CREATE TABLE Cities (name TEXT, country TEXT);
CREATE TABLE
essais=> INSERT INTO Cities VALUES ('Syracuse', 'USA');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Syracuse', 'Greece');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Paris', 'France');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Aramits', 'France');
INSERT 0 1
essais=> INSERT INTO Cities VALUES ('Paris', 'USA');
INSERT 0 1

essais=> CREATE TABLE People (name TEXT, city TEXT REFERENCES Cities(name));
ERROR:  there is no unique constraint matching given keys for referenced table "cities"
bortzmeyer
  • 34,164
  • 12
  • 67
  • 91
  • This was not mentioned. And, since MySQL is especially sloppy in checks, constraints and controls, it would not surprise me that integrity references to non-unique keys are possible with MySQL. – bortzmeyer Feb 28 '09 at 10:41
  • I don't think it has to do with being sloppy. Postgresql reserves this functionality in the keyword `MATCH PARTIAL` it just isn't implemented afaik. – Evan Carroll Aug 10 '10 at 19:34
0

What database are we talking about? In SQL 2005, I cannot create a foreign key constraint that references a column that does not have a unique constraint (primary key or otherwise).

create table t1
(
  id int identity,
  fk int
);

create table t2
(
  id int identity,
);

CREATE NONCLUSTERED INDEX [IX_t2] ON [t2] 
(
    [id] ASC
);
ALTER TABLE t1 with NOCHECK
ADD CONSTRAINT FK_t2 FOREIGN KEY (fk)
    REFERENCES t2 (id) ;


Msg 1776, Level 16, State 0, Line 1
There are no primary or candidate keys in the referenced table 't2' 
that match the referencing column list in the foreign key 'FK_t2'.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.

If you could actually do this, you would effectively have a many-to-many relationship, which is not possible without an intermediate table. I would be truly interested in hearing more about this ...

See this related question and answers as well.

Community
  • 1
  • 1
cdonner
  • 37,019
  • 22
  • 105
  • 153
  • @allyourcode: do you mind sharing what these 2 tables do in your database that have this m:n foreign key relationship? – cdonner Feb 28 '09 at 14:48