Find the Cardinality of Natural Join

Question

|X| represents number of tuples in X
bold letters represent keys in the relation

Consider the relations R(A, B) and S(A, C), and that R has a foreign key on A that references S.
|R ✶ S| (where ' * ' represents natural join) is:
The options are:
1. |R|
2. |S|
3. |R|.|S|
4. max(|R|, |S|)
5. min(|R|, |S|)

What I understand about the cardinality of natural join is that if there is no common attribute among the two relations then natural join will act like a cross-product and the cardinality will be r * s. But I don't understand how key constraints play a role in determining the cardinality. Can someone please explain?

. . "Natural" joins are not "natural" in any sense of the word. A foreign key might or might not share the name of the primary key. Other columns might or might not have the same name. The any is any value between 0 and R * S. — Gordon Linoff, May 10 '20 at 19:06
So, if there is a foreign key in R which is A that references the primary key of S which is A. It means that all the values in R should be present in S ? — rdr2, May 10 '20 at 19:12
@Gordon (again) Natural joins are perfectly natural if you design the schema naturally. In this case the `A`s in the two schemas are in bold. Does that mean they are keys? As well as `A` in `R` being a Foreign key referencing `A` in `S`. That's a rather unusual arrangement, suggesting this schema is a 6NF normalisation of what started as a single table. — AntC, May 10 '20 at 21:36
@ndr2 . . . There is no guarantee that any columns in the two tables have the same name. Hence, even with a foreign key relationship, the zero rows might match. There is nothing "natural" about a "natural join". — Gordon Linoff, May 10 '20 at 22:16
What did working through some examples suggest? See [ask], hits googlig 'stackexchange homework' & the voting arrow mouseover texts. PS SQL is not relational algebra. Pick 1 & delete the other tag. PS There are many relational algebras. Which is yours? PS This is a faq. Before considering posting please read your textbook and/or manual & google any error message or many clear, concise & precise phrasings of your question/problem/goal, with & without your particular strings/names & site:stackoverflow.com & tags; read many answers. Reflect your research. — philipxy, May 10 '20 at 23:02
Does this answer your question? [maximum and minimum number of tuples in natural join](https://stackoverflow.com/questions/22673235/maximum-and-minimum-number-of-tuples-in-natural-join) — philipxy, May 10 '20 at 23:07

score 2 · Accepted Answer · answered May 10 '20 at 21:50

Presuming the bold A in each schema means it is a key; and presuming the Foreign Key constraint holds -- that is, the A value for every row in R does correspond to an A value in S:

Every row in R naturally joins to a row in S on A.
There might be rows in S that don't join to R (because there's no Foreign Key constraint to enforce that).
So the cardinality of the joined relations is the cardinality of R, answer 1.

Is there are real-life use for a schema like this? Consider S is Customer Name in C, keyed by Customer number in A. R holds date of birth in B, also keyed by Customer number in A. Every Customer must have a name; it's true every Customer (person) must have a d.o.b., but we don't need to record that unless/until they purchase age-restricted items.

@rdr2 . . . It is sad that you accepted this answer. The question is so malformed that it doesn't have an answer. You can make so many assumptions not mentioned in the question to come up with an answer, but that doesn't change the vagueness of the question. I don't downvote answers, but I don't think that this really addresses the question. — Gordon Linoff, May 11 '20 at 12:56

Gordon Linoff · Answer 2 · 2020-05-11T12:54:22.603

There is absolutely not enough information to answer this question. The "natural" join can return any almost any value between 0 and R*S. The following are examples.

This example returns 12:

create table s1 (id int primary key);
create table r1 (s1_id int references s1(id));

insert into s1 (id) values (1), (2), (3);

insert into r1 (s1_id) values (1), (2), (2), (3);

This example returns 0:

create table s2 (id int primary key, x int default 2);
create table r2 (s2_id int references s2(id), x int default 1);

insert into s2 (id) values (1), (2), (3);

insert into r2 (s2_id) values (1), (2), (2), (3);

This examples returns 4:

create table s3 (id int primary key, y int default 2);
create table r3 (id int references s3(id), x int default 1);

insert into s3 (id) values (1), (2), (3);

insert into r3 (id) values (1), (2), (2), (3);

In all of these, r has a foreign key relationship to s. And a "natural" join is being used. Here is a db<>fiddle.

Even if you assume that the "A"s are the primary keys AND that there are no other columns, the number of rows still varies:

-- returns 4
create table s5 (id int primary key);
create table r5 (id int references s4(id));

insert into s5 (id) values (1), (2), (3);    
insert into r5 (id) values (1), (1), (2), (2);

Versus:

-- returns 0
create table s4 (id int primary key);
create table r4 (id int references s4(id));

insert into s4 (id) values (1), (2), (3);

insert into r4 (id) values (NULL), (NULL), (NULL), (NULL);

Gordon you've not noticed the bolded **A**s at the start of the q. Admittedly the q doesn't say what that means, but it's a reasonable guess they're denoting the keys. I made that guess/assumption explicit in my answer. None of your examples declares a key for the `r` table. Only one of your examples same-names the (Foreign) Key attributes between the two tables. Since same-naming is the whole point of Natural Join, I think you've "absolutely" failed to answer the q (even given its sloppiness). — AntC, May 11 '20 at 10:41

Find the Cardinality of Natural Join

2 Answers2