0

Given relations A(a,b,c), B(e,f), C(d,g,h), where A has 800 tuples, B 200 and c 500. In worst case gives the expression A * B * C ( with * natural join) :

a) 800 tuples

b) 200 tuples

c) 500 tuples

d) 800*200*500 tuples

e) 800+200+500 tuples

f) Nothing from the above.

My guess was 800+200+500 since there isnt any common attribute ? And what if there was a common attribute ?

asky
  • 1,520
  • 12
  • 20
  • 1
    This is not clear. What does "since there isnt any common attribute" mean?--How does "there isn't any common attribute" lead to that result? Why do you say you are "guessing"? What is wrong with what argument? PS Please ask 1 question per post. Show your reasearch & what you are able to do & ask about where you are first stuck. Don't ask us to do your (home)work. PS [Natural join if no common attributes](https://stackoverflow.com/q/14548543/3404097) – philipxy Jun 07 '20 at 11:05
  • Does this answer your question? [maximum and minimum number of tuples in natural join](https://stackoverflow.com/questions/22673235/maximum-and-minimum-number-of-tuples-in-natural-join) – philipxy Jun 07 '20 at 11:05

3 Answers3

1

A natural join on tables that have no rows in common is in fact a cross join as you so rightly suppose. You'll get A * B * C = 800 * 200 * 500 = 80,000,000 rows.

Once the tables have columns in common a filter takes place. Depending on whether there are matches and how many, you get anything from 0 to 80,000,000 rows. Examples:

  • If all tables have one column in common and its value is the same in every row in every table, you end up with all combinations again.
  • If all tables have one column in common and its value is 'A' in all rows in table A, 'B' in all rows in table 'B' and 'C' in all rows in table C, you end up with no matches, i.e. zero rows.

After all, this all is dull theory, because nobody in their right mind would ever use a natural join :-)

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • These tables have no _columns_ (column names) in common. So the answer is nothing to do with "rows in common". Natural Join is perfectly valid for tables with no columns in common; it degenerates to a cross-join; and yes the count of rows in the result is the product of the count of rows in each conjunct. – AntC Jun 08 '20 at 00:21
  • Perhaps more people would use natural join if a) SQL had included it from the get-go (it's only recently come into the standard, and not all vendors support it); b) SQL didn't muck up natural join by allowing positional as well as named access to columns; c) SQL didn't allow nulls, leaving fear uncertainty and doubt what happens when joining a null-values column to another null-values; d) people didn't keep posting incorrect answers re natural join on StackOverflow. – AntC Jun 08 '20 at 00:28
  • @AntC: The main problem with natural joins is that you have queries running fine for years, then you merely add a column to a table and suddenly the queries become incorrect, because other tables happen to have a column of the same name. Natural joins should have never been invented for that reason. – Thorsten Kettner Jun 08 '20 at 07:21
  • adding a column to a table can banjax many queries that have been running fine. For example anything using `SELECT * FROM ...` or `SELECT 5 FROM ...`, in which the added column is before the 5th column. Natural join was 'invented' at least by 1972 IOW before SQL; 'lossless join' as a normalisation technique means natural join. Don't shift the blame on to Natural join for poor software control disciplines. – AntC Jun 08 '20 at 11:10
  • Er, no. `SELECT 5 FROM ...` just selects a 5. `ORDER BY 5` (which you may actually be referring to and which use I would't recommend anyway) would still order by the fifth column in the query. The combination of `select * ... order by 5` is insane of course. As to 1972: that was more than ten years before I started working with databases. When I started there was just comma-separated joins, as far as I can tell, i.e. cross joins. I don't think there was anything like natural joins automagically joining on columns that might be the desired ones. So, I *think* you might be wrong. – Thorsten Kettner Jun 08 '20 at 11:26
  • 1972 is Codd's paper 'Relational Completeness ...' Specifies natural join. 'Lossless Join' is from Codd 1970, and much research on normalisation esp from Ron Fagin in the early 1970's. All of that work uses plain 'Join' (or operator `*`) in the sense natural join. Please try to learn some theoretical background for your profession. Please remember SQL is a mess, almost a caricature of what a relational query language could be. OK I meant `SELECT T.5 FROM T ...` -- absolutely I don't recommend that either; did I say SQL is a mess? – AntC Jun 08 '20 at 23:00
0

I think that you mean cross join rather than natural join, since you stated that the three tables have no column in common.

In that case, you would get a cartesian product of the three tables, that is all possible combinations of rows from the tables: this gives you 800 * 200 * 500 rows in the resultset.

On the other hand, if there are 1-1 relationships across the tables (that is, 0 or 1 row in each table can be found in the other tables), and you are combining the tables with inner joins, then then you would get a subset of rows that do exist in the three tables: that's at most 200 rows (and possibly 0 rows, if no tuple can be matched across the three tables). This is not, however, what your question seems to refer to.

If you are dealing with other types of relations (one-to-many, many-to-many, ...), then there is no generic answer. It does depend on the relationships and data.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • 1
    Natural Join is perfectly valid for tables with no columns in common; it degenerates to a cross-join. So this example is _both_ a `natural join` and a `cross join`. – AntC Jun 08 '20 at 00:23
-1

Relation A contain 800 different tuple, relation B contain 200 different tuple, relation C contain 500 different tuple. In natural join it'll return the records where the 200 different values of B matches with the values of A and C so, max no of tuples will be 200