0

The following question was asked on Stack Overflow:

"I need to use Self Join on this table.

+------------+------+--------+
| Country    | Rank |  Year  |
+------------+------+--------+
|France      |  55  |  2000  |
+------------+------+--------+
|Canada      |  30  |  2000  |
+------------+------+--------+ 
|Liberia     |  59  |  2001  |
+------------+------+--------+ 
|Turkey      |  78  |  2000  |
+------------+------+--------+ 
|Japan       |  65  |  2003  |
+------------+------+--------+
|Romania     |  107 |  2001  |
+------------+------+--------+

I need to use self join to get what countries has the same year as Turkey. Display the Country and year only."

In the answer chosen as correct, one of the recommended queries is:

SELECT DISTINCT a.Country, a.Year 
FROM table1 AS a 
INNER JOIN table1 AS b 
   on a.Year=b.Year 
  and b.Country='Turkey';

I don't get this query. Isn't it that a.Year=b.Year always going to be true - as both the tables are the same? So what is the need of using it? And isn't it that the above query will return only 'Turkey' in the country?

Please help me understand if I'm wrong.

Thanks a lot!

Lemmy
  • 223
  • 2
  • 12
  • 1
    add your expected output by the query and what condition you would apply – er.irfankhan11 Oct 23 '18 at 05:07
  • `a.Year=b.Year` is a join criteria. Its not always true because every year in the table isn't the same as every other year. The query should return France, Canada and Turkey. You can always [just test run and see](https://dbfiddle.uk/?rdbms=mariadb_10.3&fiddle=89145d23c7ab8498dd6ced66d948efd7) – danblack Oct 23 '18 at 05:10
  • Possible duplicate of [Explanation of self-joins](https://stackoverflow.com/questions/2458519/explanation-of-self-joins) – philipxy Oct 23 '18 at 05:59
  • The `DISTINCT` would only make sense if the table could contain multiple records for Turkey. And please see Tim's answer; one would solve such tasks with a subquery rather than with a join anyway. – Thorsten Kettner Oct 23 '18 at 06:00
  • This is a faq, you are asking how select statements work. Please read [ask] & the downvote arrow mouseover text. (The condition tells you what rows to keep from a cross join--what rows to keep from every row you can make from a row from each input.) PS [Re self-join.](https://stackoverflow.com/a/37384306/3404097) – philipxy Oct 23 '18 at 06:01

3 Answers3

0

If you have to use a self-join here, then try this:

SELECT t1.Country, t1.Year
FROM table1 t1
INNER JOIN table1 t2
    ON t1.Year = t2.Year AND t2.Country = 'Turkey';

enter image description here

Demo

This isn't what I would consider a typical candidate for a self-join query. In this case, the second (right) table just serves to represent the year which Turkey has.

I would rather just use a subquery:

SELECT Country, Year
FROM table1
WHERE Year = (SELECT Year FROM table1 WHERE Country = 'Turkey');

Note that this would also include Turkey itself in the result set. If you don't want to also see Turkey, then we can add another condition in the WHERE clause:

SELECT Country, Year
FROM table1
WHERE Year = (SELECT Year FROM table1 WHERE Country = 'Turkey') AND Country <> 'Turkey';
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • 3
    Getting past the "need to use a self-join" statement, the question was more asking about what a join condition means rather than rewrite as a subquery. Seems fine enough as a self-join to me. – danblack Oct 23 '18 at 05:20
  • @danblack I disagree, and I would rather inherit the subquery code. Why? Because it is explicit that we are filtering the table based on Turkey's year. – Tim Biegeleisen Oct 23 '18 at 05:21
  • @danblack: I agree that the question is on how a (self) join actually works. But it's correct that such tasks should really not be solved with a self-join, but with a subquery. So it's good Tim points this out, as the OP is a beginner obviously. Criteria belongs in the `WHERE` clause. This gets the query much more readable. – Thorsten Kettner Oct 23 '18 at 05:58
0

Isn't it that a.Year=b.Year always going to be true?

No. Think of this as first taking Cartesian product of A and B (matching all rows of a with all rows of b) and then selecting those rows where a's year and B's year is same. This yields the data where two countries have same year.

Using this data we can further determine the countries who form a pair with Turkey.

This is what this query is doing.

Prakhar Londhe
  • 1,431
  • 1
  • 12
  • 26
0

Keep in mind that you are joining rows. WHERE clauses and ON clauses always refer to a certain row per table at a time. This pseudo code shows what is actually happening in the join:

foreach a = row of table1
  if a.country = 'Turkey' then
    foreach b = row of table1
      if b.year = a.year then
        keep the joined row
      endif
    endloop
  endif
endloop

(The DBMS may use another approach of matching the records. but you can imagine it like above.)

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73