0

I'm having trouble understanding why the output is the way it is for this self join. The code for the table I'm dealing with is:

create table point_2d (x INT, y INT);
insert into point_2d values (-1, -1);
insert into point_2d values (0,0);
insert into point_2d values (-1, -2);

I want to execute a self join as follows:

SELECT *
FROM
point_2d p1
Inner JOIN
point_2d p2
ON p1.x != p2.y;

it's the ON clause that's confusing me with the output. How exactly is this table self-joining given the condition

p1.x != p2.y

in the code above?

kkun
  • 45
  • 6
  • Possible duplicate of [What is a self join for? (in english)](https://stackoverflow.com/questions/36352271/what-is-a-self-join-for-in-english) – philipxy Nov 06 '18 at 07:09
  • 1
    Hi. This self-join is doing what every join does, it just does it when the 2 inputs are the same value. How is an answer here not going to repeat yet another description of how join works? Can you make your question more specific? Why do you say "confusing" and how are you confused compared to if the input table values were different rather than the same & not being confused? See my answer at the link. – philipxy Nov 06 '18 at 07:14
  • My understanding is that self joins will join a table laterally. But in this case, the ON clause is not based on a shared attribute between the two tables. So how is this table being joined, when there are no shared attributes? – kkun Nov 06 '18 at 07:38
  • I don't know what you mean by "laterally" here. Again you are not explaining yourself, you are just misusing some word to not bother to say what you mean. Read the definition of join on. It is just cross join & where. Every combination of rows is made from a row from each input table value, regardless of whether given via a table name or subquery, and the ones meeting the condition are kept. Self-join is no different, it's just when the 2 inputs are equal table values. (See my link & [this one](https://stackoverflow.com/a/25957600/3404097)). Please clarify via post edits not comments. – philipxy Nov 06 '18 at 09:07
  • PS And I don't know what you mean by "not based on a shared attribute between two tables". Do you mean, not consisting of a conjunction of equalities of table refereneces where the references of each equality have the same column but different aliases? If so, so what? Why is not being so based a problem? – philipxy Nov 06 '18 at 09:56

3 Answers3

0

If you can understand the condition which is mentioned in the 'ON' is what decides the way the table is joined then it is easier to understand, in this case the condition is (joining the x values of the table denoted as p1 where it is not equal to y of the same table denoted as p2), for an example if you consider the value -1 for x in p1 then there is values such as 0, -2 for y in same table denoted as p2 so your row from p1 table where the x value is -1 is getting mapped to the same table denoted as p2 where the y values are 0, -2 since that's what the condition states.

Ajan Balakumaran
  • 1,639
  • 1
  • 8
  • 16
0

in some databases the != operator is write like <>, the query will be the same

SELECT *
FROM
point_2d p1
Inner JOIN
point_2d p2
ON p1.x <> p2.y;

If you don't like use the expllicit join, you can also use this way

SELECT *
FROM
point_2d p1, point_2d p2
WHERE p1.x <> p2.y

But I prefere the first way because it more explicit and I think you can read better the query

If you have some doubts I have found for you a list of operators used in SQL https://www.w3schools.com/sql/sql_operators.asp

  • Here I have found the documentation of a self join https://www.w3schools.com/sql/sql_join_self.asp , Howewer I remember that some years ago I used the SELF JOIN instruction in an oracle db, but now I cannot find the documentation – David Marabottini Nov 06 '18 at 07:45
  • Howewer if you want you can do a multiple insert whith the same syntax: insert into point_2d values (-1, -1),(0,0),(-1, -2); – David Marabottini Nov 06 '18 at 09:59
  • 2
    The standard syntax for not equals is `<>`. `!=` was added later and not supported by all databases. – Gordon Linoff Nov 06 '18 at 11:34
0

Your table has three rows:

 x   y
-1  -1
 0   0
-1, -2

The join is a subquery of the Cartesian product. So it is a subset of:

x1  y1       x2  y2
-1  -1       -1  -1
-1  -1        0   0    <-- p1.x <> p2.y
-1  -1       -1  -2    <-- p1.x <> p2.y
 0   0       -1  -1    <-- p1.x <> p2.y
 0   0        0   0
 0   0       -1  -2    <-- p1.x <> p2.y
-1, -2       -1  -1
-1, -2        0   0    <-- p1.x <> p2.y
-1, -2       -1  -2    <-- p1.x <> p2.y

Your condition is p1.x <> p2.y. These are shown in the results.

So the results are:

x1  y1       x2  y2
-1  -1        0   0
-1  -1       -1  -2
 0   0       -1  -1
 0   0       -1  -2
-1, -2        0   0
-1, -2       -1  -2
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786