4

I am new to SQL. I know only basic queries but not in depth. I searched for Self join necessary and i found this question

Most of the answer says it is useful for referencing the same table.

Example Query is:

Select e1.employeeName
     from employee e1, employee e2
     where e1.employeeid = e2.supervisorid;

Can't we get the same result by using

select employeeName
       from employee
       where employeeid = supervisorid;

Or is it self join but writing way is different? Please let me know if i understood anything wrongly.

Community
  • 1
  • 1
Gibbs
  • 21,904
  • 13
  • 74
  • 138
  • 1
    No, your second query doesn't make any sense unless a person is their own supervisor. In your first query it would be better if you used an inner join instead of the old style of cross join with some where predicates. – Sean Lange May 13 '15 at 16:47

6 Answers6

1
select employeeName
from employee
where employeeid = supervisorid;

would get you the name of the employee who is his own supervisor. Which won't work. The other query should actually be

Select e1.employeeName as employee, e2.employeeName as supervisor
from employee e1
left join employee e2 on e1.supervisorid = e2.employeeid;

to get both names, the employees and the one of the supervisor.

juergen d
  • 201,996
  • 37
  • 293
  • 362
  • What i understood from the query is retrieving supervisors. Conditions are they will have same value in employeeid column and supervisorid column. Isn't? – Gibbs May 13 '15 at 16:49
  • The condition is that the `supervisorid` can be found in the same table - he is also an employee. But of course that is a difference record in the table. If you do `where employeeid = supervisorid` you compare only a single record with itself. – juergen d May 13 '15 at 16:50
  • Yes Yes exactly. So second query returns supervisors for all the employees but first one works only if an employee is a superviosr, Am i clear? So to achieve this, we need joins. Without joins how to do this. Please update this too – Gibbs May 13 '15 at 16:51
  • 1
    *first one works only if an employee is a superviosr*. **No.** The first one does not work at all. It compares the employeeid with the supervisorid. If they are the exact same the record will be returned. So an employee would have to be his own supervisor. – juergen d May 13 '15 at 17:00
  • Yes, thanks. Please tell me whether the self join query works like for loop. if so, isn't slower – Gibbs May 13 '15 at 17:05
  • 1
    No, it is not like a loop. SQL works different. And it is really fast. – juergen d May 13 '15 at 18:08
1

In the example you linked, the table in question contained references to itself. For that particular example, it was a list of employees where some employees are supervisors of other employees. Since there is one table, as opposed to a table for supervisors and a table for non-supervisors, a self join is used for the non-supervisor to reference the supervisor in the row.

Your second query will only return results if an employee is their own supervisor.

Muhammad Abdul-Rahim
  • 1,980
  • 19
  • 31
1

The definition of a self join is a table joined to itself. The first query has a typo. It should be

Select e1.employeeName
     from employee e1, employee e2
     where e1.employeeid = e2.supervisorid;

Note the e2 in the where clause. Corrected, the first query will give you the name of every employee who is a supervisior.

The second query will give you the name of every employee who is their own supervisor. Likely an empty set.

To help you understand it, look at this example:

table employee

employeeId supervisorId employeeName
1          null         Big Boss
2          2            Medium Boss
3          3            Worker Bee

select e1.employeeName, e2.employeeName
     from employee e1, employee e2
     where e1.employeeid = e2.supervisorid;

would return:

 Medium Boss Big Boss
 Worker Bee  Medium Boss

You could make a similar solution that uses 2 table, one for employees and one for supervisors.

Table employee
employeeId employeeName SupervisorId
1          Big Boss     null
2          Medium Boss  1
3          Worker Bee   2

table supervisors
supervisorId supervisorName
1            Big Boss
2            Medium Boss

and write your query as:

Select e.employeeName, s.supervisorName 
   from supervisors s, employees e
   where s.supervisorId = e.supervisorId

which would also return:

 Medium Boss Big Boss
 Worker Bee  Medium Boss

But you would be duplicating the employee data for the supervisors, since most of them are also employees.

Barbara Laird
  • 12,599
  • 2
  • 44
  • 57
  • So First query will work like, for(each employeeid) check against supervisorid. So it will take time. Am i right? plus1 – Gibbs May 13 '15 at 16:56
  • I've added more explanation. But a self join isn't any more expensive (slower) than a join between 2 tables. – Barbara Laird May 13 '15 at 17:09
  • Thanks, I understood very well how does second query works. But really confusing the first one. let me search for that. Thanks – Gibbs May 14 '15 at 05:17
0

Your second query will only return results where employee is his own supervisor. While for the first one, it joins employee table with itself so as to get his supervisor id.

Another place where self join really comes in handy is say when you want to join a record with a date column and check if subsequent rows have same value.

EDIT - Answering the question in the comment

So, for example you have records like

ValueCol DateCol
25       '12/31/2014'
25       '1/1/2015'
30       '1/2/2015'

Your self join will look like:

Select t1.ValueCol, t2.ValueCol 
from t1
INNER JOIN t2 
ON t1.DateCol = DATEADD(DAY,1,t2.DateCol)

The result will be:

25   25
25   30

This can help you check for say equality for ValueCol between a date and a previous date in same table.

Let me know if that explains it.

Gibbs
  • 21,904
  • 13
  • 74
  • 138
stripathi
  • 766
  • 11
  • 23
0

Think differently!

Instead of "tables", think "sets", and your life in SQL will be much easier. So, in this case, you can view this "table" as a bunch of different sets, which you can define with predicates.

So, if you have an employee table, one set is all employees. The other set can be the supervisors of those employees. The JOIN clause helps you define the retrieval conditions for that second set, and you can also add more conditions in the WHERE clause if you need to.

This is Set 1 (all employees):

SELECT
  e.employeeId,
  e.employeeName,
  e.supervisorId
FROM
  employee e

This is Set 2 (all supervisors):

SELECT
  e.employeeId,
  e.employeeName
FROM
  employee e
WHERE
  e.employeeId IN
  (
  SELECT DISTINCT supervisorId FROM employee WHERE supervisorId IS NOT NULL
  )

Your SELECT query's output is a new set (or "table", if you will), which JOINs the supervisor set to the employee set, so as to associate the supervisor with the employee, thereby creating a new set, comprised of a newly-defined tuple that has more attributes.

You could treat these as "tables", and JOIN them accordingly:

SELECT
  emp.employeeId,
  emp.employeeName,
  emp.supervisorId,
  sup.employeeName supervisorName
FROM
  (
    SELECT
      e.employeeId,
      e.employeeName,
      e.supervisorId
    FROM
      employee e
  ) emp
  INNER JOIN
  (
    SELECT
      e.employeeId,
      e.employeeName
    FROM
      employee e
    WHERE
      e.employeeId IN
      (
      SELECT DISTINCT supervisorId FROM employee WHERE supervisorId IS NOT NULL
      )
  ) sup ON
    sup.employeeId = emp.supervisorId

You can see now how we have created 2 "tables" (sets) on the fly, and JOINed them. However, this contrived example is needlessly complicated. We can achieve the same thing this way:

SELECT
  emp.employeeId,
  emp.employeeName,
  emp.supervisorId,
  sup.employeeName supervisorName
FROM
  employee emp
  INNER JOIN employee sup ON
    sup.employeeId = emp.supervisorId

This achieves the same thing, namely defining a new set based on the JOIN between one set of employee tuples and another. Note that this new set can also be used as a "table". It's just a matter of how you think.

See this resource for more visualizations:

Visual Representation of SQL Joins

Pittsburgh DBA
  • 6,672
  • 2
  • 39
  • 68
0

When you have parent and child relationship in same table. Then you should use self join.

Like when to find managers and employees, menu and submenu, appversion and subversion. We generally store this type of data in one table. To query this we have to use self join.

Deepak Kumar
  • 648
  • 6
  • 14