0

I am trying to query this table:

https://www.w3resource.com/mysql-exercises/subquery-exercises/find-the-names-of-the-employees-who-have-a-manager-who-works-for-a-department-based-in-united-states.php

the question is:

Write a query to find the name (first_name, last_name) of the employees who have a manager and worked in a USA based department.

Apart from the solution that is given, I tried querying using Join and another by using Temporary Table. While these two queries return the same result but the Sub Query displays a few more rows. Why is that?

using join:

select
    employees.MANAGER_ID,
    employees.FIRST_NAME,
    employees.LAST_NAME,
    locations.COUNTRY_ID
from employees
inner join departments on
    employees.DEPARTMENT_ID = departments.DEPARTMENT_ID
inner join locations on
    departments.LOCATION_ID = locations.LOCATION_ID
where locations.COUNTRY_ID = 'US';

using Temporary Table:

create temporary table deptwithloc
select
    departments.DEPARTMENT_ID,
    departments.LOCATION_ID,
    locations.COUNTRY_ID
from departments
left join locations
    on departments.LOCATION_ID = locations.LOCATION_ID
where locations.COUNTRY_ID = 'US';

Subquery:

select
    MANAGER_ID,
    FIRST_NAME,
    LAST_NAME
from employees
where MANAGER_ID 
in (select 
    EMPLOYEE_ID
from employees
where DEPARTMENT_ID

in (select 
    DEPARTMENT_ID 
from departments 
where LOCATION_ID 
in (select LOCATION_ID from locations where COUNTRY_ID='US' )));

Kindly help me to understand. New to database systems.

  • Why are you using left join? You are using it in a way where it could be replaced by inner join. LEFT JOIN returns INNER JOIN rows UNION ALL unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of an OUTER JOIN. After a LEFT JOIN a WHERE, INNER JOIN or HAVING that requires a right [sic] table column to be not NULL removes any rows with introduced NULLs, ie leaves only INNER JOIN rows, ie "turns OUTER JOIN into INNER JOIN". PS Always chop to the smallest code that is OK code extended to code that doesn't return what you expect & say what you expected & why. – philipxy May 18 '21 at 19:57
  • Thanks for the reply @philipxy. Can you please explain the last 4 lines of your statement. I am finding it hard to follow. Sorry am a newbie !!! :) – nonsensical_coder May 19 '21 at 03:40
  • I don't know your last 4 lines because it depends on screen width. The postscript is re asking code questions. [mre] Re "turns OUTER JOIN into INNER JOIN": Trace minimal representative data through your query expression & see how my comment applies. If you don't understand then consider posting a question. But first research the problem. And it is a faq. Just googling the quoted phrase will probably turn up many many Q&As. [Left Outer Join doesn't return all rows from my left table?](https://stackoverflow.com/q/4707673/3404097) PS This is separate from your post question. – philipxy May 19 '21 at 19:32
  • https://stackoverflow.com/a/55111083/3404097 – philipxy May 19 '21 at 20:29

1 Answers1

0

Your first query is just looking at ALL employees with offices in the US, not just the managers. Your second query is applying the condition where the MANAGER's department / location is tied to the U.S. Now, the wording of the question COULD be ambiguous on which "who" worked in the U.S... It could be a PERSON works in the U.S., but the manager is based in Canada, but per their sql answer, it is the MANAGER who works in the US.

Now, that being said, I like to do indentation on my queries to see all the pieces of what I'm looking for and how I get from point A - Z along the way. By indenting you see the relationships between the tables. So lets look at this a bit in reverse. You care about managers based out of the US. So, employee -> Department -> Location -> 'US'

Start by thinking of just the MANAGER, you dont need ALL employees (yet)

select distinct Manager_ID from Employees where Manager_ID > 0

Now, take it to the department associated with "US" location

select distinct
      e.Manager_ID
   from
      employees e
         JOIN departments d
            on e.department_id = d.department_id
            JOIN locations l
               on d.location_id = l.locationID
               AND l.country_id = 'US'
   where
      e.Manager_id > 0

Now, we have a finite list of distinct managers regardless of how many employees they have assigned to them. So now you can take THIS query as the primary and get any employees associated with them -- regardless of where the employee actually works -- such as remote employee working in different country than the manager.

select
      e2.*
   from
      ( select distinct
             e.Manager_ID
          from
             employees e
                JOIN departments d
                   on e.department_id = d.department_id
                   JOIN locations l
                      on d.location_id = l.locationID
                      AND l.country_id = 'US'
          where
             e.Manager_id > 0 ) JustMgrs
         JOIN employees e2
            on JustMgrs.Manager_id = e2.manager_id

So, the second query uses the first query as its basis of records. By wrapping that in the from with ( query ) alias, You can test each individual portion of a query and once it works, that can now be applied without a full write of a temporary table. I can just use the "alias" the rest of the way. So the "JustMgr" gives me only the managers ONCE regardless of how many employees. THEN it re-joins back to the employee table FOR THAT Manager who was in the qualified list in the 'US'.

Does that help clarify why? Does this option also help you see how one could partially dissect a question, get one part first, then go to the next.

FEEDBACK / CLARIFICATION

First, I corrected the query trying to add in the last/first name of the manager, that part was incorrect and probably skewed results but now is ok since updated.

To try and clarify on your comment to explain the second part. Think of it this way, do you understand completely the basis of the first query of getting only MANAGERS based in a U.S. location. By doing DISTINCT, you are getting only one instance of a manager no matter how many employees under them. Its like going into a room of all employees of a company and asking all managers to raise their hands and you get a few people. Now you ask them if the work in a US office to come up on stage and even less of those managers. So now you have your finite list.

Now you tell all employees working for one of those managers to come along too. The second query is using the entire first result set of unique managers and pulling only employees associated with them.

When doing queries, you can write one query and make sure everything works and put that into a "from" clause. By wrapping it within parenthesis and putting a result table/alias name, you can now use THAT as if it were an actual table. You can see this as I wrapped ( the entire manager query ) JustMgrs. So now that is an alias I can use for the join back to employees to only grab those employees under those managers. Does that help?

DRapp
  • 47,638
  • 12
  • 72
  • 142
  • Thank you @DRapp for such an elaborate reply. However, please understand that it has been only a month that I picked up SQL. Can you please explain the 2nd code snippet once again? I didn't understand the origin of table e2 and JustMgrs. – nonsensical_coder May 19 '21 at 03:46
  • @nonsensical_coder, revised answer and provided clarification and query correction too. – DRapp May 19 '21 at 11:35