0

Related to Join vs. sub-query but a different type of situation, and just trying to understand how this works.

I had to make a view where I get a bunch of employee codes from one table, and I have to get their names from a different table - the same two tables every time. I arranged my query like this:

SELECT
    (SELECT name from emptable where empcod = code1) as emp1, code1,
    (SELECT name from emptable where empcod = code2) as emp2, code2,
    [repeat 6 times]
FROM codetable

It is more complicated than this, and more tables are joined, but this is the element I want to ask about. My boss says joining like so is better:

SELECT e1.name, c.code1, e2.name, c.code2, e3.name, code3 [etc]
FROM codetable c
INNER JOIN emptable e1 ON e1.empcod = c.code1
INNER JOIN emptable e2 ON e2.empcod = c.code2
INNER JOIN emptable e3 ON e3.empcod = c.code3

My reasoning, aside from not having to go search in the joins which table gets whose name and why, is the way I understand the join goes like this:

  • Take whole table A
  • Take whole table B
  • Combine all the data from both tables according to the 'ON' section of the join
  • select one single string from this complete combination of two whole tables from which I need no other data

I think it's obvious that this seems like it would take up a lot of resources. I understand the subquery as

  • Get one datum from table A (the employee code)
  • Match this one datum to every record from table B until you find a match
  • As soon as you get a match, bring back this one single datum from this other table (the employee's name)

Understanding that in the table of employees, the employee code is a primary key and cannot be duplicated, so every subquery can only ever give me one single string back.

It seems to me that comparing ONE number from one table to ONE number from another table and retrieving ONE string related to that number would be less resource-intensive than matching ALL of the data in two whole tables together in order to get this one string. But I figure I don't know what these databases are doing behind the scenes, and a lot of people seem to prefer joins. Can anyone explain this to me, if I'm understanding it wrong? The other posts I find here of similar situations tend to want more information from more tables, I'm not immediately finding anything about matching the same two tables six or seven times to retrieve one single string for every configuration.

Thanks in advance.

Trisha
  • 3
  • 6
  • The answer is simple .. all the joins work one time only ... each select in you select clause work for each row .. so you have one query for join version instead of 6 x rows for the select in select clause .. – ScaisEdge Nov 10 '21 at 11:59
  • This is not the answer to my question, nor is it presented as such (I can't mark a comment as the correct answer). "You have one query for [the] join version instead of 6x rows for the select in select clause" doesn't mean anything to me - I know I have either one or six queries, I don't know which would be better and why. A join links all data from both tables together before grabbing the one string I want, yes? That's what I'm finding online, at least. So if you're saying the six subselects are more resource-intensive than the six joins, why? It makes no sense to me. – Trisha Nov 15 '21 at 09:00
  • If you have proper index on the joined column .. the single query for join is execute one time only and work on indexed set of data (fast) .. the select subquery for the main select don't work 6 time (as the numers of column) but work 6 time for each rows .. if you have 100 rows you execute 6x100 select ... hope is more clear .. – ScaisEdge Nov 15 '21 at 09:12
  • Okay, I get it. If you have a lot of rows, then making the join is only going to require time once for all of the rows combined, whereas each of the subselects will be executed once for every row. I guess for my situation where there's only ever one row involved, the difference is negligible anyway. Thanks. – Trisha Nov 15 '21 at 09:44

1 Answers1

0

So as ScaisEdge explained, a join only gets executed once - and thus only spends time and resources once - no matter how many rows you have, whereas each of the six subselects get executed once for every row. If you have 100 rows, you're executing six joins once or you're executing 6 subselects 100 times. It makes sense that this would be more resource-intensive, and I did not explain clearly enough that my case involves only one row at a time - in which case I guess the difference would be negligible anyway.

Trisha
  • 3
  • 6