1

I have two tables

Table A (Primary Key is ID) id \ firstname \ lastname \ zip \ state

Table B some_field\ business name \ zip \ id

I need to get the first name and last name associated with the id using the id from Table B (note this is the same id as in Table A)

I did a JOIN on Table A and Table B so that I could get the first name and last name

A friend of mine said I should not use JOIN this way and that I should of just done two separate queries. Does that make any sense?

Does JOIN do anything that makes the process slower than two seperate queries? How could two seperate queries ever be faster than one query?

Shammy
  • 113
  • 4
  • see this thread:http://stackoverflow.com/questions/1067016/join-queries-vs-multiple-queries – Suchit kumar Jan 12 '17 at 05:19
  • 1
    FirstName & Last Name are part of TableA only. Then why to join TableB.. – Shakeer Mirza Jan 12 '17 at 05:19
  • A join query could be more taxing on the database but cost less time overall because of network latency etc. – shmosel Jan 12 '17 at 05:22
  • 1
    A join query using properly indexed fields is certainly a lot LESS taxing than executing two separate queries. The primary key is automatically indexed, so the result of a join is a single lookup. Joining the two tables is the right thing to do. But yes there is something fishy in the language of the question, if you do not need a field in table B, don't join it for no reason. – Drunken Code Monkey Jan 12 '17 at 05:59

2 Answers2

3

Q: Does this make sense?

A: No, without some valid reasons, it doesn't make sense.

Q: Does JOIN do anything that makes the process slower than two separate queries?

A: Yes, there are some things that can make a join slower, so we can't rule out that possibility. We can't make a blanket statement that "two separate queries will be faster" or that a "join will be slower".

An equijoin of two tables that are properly indexed is likely to be more efficient. But performance is best gauged by actually executing the statements, at expected production volumes of data, and observing and measuring performance.

Some of the things that could potentially make a join slower... complicated join predicate (involving columns wrapped in functions, inequality comparisons, compound predicates combined with OR, multiple tables involved where the optimizer has more join paths and operations to consider to come up with an execution plan. Or, a join that produces a hugh jass intermediate result that is later collapsed with a GROUP BY. (In short, it is possible to write a horrendously inefficient statement that uses a join operation. But it is usually not the join operation that is the culprit. This list of things is just a sampling, it's not an exhaustive list.)


A JOIN is the normative pattern for the use case you describe. It's not clear why your friend recommended that you avoid a JOIN operation. what reason your friend gives.

If your main query is primarily against (the unfortunately named) Table_B, and you are wanting to do a lookup of first_name and last_name from Table_A, the JOIN is suited to that.

If you are only returning a one row (or a few rows) from Table_B, then an extra roundtrip for another query to get first_name and last_name won't be a problem. But if you are returning thousands for rows from Table_B, then executing thousands of separate, singleton queries against Table_A is going to kill performance and scalability.

If your friend is concerned that a value in the foreign key column in Table_B won't match a value in the id column of Table_A, or there is a NULL value in the foreign key column, your friend would be right to point out that an inner join would prevent the row from Table_B from being returned.

In that case, we'd use an outer join, so we can return the row from Table_B even when a matching row from Table_A is not found.

Your friend might also be concerned about performance of the JOIN operation, possibly because your friend has been burned by not having suitable indexes defined.

Assuming that a suitable index exists on Table_A (with a leading column id). and that id is UNIQUE in Table_A... then performance of a single query with a simple equijoin between a single column foreign key and single column primary key will likely be more efficient than running a bloatload of separate statements.

Or, perhaps your friend is concerned with issue with an immature ORM framework, one that doesn't efficiently handle the results returned from a join query.

If the database is being implemented in way that the two tables could be on separate database servers, then using a JOIN would fly in the face of that design. And if that was the design intent, a separation of the tables, then the application should also be using a separate connection for each of the two tables.

Unless your friend can provide some specific reason for avoiding a JOIN operation, my recommendation is that you ignore his advice.

(There has to be a good reason to avoid a JOIN operation. I suspect that maybe your friend doesn't understand how relational databases work.)

spencer7593
  • 106,611
  • 15
  • 112
  • 140
0

In your case it doesn't make any big difference because you just have an id as a foreign key on it which anyways has an index. Since it's indexed, it will be efficient and having a join on that is the best thing.

It becomes more complicated based on what you want, what are the fields and what you want to accomplish etc.

So, yes, no big difference in your case.

  • 3
    There are so many unknown variables here. You can't throw out a blanket statement like that. – shmosel Jan 12 '17 at 05:23
  • 1
    If the query on `Table_B` is returning 15,000 rows, there *is* going to be a difference. Executing 15,000 separate statements (roundtrips) to the database to get a row from `Table_A` for each row returned from `Table_B` can be agonizingly slow. If we're only retrieving a couple of rows from `Table_B`, for example, with no WHERE clause on `Table_B` and cutting off the resultset with a `LIMIT 1`,,, running a separate query to get a row from `Table_A` could be *faster* than preparing a hugh jass intermediate resultset. There can be some *significant* differences. – spencer7593 Jan 12 '17 at 06:25