Retrieving data from 2 tables that have a 1 to many relationship - more efficient with 1 query or 2?

Question

I need to selectively retrieve data from two tables that have a 1 to many relationship. A simplified example follows.

Table A is a list of events:

Id  |  TimeStamp  |  EventTypeId
--------------------------------
1   |  10:26...   |  12
2   |  11:31...   |  13
3   |  14:56...   |  12

Table B is a list of properties for the events. Different event types have different numbers of properties. Some event types have no properties at all:

EventId | Property | Value
------------------------------
1       | 1        | dog
1       | 2        | cat
3       | 1        | mazda
3       | 2        | honda
3       | 3        | toyota

There are a number of conditions that I will apply when I retrieve the data, however they all revolve around table A. For instance, I may want only events on a certain day, or only events of a certain type.

I believe I have two options for retrieving the data:

Option 1
Perform two queries: first query table A (with a WHERE clause) and store data somewhere, then query table B (joining on table A in order to use same WHERE clause) and "fill in the blanks" in the data that I retrieved from table A.

This option requires SQL Server to perform 2 searches through table A, however the resulting 2 data sets contain no duplicate data.

Option 2
Perform a single query, joining table A to table B with a LEFT JOIN.

This option only requires one search of table A but the resulting data set will contain many duplicated values.

Conclusion
Is there a "correct" way to do this or do I need to try both ways and see which one is quicker?

score 0 · Answer 1 · edited May 23 '17 at 12:04

Ex

Select E.Id,E.Name from Employee E join Dept D on E.DeptId=D.Id

and a subquery something like this -

Select E.Id,E.Name from Employee Where DeptId in (Select Id from Dept)

When I consider performance which of the two queries would be faster and why ?

would EXPECT the first query to be quicker, mainly because you have an equivalence and an explicit JOIN. In my experience IN is a very slow operator, since SQL normally evaluates it as a series of WHERE clauses separated by "OR" (WHERE x=Y OR x=Z OR...).

As with ALL THINGS SQL though, your mileage may vary. The speed will depend a lot on indexes (do you have indexes on both ID columns? That will help a lot...) among other things.

The only REAL way to tell with 100% certainty which is faster is to turn on performance tracking (IO Statistics is especially useful) and run them both. Make sure to clear your cache between runs!

More REF

Retrieving data from 2 tables that have a 1 to many relationship - more efficient with 1 query or 2?

1 Answers1