
I have been looking through some long-running queries in our database with the aim of reducing the time taken. One thing I am looking at is the number of logical reads for each table, for each query, but I am not sure why the logical reads are so high for certain tables.

Rather than posting the long queries, I can show a very basic example that recreates the "problem". I have created this query:

SELECT p.Id
FROM 
    Products p
    INNER JOIN StaffMembers sm ON p.CreatedById = sm.Id AND sm.IsDeleted = 0
WHERE p.DateCreated > '01-MAY-2021'

The logical reads for the StaffMembers table are 6, and the number of rows returned is 219.

Execution Plan

If I change the query slightly, to bring in another table:

SELECT p.Id
FROM 
    Products p
    INNER JOIN StaffMembers sm ON p.CreatedById = sm.Id AND sm.IsDeleted = 0
    INNER JOIN ProductCategories pc ON p.Id = pc.ProductId AND pc.IsDeleted = 0
WHERE p.DateCreated > '01-MAY-2021'

The logical reads for StaffMembers are now 13, while the number of rows returned is still 219.

Execution Plan

The number of records returned by the query is exactly the same, and from looking at the execution plan the index scan used for StaffMembers is exactly the same. So why are the logical reads higher for the second query?

keithm
    Could you [include the actual Execution Plan](https://stackoverflow.com/a/7359705/1260204), you could use [Paste the Plan](https://www.brentozar.com/pastetheplan/) and share the link in your question. – Igor May 18 '21 at 20:37
  • 1
    Upload the actual execution plan to [Paste The Plan](https://www.brentozar.com/pastetheplan/) and add the link to your question. – Dan Guzman May 18 '21 at 20:38
  • As requested add the actual execution plan for both queries. It would also be useful to have the table schema and index create scripts for each table. – Stu May 18 '21 at 21:11
  • Thanks for your replies. I have added the execution plans now – keithm May 18 '21 at 21:29
  • 1
    It appears the additional logical reads are because the plan went parallel, the reads are apportioned over 4 threads. What is the cost threshold setting of the server, it appears it's likely the default of 5; try setting this higher, also what do you see if you add `with(maxdop 1)` – Stu May 18 '21 at 21:38
  • @Stu - Thanks! Yes, it is set to 5. If I use the `maxdop 1` hint the logical reads goes back to 6. I guess this is what you'd expect? So if a plan goes parallel then the logical reads will increase-that makes sense. That's not necessarily a bad thing, then? As you may have guessed, I am not a DBA :-) – keithm May 18 '21 at 21:53

1 Answer


Adding some more detail here which is too long for a comment.

In the execution plan for the second query, select the SELECT operator. View its properties (or look at the mouse-over tooltip) and note the estimated subtree cost of 8.1. This is SQL Server's estimate of how much work is involved; because it is higher than the default cost threshold of 5, SQL Server will usually consider a parallel plan. Not always (some elements of a query can force a serial plan), but generally.

subtree cost

A merge join was chosen between the two tables, and since a merge join requires sorted inputs, the work was apportioned across multiple threads.
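You can confirm the effect yourself by forcing a serial plan with a query hint and comparing the logical reads. This is a sketch reusing the tables from the question; `OPTION (MAXDOP 1)` and `SET STATISTICS IO ON` are standard T-SQL:

```sql
-- Report logical reads per table in the Messages tab.
SET STATISTICS IO ON;

SELECT p.Id
FROM
    Products p
    INNER JOIN StaffMembers sm ON p.CreatedById = sm.Id AND sm.IsDeleted = 0
    INNER JOIN ProductCategories pc ON p.Id = pc.ProductId AND pc.IsDeleted = 0
WHERE p.DateCreated > '01-MAY-2021'
OPTION (MAXDOP 1);  -- limit this query to a single thread
```

With the hint in place, the logical reads for StaffMembers should drop back to the serial figure (6 in the question's example).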

If you select the Index Scan operator on StaffMembers and look in its properties, under Actual I/O Statistics you can see your observed logical reads and how they were distributed across all the threads.

logical reads

Given the query completes in under a second, there is likely no need for it to go parallel; it only does so because the cost threshold for parallelism is set far too low (it's one of SQL Server's ageing defaults that should really be revised). A recommended value is somewhere between 40 and 60.
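If you want to inspect and raise the setting server-wide, it is exposed through `sp_configure` (this affects the whole instance and requires ALTER SETTINGS permission; 50 is just one value in the recommended range, so test against your own workload):

```sql
-- 'cost threshold for parallelism' is an advanced option, so expose it first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Show the current value (the default is 5).
EXEC sp_configure 'cost threshold for parallelism';

-- Raise it so cheap queries stay serial.
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```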

Of course, the query is doing a table scan of Products and ProductCategories; some additional supporting indexes would be beneficial, and would also improve the estimated cost through better statistics.
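As a sketch only (the index names are made up, and the right key/INCLUDE columns depend on your real schema and wider workload), supporting indexes along these lines would let SQL Server seek instead of scan:

```sql
-- Hypothetical index covering the date filter and the join/output columns.
CREATE INDEX IX_Products_DateCreated
    ON Products (DateCreated)
    INCLUDE (Id, CreatedById);

-- Hypothetical filtered index matching the IsDeleted = 0 predicate.
CREATE INDEX IX_ProductCategories_ProductId
    ON ProductCategories (ProductId)
    WHERE IsDeleted = 0;
```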

Stu
  • Thanks for your detailed answer. I didn't even know you could see the Actual Logical Reads split out by thread like that! I updated the SQL Server setting to 50, re-ran the query and the logical reads stayed at 6. This query in the question was purely for demonstration purposes so I'm not worried about the table scan of `Products` and `ProductCategories`. Interestingly, when I re-run the query that I am trying to optimise, the logical reads did not change when running on 1 thread. I'll keep looking into it though-thanks for answering this question. – keithm May 18 '21 at 22:34