0

Background

I have two tables - one is updated each day with new data (failed deadline) and one is continually added and deleted from dependent on the other (live table).

The way it works is that if the persons allocated work appears in the 'failed deadline' table, then add to the 'live' table. If the persons allocated work is then not on the 'failed deadline' table the next day, remove from the 'live' table.

The two biggest issues here are that whilst the ID number of the work is an easy check, it also has to check to see if the work is still allocated to the same employee, so that if it changes hands it doesn't count against them.

The problem I'm facing is that the delete portion of the code seems to run over 3,000 rows and return about 300 just fine, but this SQL statement is taking over 4 minutes and 30 seconds to execute. What is the most efficient way of writing this statement? My gut instinct is telling me that the two statements are different and combining them is the issue.

DELETE FROM [LiveTable]
WHERE [WorkID] NOT IN (SELECT [WorkID]
                       FROM [FailedDeadline])
   OR [LiveTable].[EmployeeID] <> (SELECT [EmployeeID]
                                   FROM [FailedDeadline] as fd
                                   WHERE [LiveTable].[WorkID] = fd.[WorkID]);
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
RazorKillBen
  • 561
  • 2
  • 20
  • Can you [Paste the Plan](https://www.brentozar.com/pastetheplan/)? A `NOT EXISTS` and a `NOT IN` aren't going to perform too differently, however, a `NOT IN` does have problems with `NULL`s, so you probably want the latter. For the second subquery, perhaps a `LEFT JOIN` might be more performant, but it's unlikely to make a big dent. – Thom A Mar 08 '19 at 10:39
  • You could replace the `WHERE NOT IN` with `WHERE NOT EXISTS`, but SQL Server probably would already have optimized it under the hood. Beyond that, have a look at this [canonical DBA question](https://dba.stackexchange.com/questions/102571/improve-delete-speed-for-sql-server) which discusses ways to improve the speed of a delete. – Tim Biegeleisen Mar 08 '19 at 10:39
  • My gut instinct is telling me that those NOT and this OR combined with that <>, is the issue. Your query falls into a full table scan because those operators combined are most of the time evil resulting in indexes not being used. Show the explain plan – Thomas G Mar 08 '19 at 10:41
  • It might be a tad difficult @Larnu just because the amount of cleanup required to not post business-specific information and schema info might be a bit too much. Is there any specific point I could share? The graphical plan shows me that Filter and Table Scan take up 43% and 32% respectively. – RazorKillBen Mar 08 '19 at 10:47
  • @Larnu - A `NOT IN` certainly can perform much worse than `NOT EXISTS` - especially if the columns being compared are `NULL`-able on both sides. Because SQL Server then has to add a load of extra operators to the plan to deal with this that can make things significantly more expensive. See for example https://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/ (logical reads on SmallerTable 500011 and elapsed time = 825 ms vs logical reads 9 and elapsed time = 228 ms) and another look at it here https://stackoverflow.com/a/11074428/73226 – Martin Smith Mar 08 '19 at 13:05

2 Answers2

1

Well I am not sure by how much this will lessen the execution time but rewriting the above query as follows might help.

DELETE LT
FROM [LiveTable]  LT
LEFT JOIN [FailedDeadLine] FD1 ON LT.WorkID=FD1.WorkID
LEFT JOIN [FailedDeadline] FD2 ON LT.WorkID=FD2.WorkID 
WHERE (FD1.WorkID IS NULL OR LT.EmployeeID<>FD2.EmployeeID)
Sanjay
  • 515
  • 3
  • 8
  • Thanks for this - was the `FDL` alias in the `WHERE` clause supposed to be `FD1`? This does seem to work for the Employee ID part of the statement (and far quicker!), but it doesn't seem to remove the WorkID's that are just missing from the other table. – RazorKillBen Mar 08 '19 at 11:01
  • Just to add to this - removing the `INNER JOIN` line and just executing `WHERE FD1.WorkID IS NULL` _does_ delete the ones not in the table. Changing this to a `LEFT JOIN` _does_ delete everything in about 3 seconds! Thank you so much @Sanjay! If you can edit the answer I'll accept it! – RazorKillBen Mar 08 '19 at 11:07
  • Is the query posted in Question working properly? If yes then my answer is just refactored version of your query. Could you elaborate more about the removing WorkID's condition? – Sanjay Mar 08 '19 at 11:13
  • Thank you - so only the `FD1.WorkID IS NULL` portion executes when using `LEFT JOIN` followed by `INNER JOIN`. I have changed it to `LEFT JOIN` followed by another `LEFT JOIN` and it returns both `WHERE` clauses. – RazorKillBen Mar 08 '19 at 11:17
1

I would phrase this as:

DELETE lt FROM [LiveTable] lt
WHERE NOT EXISTS (SELECT 1
                  FROM FailedDeadline fdl
                  WHERE fdl.WorkID = lt.WorkID AND
                        lt.EmployeeID = fdl.EmployeeID
                 );

This one condition captures that the record does not exist in the failed table with the same employee id.

Then, you want an index on FailedDeadline(WorkId, EmployeeID).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786