3

When I try to run the following query (or even try to get the estimated execution plan), it is very slow, 10+ minutes (I stopped waiting):

SELECT TOP 1 1
    FROM File WITH (NOLOCK) 
WHERE FileID = '5021ECD3-2012-62E0-1F12-000041CCB35A' 
    AND FileSize > 0
    AND Content IS NULL

However, the following is instant:

DECLARE @test varbinary(1)

SELECT TOP 1 @test = Content
    FROM File WITH (NOLOCK) 
WHERE FileID = '5021ECD3-2012-62E0-1F12-000041CCB35A' 
    AND FileSize > 0

IF @test IS NULL
    PRINT 'a'

There is a unique index on FileID. Content is a varbinary(max). Why is the optimizer not using it first? That would be pretty logicial, no? It seems it is checking all rows in the table File if the content is null or not, and then applying the WHERE FileID = '....'

Paul
  • 1,041
  • 11
  • 26
Kevin Doyon
  • 3,464
  • 2
  • 33
  • 38
  • Is the TOP 1 1 for a reason, say you could be using EXISTS? and Is Content actually NULLable? And can you add query plans. Also, are FileID and FileSize indexed? – gbn Jun 05 '13 at 14:18
  • 4
    What version of SQL? For later versions you could try adding a nonclustered index with a where null statement, so as to only index the rows with this column null'd. – Paul Jun 05 '13 at 14:30
  • @gbn Content is nullable. Yes, I was using this query in a `IF EXISTS` and it was incredibly slow. I'll try to get a query plan.. I never actually waited until the query ended @Paul this is on SQL Server 2008. Yes, adding an index would be my last resort :/ it just takes forever and locks the table :( – Kevin Doyon Jun 05 '13 at 14:37
  • @gbn I really don't understand what SQL Server is doing. I tried changing to FileID to something non-existant to see what it was doing. It was instant. I switched back to the existing FileID, and it became instant. Yet I have another tab doing the very same query that is still executing after 13 minutes and trying to generate the query plan.. doesn't seem to be blocked by anything according to sp_who either. Anyway, here's the instataneous query plan: http://i.imgur.com/hIlS8FI.png – Kevin Doyon Jun 05 '13 at 14:50
  • See @Paul comment. One index on the 3 columns will fix it – gbn Jun 05 '13 at 14:58
  • alright... seems like a workaround to me though. I guess I'll never trust SQL Server again! – Kevin Doyon Jun 05 '13 at 15:01
  • Bear in mind varbinary(max) is stored off table. Thus once you include it in your where statement it has to interrogate all that addittional data. Your 2nd query runs fast as it doesn't do any of that, it stays on your nice neat max-8KB-per-row table. – Paul Jun 05 '13 at 15:01
  • It should be the other way around, no? The 2nd query would have to go fetch the data as I'm puting the first byte of the content in a variable (and yet it's faster). The first one could maybe use some optimization (like using the null bitmap or whatever). I actually got the result of the slow execution. It took 25 minutes, yet it's the same execution plan as the one I linked... It did a key lookup for the content for a single row, I still don't understand how that took 25 minutes (when another query did the exact same thing instantly at the 13 minutes mark). guess I'll keep the 2nd query hack – Kevin Doyon Jun 05 '13 at 15:14
  • The 2nd query goes off table once, for the top value of the table. The 1st query goes off table for each row until it finds one that is null. – Paul Jun 05 '13 at 15:47
  • The execution plan said otherwise - there was only one key lookup in that 25 minute query, using the clustered index and a predicate on FileSize>0. Now it's instant. That table is 200+ GB, so I doubt it's really looking at each content. Maybe when it took 25 minutes, but even then, the execution plan said otherwise.. o_O – Kevin Doyon Jun 05 '13 at 15:55
  • It was the actual plan – Kevin Doyon Jun 05 '13 at 16:20
  • Plans change. Your queries are not directly comparable for the reasons I stated – Paul Jun 19 '13 at 11:29
  • Can this be related to http://stackoverflow.com/questions/3775443/strategies-for-checking-isnull-on-varbinary-fields? – David Airapetyan Oct 17 '13 at 01:55

1 Answers1

2

Well I suspect if you do a query plan on your first query, then the plan will be doing a table scan looking for content = NULL, probably without an index? On the other hand the second query is doing a lookup on FileID only, which once its found will return. Of course if content for (the first) FileId = '5021ECD3-2012-62E0-1F12-000041CCB35A' is not null, then the second query will continue looking till the end of the table. Is there a unique constraint on FileID? If not then you have told the query optimiser ther may be another row with this fileID somewhere in the table.... so you cant blame it for looking. As a famous explorer said when leaving a tent.... I might be some time....

If your not sure where query plans are then in SSMS, Query, show query plan

HTH

Ian P
  • 1,724
  • 1
  • 10
  • 12
  • Both have the same execution plan (except one has an additional filter for the `CONTENT IS NULL`) http://i.imgur.com/nJ7gjXa.png. The difference is that sometimes, the first query takes 25 minutes to complete (and I checked, it's the same execution plan as the one in the image). The second one is always quick. – Kevin Doyon Jun 05 '13 at 15:22
  • Your query plan shows both queries have equal weight 50% so are taking equal time. Can you get a query plan which reproduces the inequality of execution time? This is starting to look like the tests are not quite doing the original queries. I'd fire up profiler and double check what is actually being sent, even to the point of using a file comparison tool to check the two statements against each other. Also is the database in use by other users? – Ian P Jun 06 '13 at 07:25
  • Maybe SQL is choosing an inappropriate query plan every so ofter, so two thoughts: are your stats uptodate? Also try the "with recompile" option in the badly performing option. SQL uses a technique called parameter sniffing, "with recompile" stops this. Prameter sniffing is usually helpful, but if you have very old stats or bizarre data distribution it can cause an otherwise well performing query to go down a blind alley performance wise. – Ian P Jun 06 '13 at 08:33
  • Sorry missed the bit where you said you'd checked the plan earlier. Think the parameter sniffing more likely now. – Ian P Jun 06 '13 at 08:36
  • I'm not sure about paremeter sniffing? The query plans are exactly the same wether the query is instant or takes 25 minutes :( – Kevin Doyon Jun 07 '13 at 19:31
  • Seems to me that you are saying (now) the query runs fine most of the time but ocassionally runs slow. I would certainly try with remcompile option. Also dont seem to have a reposonse on how uptodate your stats are? – Ian P Jun 08 '13 at 11:06