
I have a really odd query involving a join to a complex view. I analyzed the heck out of the view, built some indexes, and got the query working in under a second when run from MSSQL Management Studio. However, when run from Perl via ODBC, the exact same query takes around 80 seconds to return.

I've dumped almost 8 hours into this and it continues to baffle me. In that time I've logged the query from Perl and copied it verbatim into Studio; I've wrapped it in a stored procedure (which makes it take a consistent 2.5 minutes from BOTH clients!); I've googled ODBC & MSSQL query caches; I've watched the query run via the Activity Monitor (it spends most of its time in the generic SLEEP_TASK wait state) and Profiler (the select statement gets one line, which doesn't show up until it's done running); and I've started reading up on performance bottlenecks.

I haven't noticed this problem with any other queries from Perl, and unfortunately we don't have a DBA on site. I'm a programmer who's done some DBA work, but I feel like I'm groping in the dark with this one. My best guess was that there is some sort of query cache available from Studio that the ODBC client can't access, but restarting Studio does not make the query's first execution take longer, so it doesn't look like each new ODBC connection simply starts with an empty cache.

Without going into the view definitions, the base query is very simple:

SELECT * FROM VIEW1 LEFT OUTER JOIN VIEW2 WHERE SECTION = ? AND ID = ?

The delay goes away when I drop VIEW2, but I need the data from that view. I've already rewritten the view three times in attempts to simplify and improve efficiency but this feels kinda like a dead end since the query runs fine from Studio. The query only returns a single row but even dropping the ID criteria and selecting all 56k rows for an entire section only takes 40 seconds from Studio. Any other ideas?

Edit 2/8: The article @Remus Rusanu linked was pretty clear, but I'm afraid it didn't quite apply. It now seems pretty clear that it's not ODBC at all, but that when I hard-code arguments vs. parametrize them, I get different execution plans. I can reproduce this in SSMS:

SELECT * FROM VIEW1 LEFT OUTER JOIN VIEW2 WHERE SECTION = 'a' AND ID = 'b'

is 100 times faster than

DECLARE @p1 VARCHAR(8), @p2 VARCHAR(3)
SET @p1 = 'a'
SET @p2 = 'b'
SELECT * FROM VIEW1 LEFT OUTER JOIN VIEW2 WHERE SECTION = @p1 AND ID = @p2

Unfortunately, I'm still at a loss to explain why the first should get an execution plan that runs two orders of magnitude faster than the parametrized version, for any values of SECTION & ID I can throw at it. It may be a deficiency in SQL Server, but it seems absurd that the inputs are known in both cases yet one takes so much longer. Even if SQL Server recompiled the parametrized plan from scratch on every execution, as it must be doing for the different constant values I supply, it would still be 100 times faster. None of the RECOMPILE options suggested by the article seem to help either.
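For concreteness, this is the shape of one recompile attempt from the article (a sketch against the same placeholder views; it did not change the plan for me):

```sql
-- Sketch: statement-level recompile hint appended to the
-- parametrized query, so the plan is rebuilt per execution.
DECLARE @p1 VARCHAR(8), @p2 VARCHAR(3);
SET @p1 = 'a';
SET @p2 = 'b';
SELECT * FROM VIEW1 LEFT OUTER JOIN VIEW2
WHERE SECTION = @p1 AND ID = @p2
OPTION (RECOMPILE);
```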

I think @GSerg called it below. I've yet to prove that it doesn't happen with the window function used externally to the view, but he describes the same thing and the timing discrepancy remains baffling.

If I don't run out of time to work on this, I'll try to adapt some of the article's advice and force the constant execution plan on the parametrized version, but it seems like an awful lot of what should be unnecessary trouble.

Derek P
  • Have you, by any chance, got a window function in that view that depends on `section` or `id`? Such as `rank() over (partition by section)`. – GSerg Feb 08 '12 at 18:16
  • The delay goes away when you drop the view because you can no longer query the view - it should come back immediately with an invalid object error. Just sayin' – Aaron Bertrand Feb 08 '12 at 18:21
  • GSerg, yes. I use a window function over both: ROW_NUMBER() OVER (PARTITION BY SECTION, ID) – Derek P Feb 08 '12 at 18:24
  • @DerekP I've had a very similar issue before and after some testing could only conclude it's a bug in SQL Server 2008. A window function that depends on a `where` condition externally attached to a view when using parameters causes all rows to be fetched, calculates all windows over all possible combinations and then filters, which is ridiculous. Doesn't happen if the `where` clause is expressed with literal values. Remove `row_number()` from the view. If it's required, apply it externally, from the calling query. – GSerg Feb 08 '12 at 18:30
  • @DerekP You're saying the query is solid 2.5 minutes from both clients when wrapped in a procedure. That is exactly it. You're executing the query from SSMS by putting literal numbers into the query string, aren't you? Such as `where section = 1 and id = 2`, as opposed to `where section = @section and id = @id`. – GSerg Feb 08 '12 at 18:43
  • Yes, you do have it. I missed what you were saying about parameters until I read the article linked by Remus Rusanu below. As my edited post now explains, it sure looks like a bug to me too. Darn. I really don't want to move that code out of the view. – Derek P Feb 08 '12 at 21:20
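GSerg's suggested workaround above — keeping `ROW_NUMBER()` out of the view and applying it from the calling query — would look roughly like this (a sketch; the ORDER BY expression is hypothetical, since the original partition spec doesn't show one):

```sql
-- Sketch: VIEW2 stripped of its internal ROW_NUMBER(); the window
-- function is applied externally, after the SECTION/ID filter has
-- already been pushed into the view.
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY SECTION, ID
                          ORDER BY (SELECT NULL)) AS RowNum  -- ordering elided in the question
FROM VIEW2
WHERE SECTION = @p1 AND ID = @p2;
```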

3 Answers


As explained by Martin Smith, it's an issue with predicate pushdown in SQL Server that hasn't been fully fixed.

As suggested in answers and comments in the linked question, you can either:

  • Wrap the query in something that appends `OPTION (RECOMPILE)`
  • Convert it to a table-valued function
  • Attach a plan guide with `OPTION (RECOMPILE)`
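A minimal sketch of the first two options, reusing the placeholder names from the question (the join condition and function name are hypothetical, since the question elides them):

```sql
-- Option 1: statement-level recompile hint, so the plan is compiled
-- for the actual parameter values on every execution.
SELECT * FROM VIEW1 LEFT OUTER JOIN VIEW2
WHERE SECTION = @p1 AND ID = @p2
OPTION (RECOMPILE);

-- Option 2: an inline table-valued function; the optimizer expands it
-- like a parametrized view, which can restore predicate pushdown.
CREATE FUNCTION dbo.GetSectionRow (@section VARCHAR(8), @id VARCHAR(3))
RETURNS TABLE
AS RETURN
    SELECT V1.*
    FROM VIEW1 AS V1
    LEFT OUTER JOIN VIEW2 AS V2
        ON V2.SECTION = V1.SECTION AND V2.ID = V1.ID  -- hypothetical join condition
    WHERE V1.SECTION = @section AND V1.ID = @id;
```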
GSerg
  • OPTION(RECOMPILE) doesn't seem to be working for me. Rewriting the view without the window function worked around the problem. – Derek P Feb 09 '12 at 13:25

Everything you ever wanted to know on the subject: Slow in the Application, Fast in SSMS? Understanding Performance Mysteries.

There is no 'cache' available in SSMS that ODBC cannot access. It's just that you're getting different execution plans in SSMS vs. ODBC, either because of parameter sniffing or because of data type precedence rules. Read the linked article; it has both the means to identify the problem and recommendations on how to fix it.

Remus Rusanu
  • The article you linked was pretty clear, but I'm afraid it didn't quite apply. It now seems pretty clear that when I hard-code arguments vs. parametrize them, I get different execution plans. Unfortunately, I'm still at a loss to explain why the version with constant args should get an execution plan that takes two orders of magnitude less time than the parametrized version, for any values of SECTION & ID I can throw at it. It may be a deficiency in SQL Server, but it could obviously recompute the plan from scratch every time and return faster. Yet, the RECOMPILE options have no effect. – Derek P Feb 08 '12 at 21:04

Compare the plans for the two queries and check the SET settings for each connection (you can do this by looking in sys.dm_exec_sessions). I bet you'll see a difference in quoted_identifier, ansi_nulls or arithabort (or possibly all three). This usually causes vast differences in the execution plan. You should be able to set these settings manually in your ODBC version in order to match the settings that are being used by Management Studio.
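A sketch of that settings check (substitute the `session_id`s of the two connections you're comparing):

```sql
-- Compare plan-affecting SET options across sessions; differing
-- arithabort/ansi_nulls/quoted_identifier values force separate plans.
SELECT session_id, program_name,
       quoted_identifier, ansi_nulls, arithabort
FROM sys.dm_exec_sessions
WHERE session_id IN (51, 52);  -- hypothetical session ids
```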

Some related questions - there could be other obscure circumstances at play that you'll want to check into:

SQL Query slow in .NET application but instantaneous in SQL Server Management Studio

SQL Server Query Slow from PHP, but FAST from SQL Mgt Studio - WHY?

Query times out from web app but runs fine from management studio

SQL Server 2005 stored procedure fast in SSMS slow from VBA

Aaron Bertrand
  • This looks like a promising new avenue for investigation. Out of curiosity, what search terms did you use to find these links? – Derek P Feb 08 '12 at 18:30
  • I googled rather than using the search here: "query fast in management studio slow in application site:stackoverflow.com" – Aaron Bertrand Feb 08 '12 at 18:37