5

I have a sql query that I tried executing (below) that took 10 seconds to run, and since it was on a production environment I stopped it just to be sure there is no sql locking going on

SELECT TOP 1000000 *
  FROM Table T
  Where CONVERT(nvarchar(max), T.Data) like '%SearchPhrase%' --T.Data is initially XML

Now if I add an order by on creation time (which I do not believe is an index), it takes 2 seconds and is done.

SELECT TOP 1000000 *
  FROM Table T
  Where CONVERT(nvarchar(max), T.Data) like '%SearchPhrase%' --T.Data is initially XML
  order by T.CreatedOn asc

Now the kicker is that only about 3000 rows are returned, which tells me that even with the TOP 1000000 it isn't stopping short on which rows it's still going through all the rows.

I have a basic understanding of how SQL server works and how the query parsing works, but I'm just confused as to why the order by makes it so much faster in this situation.

The server being run is SQL server 2008 R2

Dan Drews
  • 1,966
  • 17
  • 38
  • 1
    Look at the `EXPLAIN PLAN` from each of the queries and see what the difference is in how they do things. – Ken White Jun 04 '13 at 13:40
  • 1
    Do the two results always return in 10 seconds and 2 seconds? – Kane Jun 04 '13 at 13:40
  • @Kane it's right around there. I actually never let the first one finish because I didn't want the server to lock out resources **EDIT** I let the first one run fully and it took 20 seconds (took 20 twice in a row) – Dan Drews Jun 04 '13 at 13:42
  • @KenWhite The only difference in the Execution Plan is `Parallelism (Gather Streams)` which costs 3% – Dan Drews Jun 04 '13 at 13:43
  • Just a thought, if this is an XMLColumn, why not create an XML Index and query it as XML? http://social.msdn.microsoft.com/Forums/en-US/sqlsearch/thread/b87e098b-025c-40a1-96ad-85a14ebe6303/ – BaconSah Jun 04 '13 at 13:53
  • 1
    @DanDrews . . . The problem is the parallel execution plan. This blog posting may explain what is happening (http://sqlblog.com/blogs/paul_white/archive/2012/05/03/parallel-row-goals-gone-rogue.aspx). – Gordon Linoff Jun 04 '13 at 13:57
  • 1
    Can you post both actual execution plans? (XML version not a picture) – Martin Smith Jun 04 '13 at 14:00
  • In management studio run both together with the "Include Actual Execution Plan" option turned on. Right click on the plan and select the "Show XML" option and copy and paste the whole XML up to a site like pastebin (chances are it will be too verbose to add to the question itself) – Martin Smith Jun 04 '13 at 14:03
  • Although its possible to speculate on why those two queries might perform differently, its much better to get a [query execution plan](http://stackoverflow.com/questions/7359702/how-do-i-obtain-a-query-execution-plan) which will tell you why. – Justin Jun 04 '13 at 14:06
  • With Order By: http://pastebin.com/a1em1dGk Without Order By: http://pastebin.com/t2HqDwKb – Dan Drews Jun 04 '13 at 14:12
  • 2
    The faster one (with order by) is parallel. The one without (slower) is serial. Perhaps unsurprising that the parallel one completes in less elapsed time as it is using multiple processors. It would be better to provide the actual execution plans not estimated though. Also you can use `SET STATISTICS TIME ON` to see total CPU time used across all CPUs. – Martin Smith Jun 04 '13 at 14:22
  • @MartinSmith Okay, that's a good explanation, if you change this to an answer, I'll mark it as correct – Dan Drews Jun 04 '13 at 14:41
  • @DanDrews - Could you run the fast one first and look at the XML in the **actual** execution plan (doesn't show up in the estimated) and see what the `DegreeOfParallelism` is? – Martin Smith Jun 04 '13 at 14:43
  • DegreeOfParallelism is 24 – Dan Drews Jun 04 '13 at 14:48

1 Answers1

5

The additional sort operation is apparently enough in this case for SQL Server to use a parallel plan.

The slower one (without ORDER BY) is a serial plan whereas the faster one has a DegreeOfParallelism of 24 meaning that the work is being done by 24 threads rather than just a single one.

This explains the much reduced elapsed time despite the additional work required for the sort.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845