
I need to measure the duration of several queries, like this:

declare @dtStart1 as datetime;
declare @dtStart2 as datetime;
declare @dtStart3 as datetime;
declare @dtStart4 as datetime;
declare @dtStart5 as datetime;
declare @dtStart6 as datetime;
declare @dtStart7 as datetime;
declare @dtStart8 as datetime;
declare @dtStart9 as datetime;
declare @dtStart10 as datetime;
declare @duration1 as int;
declare @duration2 as int;
declare @duration3 as int;
declare @duration4 as int;
declare @duration5 as int;
declare @duration6 as int;
declare @duration7 as int;
declare @duration8 as int;
declare @duration9 as int;
declare @duration10 as int;

set @dtStart1 = (select getutcdate());
--query1
set @duration1 = (select datediff(millisecond, @dtStart1, GETUTCDATE()));

set @dtStart2 = (select getutcdate());
--query2
set @duration2 = (select datediff(millisecond, @dtStart2, GETUTCDATE()));

set @dtStart3 = (select getutcdate());
--query3
set @duration3 = (select datediff(millisecond, @dtStart3, GETUTCDATE()));

set @dtStart4 = (select getutcdate());
--query4
set @duration4 = (select datediff(millisecond, @dtStart4, GETUTCDATE()));

set @dtStart5 = (select getutcdate());
--query5
set @duration5 = (select datediff(millisecond, @dtStart5, GETUTCDATE()));

set @dtStart6 = (select getutcdate());
--query6
set @duration6 = (select datediff(millisecond, @dtStart6, GETUTCDATE()));

set @dtStart7 = (select getutcdate());
--query7
set @duration7 = (select datediff(millisecond, @dtStart7, GETUTCDATE()));

set @dtStart8 = (select getutcdate());
--query8
set @duration8 = (select datediff(millisecond, @dtStart8, GETUTCDATE()));

set @dtStart9 = (select getutcdate());
--query9
set @duration9 = (select datediff(millisecond, @dtStart9, GETUTCDATE()));

set @dtStart10 = (select getutcdate());
--query10
set @duration10 = (select datediff(millisecond, @dtStart10, GETUTCDATE()));

select @duration1 / 1000.0 as q1,
       @duration2 / 1000.0 as q2,
       @duration3 / 1000.0 as q3,
       @duration4 / 1000.0 as q4,
       @duration5 / 1000.0 as q5,
       @duration6 / 1000.0 as q6,
       @duration7 / 1000.0 as q7,
       @duration8 / 1000.0 as q8,
       @duration9 / 1000.0 as q9,
       @duration10 / 1000.0 as q10;

The problem is that besides the durations I am actually interested in, I also get the result sets of the queries themselves. I have tried using a cursor for each query, but it completed instantly even for long queries; possibly the cursor was only defined, not executed. I have also tried setting FMTONLY to ON and then back to OFF, but that was instant as well for long queries, and it even showed the column names in the result. I would like to execute the queries and measure the same durations as if they were running normally, but without sending their result sets back to my application server, where it would be problematic to handle the millions of records the queries might return, not to mention the huge waste of memory compared to the ideal outcome of getting just the one row I am interested in, namely the durations.

Lajos Arpad

2 Answers


A few options come to mind.

  1. One obvious way to suppress the query result set is to insert the result of the query into a #temp table and then drop the temp table. This will affect the query run time somewhat, but it is relatively easy to implement: simply add an INTO #temp clause after the SELECT list of your query (see the sketch after this list). The calling application doesn't need to change.

  2. Change the calling application and make it expect these result sets. Measure the "time-to-first-row" and, once the application receives the first row, stop the query. This would be a rather significant task to implement.

  3. Change the query so that its results are stored in variables, not a temp table: one variable per column.
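
For option 1, here is a minimal sketch that combines the INTO #temp trick with the timing pattern from the question (YourTable and the Col1..Col3 columns are just placeholders, as in the examples below):

declare @dtStart as datetime;
declare @duration as int;

set @dtStart = getutcdate();

-- original query, with an INTO #temp clause added after the SELECT list
select Col1, Col2, Col3
into #SuppressedResult
from YourTable;

set @duration = datediff(millisecond, @dtStart, getutcdate());

drop table #SuppressedResult;

select @duration / 1000.0 as durationSeconds;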


Note: As Martin Smith pointed out in the comments, assigning column values into variables may change the shape of the plan, as shown in his answer to the question: sql execution latency when assign to a variable, so you should use option 3 with caution.


For example, if you have a query

SELECT 
    Col1
    ,Col2
    ,Col3
FROM YourTable
... some complex logic
;

Change it to the following:

DECLARE @VarCol1 bigint;
DECLARE @VarCol2 int;
DECLARE @VarCol3 datetime2(0);
-- use appropriate types that match the query columns

SELECT 
    @VarCol1 = Col1
    ,@VarCol2 = Col2
    ,@VarCol3 = Col3
FROM YourTable
... some complex logic
;

Such a query will run in full (as opposed to wrapping the query in SELECT COUNT(*)), but its results will be stored in the local variables. Each new row overwrites the variable values, but this should have less overhead than using a #temp table.
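
To tie this back to the question, a minimal sketch that wraps the variable-assignment version in the timing pattern from the question (the variables and names are the same placeholders as above):

declare @dtStart as datetime;
declare @duration as int;

set @dtStart = getutcdate();

-- @VarCol1, @VarCol2, @VarCol3 declared as above
SELECT 
    @VarCol1 = Col1
    ,@VarCol2 = Col2
    ,@VarCol3 = Col3
FROM YourTable;
-- ... plus the rest of the original query logic

set @duration = datediff(millisecond, @dtStart, getutcdate());

select @duration / 1000.0 as q1;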

You can easily verify and compare methods 1 and 3 by adding

SET STATISTICS TIME ON;
SET STATISTICS IO ON; 

before the query and

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

after the query.
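
For example, applied to the variable-assignment version from above (same placeholder names):

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT 
    @VarCol1 = Col1
    ,@VarCol2 = Col2
    ,@VarCol3 = Col3
FROM YourTable;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;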

Try running your original query, the query that saves its result into the #temp table, and the query that saves its result into variables, and compare the CPU and reads.

In my tests the number of reads was the same for the normal query and the query that saved its results into variables. The query with variables was much faster in elapsed time, because there was no network traffic, but had similar CPU time.

The query that saved its results into a temp table had more reads and was a bit slower than the query that saved its results into variables.

I have a large table and my test query simply reads 1M rows from it:

SELECT 
TOP (1000000)
[ID]
,[ElevatorID]
,[TimestampUTC]
FROM [dbo].[ArchivePlaybackStatsDay];


DECLARE @VarID bigint;
DECLARE @VarElevatorID int;
DECLARE @VarTimestampUTC int;

SELECT 
TOP (1000000)
@VarID = [ID]
,@VarElevatorID = [ElevatorID]
,@VarTimestampUTC = [TimestampUTC]
FROM [dbo].[ArchivePlaybackStatsDay];


SELECT 
TOP (1000000)
[ID]
,[ElevatorID]
,[TimestampUTC]
INTO #Temp
FROM [dbo].[ArchivePlaybackStatsDay];

DROP TABLE #Temp;

I ran it in SQL Sentry Plan Explorer and got these stats:

[screenshot: Plan Explorer statistics (Reads, CPU, Duration) for the three queries]

You can see that the Reads of the first and second rows are the same and the CPU is close, but the Duration is very different, because the first query actually transfers 1M rows to the client. The third query, which uses the #temp table, has some extra overhead compared to the second query, which uses variables.


I added another variant that converts all columns into a varbinary variable to unify the variable declarations. Unfortunately, conversion into varbinary, and especially into varbinary(max), had a noticeable overhead.

DECLARE @VarBin varbinary(8000);

SELECT 
TOP (1000000)
@VarBin = [ID]
,@VarBin = [ElevatorID]
,@VarBin = [TimestampUTC]
FROM [dbo].[ArchivePlaybackStatsDay];

DECLARE @VarBinMax varbinary(max);

SELECT 
TOP (1000000)
@VarBinMax = [ID]
,@VarBinMax = [ElevatorID]
,@VarBinMax = [TimestampUTC]
FROM [dbo].[ArchivePlaybackStatsDay];

[screenshot: Plan Explorer statistics for the varbinary variants]

Vladimir Baranov
  • Very nice answer! There is still a problem that we need to think about. Since the queries might be many and dynamic, there will be some difficulty in generating the variable assignments into the selection, but that is manageable. However, we cannot know in advance how many columns a random selection will have, so declaring that many variables seems unfeasible. Also, we have a problem with types. I am thinking about having a single variable of a long varchar and using it for every column, converting/casting all values to the corresponding type. – Lajos Arpad Sep 01 '17 at 20:03
  • Well, once you know the main idea you can adjust it to your specific case. Your goal is to measure the performance, so make sure that you keep all columns in the `SELECT`. If you omit some columns from the `SELECT` you may get a different plan with a very different performance. Also, I would avoid introducing LOB types (such as `varchar(max)`) if they were not in the query originally. LOB types usually have extra overhead compared to "normal" types, like `varchar(8000)`. – Vladimir Baranov Sep 01 '17 at 23:49
  • On the other hand, you may try to convert all columns to `varbinary` to unify your types. Compare performance of `varbinary(max)` vs `varbinary(8000)`. Since it will be a variable, not a column, performance may be quite similar. ... In my quick test conversion into `varbinary(8000)` had a noticeable overhead. Conversion into `varbinary(max)` had even larger overhead. Most likely conversion into `varchar` would have overhead as well. You need to test on your system. – Vladimir Baranov Sep 01 '17 at 23:55
  • I have successfully used your idea and refined it a little bit. I have a new question about this problem. If you are interested, you can find it at https://stackoverflow.com/questions/46121171/how-to-put-all-my-selected-columns-into-a-dummy-variable – Lajos Arpad Sep 08 '17 at 16:48
  • Regarding option 3: it usually gives the same plan, but not always https://stackoverflow.com/q/44255340/73226 – Martin Smith Sep 08 '17 at 17:40
  • Thank you @MartinSmith, excellent find. Learnt something new today. – Vladimir Baranov Sep 09 '17 at 00:20

You may try to get a very approximate measure by

declare @dummyCounter as int;

set @dummyCounter = (select count(*)
                     from (
                         /* original query */
                     ) t);

Admittedly, it may produce a different plan from the original one.
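
A minimal sketch that wraps this in the timing pattern from the question (the select 1 here merely stands in for the original query):

declare @dtStart as datetime;
declare @dummyCounter as int;

set @dtStart = getutcdate();

set @dummyCounter = (select count(*)
                     from (
                         /* original query */
                         select 1 as dummy
                     ) t);

select datediff(millisecond, @dtStart, getutcdate()) / 1000.0 as durationSeconds;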

Serg
  • You mean declare @dummyCounter as int; and then set @dummyCounter = (select count(*) from (...) t); right? – Lajos Arpad Aug 31 '17 at 13:20
  • I have edited and upvoted your answer, but I am afraid I cannot accept it. A query returning a million records runs for many minutes by itself, but if it is wrapped into select count(*) from (...) t, then it is instant (1.23 seconds, to be exact). This means that the mentioned query plan change results in a huge drop in measurement accuracy, which renders the answer unacceptable in this context, but it is a good, intuitive idea that might be helpful for other problems, so kudos to you for the idea. – Lajos Arpad Aug 31 '17 at 13:28
  • @LajosArpad, yeah, it looks like the query optimizer is smart enough to find a way to count the rows without instantiating the whole intermediate result set. Need to compare plans first. – Serg Aug 31 '17 at 13:33