44

By better, I mean does it improve performance by some non-marginal amount?

That is to say, each time I call GETDATE(), what amount of work does the server do to return that value?

If I'm using GETDATE() in many places in a stored procedure, should I instead be creating a variable to store the date of the transaction?

declare @transDate datetime = GETDATE()

Bench-marking data would be fantastic.

EDIT I want to clarify: I'm interested mainly in the actual performance differences between these two possibilities, and whether or not it is significant.

Shmiddty
  • 13,847
  • 1
  • 35
  • 52
  • 10
    Performance difference will be extremely negligible or non-existent. The difference I'd be worried about is the accuracy of the results if GETDATE() is used in contexts where it is re-evaluated. – Aaron Bertrand Aug 22 '12 at 17:32
  • Thank you for clearing that up for me. As I mentioned, I'm interested mainly in the performance. When accuracy is critical, using a variable is pretty clearly the best option. – Shmiddty Aug 22 '12 at 17:37
  • 3
    When is accuracy not important? – Aaron Bertrand Aug 22 '12 at 19:05
  • 2
    @aaron: there is a huge performance difference when queriing many rows, see the accepted answer below, it also exactly reflects our observations. – eFloh Jul 18 '13 at 11:48
  • @eFloh that is incorrect, sorry. When you have *more than one* reference the runtime constant that is evaluated can be different, but that really has nothing to do with determinism. – Aaron Bertrand Jul 18 '13 at 12:14
  • @eFloh can you show an actual use case where using *a single instance* of `GETDATE()` in the query performed demonstrably worse than using a variable? – Aaron Bertrand Jul 18 '13 at 12:15
  • @aaron: Image a huge Tranasction log table with a date column. select t.articleName, t.date, s.name from transactionLog t inner join suppliers s on t.supplierId=s.Id where t.date <= DATEADD(month,-9, GETDATE()). And it has to do with determinismn in the meaning used in MS SQL Server. – eFloh Jul 22 '13 at 14:40
  • @eFloh you're not answering my question. Can you *demonstrate* that in that case `GETDATE()` is evaluated multiple times? – Aaron Bertrand Jul 22 '13 at 14:54
  • @aaron: I can proof that in my example stated, the performance changes from "execution timeout" to "under a second" in our customer databases when replacing the GETDATE() with a date literal or command variable. You actually are right stating that this is not because of nondeterminism. GETDATE ist a non-deterministic runtime constant scalar function and thus only evaluated once per Query and column (except there is [a bug in sql server](http://sqlblog.com/blogs/andrew_kelly/archive/2008/02/27/when-getdate-is-not-a-constant.aspx). Nevertheless, there is a huger performance difference. – eFloh Jul 24 '13 at 15:24
  • @eFloh show the queries and plans (you can use sqlfiddle, pastebin etc.) or it didn't happen. – Aaron Bertrand Jul 24 '13 at 15:27
  • 2
    ok, I'll stop here. This is not about proofing if I can reproduce anything, but about helping the OP. My experience concerning the question is stated and there are enough good comments below. I have better things to do than doing a flame war in the comments here. – eFloh Jul 25 '13 at 10:08

5 Answers5

24

[NOTE: If you are going to downvote this answer, please leave a comment explaining why. It has already been downvoted many times, and finally ypercube (thank you) explained at least one reason why. I can't remove the answer because it is accepted, so you might as well help to improve it.]

According to this exchange on Microsoft, GETDATE() switched from being constant within a query to non-deterministic in SQL Server 2005. In retrospect, I don't think that is accurate. I think it was completely non-deterministic prior to SQL Server 2005 and then hacked into something called "non-deterministic runtime constant" since SQL Server 2005". The later phrase really seems to mean "constant within a query".

(And GETDATE() is defined as unambiguously and proudly non-deterministic, with no qualifiers.)

Alas, in SQL Server, non-deterministic does not mean that a function is evaluated for every row. SQL Server really does make this needlessly complicated and ambiguous with very little documentation on the subject.

In practice the function call is evaluated when the query is running rather than once when the query is compiled and its value changes each time it is called. In practice, GETDATE() is only evaluated once for each expression where it is used -- at execution time rather than compile time. However, Microsoft puts rand() and getdate() into a special category, called non-deterministic runtime constant functions. By contrast, Postgres doesn't jump through such hoops, it just calls functions that have a constant value when executed as "stable".

Despite Martin Smith's comment, SQL Server documentation is simply not explicit on this matter -- GETDATE() is described as both "nondeterministic" and "non-deterministic runtime constant", but that term isn't really explained. The one place I have found the term , for instance, the very next lines in the documentation say not to use nondeterministic functions in subqueries. That would be silly advice for "nondeterministic runtime constant".

I would suggest using a variable with a constant even within a query, so you have a consistent value. This also makes the intention quite clear: You want a single value inside the query. Within a single query, you can do something like:

select . . . 
from (select getdate() as now) params cross join
     . . . 

Actually, this is a suggestion that should evaluate only once in the query, but there might be exceptions. Confusion arises because getdate() returns the same value on all different rows -- but it can return different values in different columns. Each expression with getdate() is evaluated independently. This is obvious if you run:

select rand(), rand()
from (values (1), (2), (3)) v(x);

Within a stored procedure, you would want to have a single value in a variable. What happens if the stored procedure is run as midnight passes by, and the date changes? What impact does that have on the results?

As for performance, my guess is that the date/time lookup is minimal and for a query occurs once per expression as the query starts to run. This should not really a performance issue, but more of a code-consistency issue.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • This best answers my actual question. Though, I'd still be interested in seeing some benchmark results. – Shmiddty Aug 22 '12 at 18:42
  • 3
    @Shmiddty can't *you* run some performance tests to see how the difference can affect your workload? In fact I'd say you are in a much better position to do that than any of us. Though I don't see why the benchmark matters - if the unsafe method is 0.1 milliseconds faster, are you going to use that instead? – Aaron Bertrand Aug 22 '12 at 19:03
  • 17
    `GETDATE()` was never deterministic. Deterministic means that it will always return the same result when passed the same parameters. See the [list of deterministic functions in 2000](http://msdn.microsoft.com/en-us/library/aa214775(v=sql.80).aspx). `GETDATE` is a [Runtime Constant Function](http://blogs.msdn.com/b/conor_cunningham_msft/archive/2010/04/23/conor-vs-runtime-constant-functions.aspx). A single `GETDATE()` invocation will not return a different result per row regardless of length of query execution though you can wrap in a UDF to get this effect. – Martin Smith Aug 26 '12 at 14:15
  • 5
    Different `GETDATE()` references in the same query can return different results `WHILE DATEDIFF(ms, GETDATE() , GETDATE()) = 0 PRINT 'This will not run in an infinite loop'` does not run in an infinite loop in 2000/2005/2008. – Martin Smith Aug 26 '12 at 14:16
  • I tried the cross apply and it did not improve my response time. – pghcpa Sep 09 '16 at 00:56
  • 5
    It concerns me that this accepted and highly upvoted answer appears to be wrong, while the [comment by Marthin Smith](http://stackoverflow.com/questions/12078202/when-using-getdate-in-many-places-is-it-better-to-use-a-variable#comment16223903_12078717) seems to be correct. – GSerg Mar 08 '17 at 15:45
  • 1
    @GSerg . . . Despite Martin's comment, I have not found *SQL Server documentation* that specifies that the function is a run-time constant. It is clear that the function is often evaluated only once during execution, but that is not the same as a run-time constant function. If there is such documentation, I would be interested in it. – Gordon Linoff Apr 11 '17 at 14:45
  • @MartinSmith . . . "constant within a query" is not deterministic. Postgres calls such functions "stable"; I don't believe SQL Server has terminology for such functions. There is a difference between a function that is *optimized* to be called only once in most queries versus one that returns the same value each time it is called. – Gordon Linoff Apr 11 '17 at 14:49
  • 3
    @GordonLinoff http://dba.stackexchange.com/questions/18459/does-sql-server-evaluate-functions-once-for-every-row – ypercubeᵀᴹ Apr 11 '17 at 15:24
  • 2
    Also, Conor Cunningham blog post: [RAND() and other runtime constant functions, redux](http://www.sqlskills.com/blogs/conor/rand-and-other-runtime-constant-functions-redux/) – ypercubeᵀᴹ Apr 11 '17 at 15:26
  • 1
    And a miniscule reference in official docs: https://technet.microsoft.com/en-us/library/ms190642(v=sql.105).aspx ("*Aside from **runtime constant nondeterministic** built-ins RAND, and the **GETDATE** family, ...*") – ypercubeᵀᴹ Apr 11 '17 at 15:27
  • 3
    and just in case you haven't heard of him: [Conor Cunningham profile](https://sqlbits.com/Speakers/Conor_Cunningham) – ypercubeᵀᴹ Apr 11 '17 at 15:32
  • 2
    @ypercubeᵀᴹ . . . I find the blog post quite ambiguous. The exact statement is: "One that comes to mind is getdate() – it’s non-deterministic and you likely don’t want the results to change during a long-running query." So how exactly is it implemented? This type of information really should be quite explicit in the documentation, and not buried in an blog post. I should emphasize that `GETDATE()` is proudly the prototypical nondeterministic function in another part of the documentation -- the reference is in the revised answer. – Gordon Linoff Apr 11 '17 at 15:50
  • 2
    I certainly agree with you though that the documentation is not clear at this point. Several people have **deduced** - with various methods (debugging, execution plans, timing, discussions with MS developers) - that it acts as a runtime constant. My conclusion is that it does so, since at least 6 recent SQL Server versions - but it didn't in older versions, 2000 or earlier (so opposite to your claim in the answer) and the docs have not been updated. – ypercubeᵀᴹ Apr 11 '17 at 15:52
  • 2
    As for nondetermnistic, of course. It never was - and could not be - deterministic. – ypercubeᵀᴹ Apr 11 '17 at 15:57
18

My suggestion would be to use a variable mainly because if you have a long-running process, the GetDate() value might be different between calls.

Unless you are only using the Date part of GetDate() then you will be sure you are always using the same value.

Taryn
  • 242,637
  • 56
  • 362
  • 405
  • 1
    I would also suggest using UTC (via `GETUTCDATE()`) wherever possible. It's a tiny bit faster, and helps you avoid a lot of time-related issues. – Michael Haren Aug 22 '12 at 17:22
  • 8
    I dont think you can depend on getting the Date part as well, because you can start the query at 11:59pm which might span for 10 mins and get another date at 12:01am next day. – Kash Aug 22 '12 at 17:24
  • But, in the case that it isn't critical that each instance of `GetDate()` return the exact same value, are there other benefits to using a variable? – Shmiddty Aug 22 '12 at 17:25
  • 1
    @Kash that is very true as well, I would always use a variable in this type of case. Much better then making multiple calls. – Taryn Aug 22 '12 at 17:25
  • @Shmiddty my personal suggestion would be to use a variable, then in the event you ever need to alter the value for all queries you only have to do it in one place instead of multiple. – Taryn Aug 22 '12 at 17:26
3

One reason to use a variable with getdate() or functions like suser_sname() is a huge performance difference if you are inserting rows, or if you are doing a GROUP BY. You will notice this if you insert large amount of rows.

I suffered this myself migrating 300GB of data to several tables.

d219
  • 2,707
  • 5
  • 31
  • 36
MarianoC
  • 31
  • 1
2

I was testing on a couple of stored procedures using the GETDATE() function as a variable within an SP and I was having increase on IO reads and execution time due to the fact that query optimizer does not know what's the value to operate read this Stored Procedure Execution with Parameters, Variables, and Literals , with that said you can use the GETDATE() function in every single part of the SP as @Gordon Linoff mentioned its value does not change during execution or in order to avoid/remove the thinking that the value might change I did create a parameters this way:

CREATE PROC TestGetdate
(
@CurrentDate DATETIME = NULL
)
AS
SET CurrentDate  = GETDATE()

..... and then use the parameters as you see fit, you'll see good results

Any comments or suggestions are welcome.

Marcello Miorelli
  • 3,368
  • 4
  • 44
  • 67
jthalliens
  • 504
  • 4
  • 14
0

I used

WHERE ActualDateShipped+30 > dbo.Today()

in combination with function below. Brought my query time from 13 seconds to 2 seconds. No prior answers in this post helped this problem in SQL 2008/R2.

CREATE FUNCTION [dbo].[Today]()

    RETURNS date
    AS
    BEGIN

        DECLARE @today date = getdate()

        RETURN @today
    End
pghcpa
  • 833
  • 11
  • 25