Efficiency of varchar(max) in T-SQL code

Question

I have a question about the "efficiency" (speed, memory usage) of (n)varchar(max) versus (n)varchar(<explicit-size>) in T-SQL code (stored procedures, functions). I am not asking about its usage in column definitions.

In existing code, I am noticing lots of occasions along the lines of, say:

CREATE TABLE [table] (
    [column] nvarchar(1000)
)

DECLARE @var nvarchar(2000)
SELECT @var = [column] FROM  TABLE WHERE ...
SET @var = @var + @somethingelse + @anotherthing ...
SET @var = REPLACE(@var, N'search', N'lotstoreplacewith')
...

The gist being (the above is only an example) I am ultimately building longer strings off limited-size string columns. It is difficult/inconvenient to know and maintain just how big these strings could become.

So, it would be simpler for me if I changed to declaring these T-SQL variables as nvarchar(max). Then I do not have to worry about what maximum size they could reach. They would be similar to a programming language's "unlimited" string size.

Does anybody know how "efficiently" SQL Server handles code which manipulates variables declared (n)varchar(max)? For example, does it reserve a huge amount of space for such a variable (I'm hoping it handles it purely dynamically), or for CAST/CONVERT(), or do string functions using it suffer some performance penalty? [If it is relevant, I have to support SQL Server 2008 R2 onward.]

[EDIT: It has been suggested that my question is a duplicate of Are there any disadvantages to always using nvarchar(MAX)?. However, all but a single post there refer to varchar(max) in column definitions, and my question explicitly states I am asking about T-SQL code (variables etc.) performance. I am posting an answer (because it is large) to my question below, which draws on that post and extends it with some new information.]

Possible duplicate of [Are there any disadvantages to always using nvarchar(MAX)?](http://stackoverflow.com/questions/148398/are-there-any-disadvantages-to-always-using-nvarcharmax) — Alessio Cantarella, Feb 16 '16 at 11:15
Given that there's no known [maximum size](http://stackoverflow.com/q/7611394/15498) for such a variable, it cannot allocate all of the storage up-front. Also, the only relevant question, surely, is whether it's efficient *enough* for *your* purposes, which only you can answer by *measuring*. — Damien_The_Unbeliever, Feb 16 '16 at 11:38
There are differences in the way SQL Server internally handles (n)varchar(n) versus (n)varchar(MAX) variables. See Erland Sommarskog article: http://www.sommarskog.se/arrays-in-sql-2005.html for some observations. — Dan Guzman, Feb 16 '16 at 12:06
Since it is for "run" only purposes you can easily (I hope) make two versions of that SP and call then thousand times to compare performance. You can even answer yourself question with the results of a acid test! hat can be interesting =) — jean, Feb 16 '16 at 12:11

score 1 · Accepted Answer · edited May 23 '17 at 12:31

In Are there any disadvantages to always using nvarchar(MAX)? there is one answer https://stackoverflow.com/a/26120578/489865 which relates to T-SQL variables performance and not column definitions.

The gist of that post is to run SELECT @var='ABC' queries returning 1,000,000 rows, assigning to variables defined as nvarchar(<size>) versus nvarchar(max).

Under SQL Server 2008 R2, I concur with the poster's findings that nvarchar(max) is 4 times slower than nvarchar(<size>) in the example. Interestingly, if it is changed to make the assignment do slightly more work as in:

SET NOCOUNT ON;

--===== Test Variable Assignment 1,000,000 times using NVARCHAR(300)
DECLARE @SomeString NVARCHAR(300),
        @StartTime DATETIME
;
 SELECT @startTime = GETDATE()
;
 SELECT TOP 1000000
        @SomeString = 'ABC' + ac1.[name] + ac2.[name]
   FROM master.sys.all_columns ac1,
        master.sys.all_columns ac2
;
 SELECT Duration    = DATEDIFF(ms,@StartTime,GETDATE())
;
GO
--===== Test Variable Assignment 1,000,000 times using NVARCHAR(4000)
DECLARE @SomeString NVARCHAR(4000),
        @StartTime DATETIME
;
 SELECT @startTime = GETDATE()
;
 SELECT TOP 1000000
        @SomeString = 'ABC' + ac1.[name] + ac2.[name]
   FROM master.sys.all_columns ac1,
        master.sys.all_columns ac2
;
 SELECT Duration    = DATEDIFF(ms,@StartTime,GETDATE())
;
GO
--===== Test Variable Assignment 1,000,000 times using VARCHAR(MAX)
DECLARE @SomeString NVARCHAR(MAX),
        @StartTime DATETIME
;
 SELECT @startTime = GETDATE()
;
 SELECT TOP 1000000
        @SomeString = 'ABC' + ac1.[name] + ac2.[name]
   FROM master.sys.all_columns ac1,
        master.sys.all_columns ac2
;
 SELECT Duration    = DATEDIFF(ms,@StartTime,GETDATE())
;
GO

(note the + ac1.[name] + ac2.[name]) then the nvarchar(max) takes only twice as long. So in practice performance hit for nvarchar(max) may be better than at first seems.

Efficiency of varchar(max) in T-SQL code

1 Answers1