There are problems with the two top voted answers. The answer recommending DATALENGTH
is prone to programmer errors. The result of DATALENGTH
must be divided by the 2 for NVARCHAR
types, but not for VARCHAR
types. This requires knowledge of the type you're getting the length of, and if that type changes, you have to diligently change the places you used DATALENGTH
.
There is also a problem with the most upvoted answer (which I admit was my preferred way to do it until this problem bit me). If the thing you are getting the length of is of type NVARCHAR(4000)
, and it actually contains a string of 4000 characters, SQL will ignore the appended character rather than implicitly cast the result to NVARCHAR(MAX)
. The end result is an incorrect length. The same thing will happen with VARCHAR(8000).
What I've found works, is nearly as fast as plain old LEN
, is faster than LEN(@s + 'x') - 1
for large strings, and does not assume the underlying character width is the following:
DATALENGTH(@s) / DATALENGTH(LEFT(LEFT(@s, 1) + 'x', 1))
This gets the datalength, and then divides by the datalength of a single character from the string. The append of 'x' covers the case where the string is empty (which would give a divide by zero in that case). This works whether @s
is VARCHAR
or NVARCHAR
. Doing the LEFT
of 1 character before the append shaves some time when the string is large. The problem with this though, is that it does not work correctly with strings containing surrogate pairs.
There is another way mentioned in a comment to the accepted answer, using REPLACE(@s,' ','x')
. That technique gives the correct answer, but is a couple orders of magnitude slower than the other techniques when the string is large.
Given the problems introduced by surrogate pairs on any technique that uses DATALENGTH
, I think the safest method that gives correct answers that I know of is the following:
LEN(CONVERT(NVARCHAR(MAX), @s) + 'x') - 1
This is faster than the REPLACE
technique, and much faster with longer strings. Basically this technique is the LEN(@s + 'x') - 1
technique, but with protection for the edge case where the string has a length of 4000 (for nvarchar) or 8000 (for varchar), so that the correct answer is given even for that. It also should handle strings with surrogate pairs correctly.