18

If I deliberately store trailing spaces in a VARCHAR column, how can I force SQL Server to see the data as mismatch?

SELECT 'foo' WHERE 'bar' = 'bar    '

I have tried:

SELECT 'foo' WHERE LEN('bar') = LEN('bar    ')

One method I've seen floated is to append a specific character to the end of every string then strip it back out for my presentation... but this seems pretty silly.

Is there a method I've overlooked?

I've noticed that it does not apply to leading spaces so perhaps I run a function which inverts the character order before the compare.... problem is that this makes the query unSARGable....

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Matthew
  • 10,244
  • 5
  • 49
  • 104

6 Answers6

10

From the docs on LEN (Transact-SQL):

Returns the number of characters of the specified string expression, excluding trailing blanks. To return the number of bytes used to represent an expression, use the DATALENGTH function

Also, from the support page on How SQL Server Compares Strings with Trailing Spaces:

SQL Server follows the ANSI/ISO SQL-92 specification on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them.

Update: I deleted my code using LIKE (which does not pad spaces during comparison) and DATALENGTH() since they are not foolproof for comparing strings

This has also been asked in a lot of other places as well for other solutions:

KyleMit
  • 30,350
  • 66
  • 462
  • 664
  • I think `DATALENGTH` will work well for me here, or storing a calculated `DATALENGTH` in a column to make it faster at the expense of a byte. – Matthew Oct 15 '10 at 15:33
  • 2
    This doesn't answer the question as written. And it wrongly claims that it's a duplicate of questions about the behavior which this question *explicitly* asks to avoid. – Kenny Evitt Jun 29 '18 at 20:22
4

After some search the simplest solution i found was in Anthony Bloesch WebLog.

Just add some text (a char is enough) to the end of the data (append)

SELECT 'foo' WHERE 'bar' + 'BOGUS_TXT' = 'bar    ' + 'BOGUS_TXT'

Also works for 'WHERE IN'

SELECT <columnA>
FROM <tableA>
WHERE <columnA> + 'BOGUS_TXT' in ( SELECT <columnB> + 'BOGUS_TXT' FROM <tableB> )
Nickolay
  • 31,095
  • 13
  • 107
  • 185
  • 1
    Thanks for the link! The article also explains why you might prefer this over `DATALENGTH` (correctness in presence of Unicode combining marks). – Nickolay Jan 21 '21 at 19:04
4

you could try somethign like this:

declare @a varchar(10), @b varchar(10)
set @a='foo'
set @b='foo   '

select @a, @b, DATALENGTH(@a), DATALENGTH(@b)
DForck42
  • 19,789
  • 13
  • 59
  • 84
4

Sometimes the dumbest solution is the best:

SELECT 'foo' WHERE 'bar' + 'x' = 'bar     ' + 'x'

So basically append any character to both strings before making the comparison.

Extragorey
  • 1,654
  • 16
  • 30
2

The approach I’m planning to use is to use a normal comparison which should be index-keyable (“sargable”) supplemented by a DATALENGTH (because LEN ignores the whitespace). It would look like this:

DECLARE @testValue VARCHAR(MAX) = 'x';

SELECT t.Id, t.Value
FROM dbo.MyTable t
WHERE t.Value = @testValue AND DATALENGTH(t.Value) = DATALENGTH(@testValue)

It is up to the query optimizer to decide the order of filters, but it should choose to use an index for the data lookup if that makes sense for the table being tested and then further filter down the remaining result by length with the more expensive scalar operations. However, as another answer stated, it would be better to avoid these scalar operations altogether by using an indexed calculated column. The method presented here might make sense if you have no control over the schema , or if you want to avoid creating the calculated columns, or if creating and maintaining the calculated columns is considered more costly than the worse query performance.

Kenny Evitt
  • 9,291
  • 5
  • 65
  • 93
binki
  • 7,754
  • 5
  • 64
  • 110
1

I've only really got two suggestions. One would be to revisit the design that requires you to store trailing spaces - they're always a pain to deal with in SQL.

The second (given your SARG-able comments) would be to add acomputed column to the table that stores the length, and add this column to appropriate indexes. That way, at least, the length comparison should be SARG-able.

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
  • This question should have been closed as dupe. It was exhaustively and repeatedly answered in SO before, including design (truncating trailing spaces), I had given in my update, http://stackoverflow.com/questions/1146280/is-it-good-practice-to-trim-whitespace-leading-and-trailing-when-selecting-inse – Gennady Vanin Геннадий Ванин Oct 15 '10 at 07:28
  • 1
    @vgv8 - I think this one is subtly different, in that the OP seems to be saying that they have to store the trailing whitespace, and that it's significant. – Damien_The_Unbeliever Oct 15 '10 at 07:32