9

I observed some strange behavior in MS SQL Server 2017.

  • += in select acts as aggregator ('concatenate values from all rows') when on the right is a constant.
  • += in select acts as 'just set the value' when on the right is a column name. (also, this turns of aggregating behavior for other columns)

So my questions are:

  1. Why @c1 result contains value only last the row, even when += is used?
  2. Why is @c2 affected, by the change +=->= for @c1?

Version 1:

BEGIN
    DECLARE
        @c1 NVARCHAR(MAX) = N'',
        @c2 NVARCHAR(MAX) = N'';

    SELECT
        @c1 = constraint_name, -- version-1
        @c2 += '+'
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
    ORDER BY 1 DESC
    ;

    PRINT '@c1=' + @c1;
    PRINT '@c2=' + @c2;
END
;

Version 1 result:

@c1 = fk_abcde
@c2 = ++++++++++++++++++++++++++++++++++++++++++
(`@c2` result is aggregation of many rows; one plus for each row)

Version 2:

BEGIN
    DECLARE
        @c1 NVARCHAR(MAX) = N'',
        @c2 NVARCHAR(MAX) = N'';

    SELECT
        @c1 += constraint_name, -- version-2
        @c2 += '+'
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
    ORDER BY 1 DESC
    ;

    PRINT '@c1=' + @c1;
    PRINT '@c2=' + @c2;
END
;

Version 2 Result:

@c1 = fk_abcde
@c2 = +
(`@c2` is just value assigned from last processed row)

This feels very strange - kind of like a bug. I fail to find any docs about this. The doc on '+= string' don't mention += usage in select query at all.

(at the moment my goal is to understand the behavior fully, so I would not accidentally step on it. Any hints to right documentation/keywords to search for would be helpful)

Michał Turczyn
  • 32,028
  • 14
  • 47
  • 69
IndustryUser1942
  • 1,052
  • 1
  • 8
  • 12
  • 8
    It's in the wrong place in the documentation, so not surprising that you haven't [found it](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-local-variable-transact-sql?view=sql-server-2017#remarks): "Don't use a variable in a SELECT statement to concatenate values (that is, to compute aggregate values). Unexpected query results may occur. Because, all expressions in the SELECT list (including assignments) aren't necessarily run exactly once for each output row" – Damien_The_Unbeliever Feb 27 '19 at 06:52
  • 2
    @Damien_The_Unbeliever To add to your comment, I was going to suggest that what we are seeing in the second case is just the variable concatenation happening once, the other times being ignored or discarded. – Tim Biegeleisen Feb 27 '19 at 06:53
  • 1
    And to add some more: This approach is called [quirky update](https://www.google.com/search?q=sql+server+update+quirky) and is - in most cases - something to avoid... – Shnugo Feb 27 '19 at 07:39
  • 1
    @Damien_The_Unbeliever I believe your comment and the link to the docs should be the answer here. – EzLo Feb 27 '19 at 07:46
  • 1
    @IndustryUser1942 what are you trying to do? If you want to aggregate strings use `STRING_AGG` in SQL Server 2017. There are other ways to do the same in previous versions, all of them described in [Aaron Bertrand's articles](https://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation). The fastest and most scalable ways are to use a SQLCLR UDF or `FOR XML` – Panagiotis Kanavos Feb 27 '19 at 07:49
  • @IndustryUser1942 what you posted is a hack often used in MySQL, that doesn't really work there either. It depends on the server using a very specific way to execute the query and produce results which changes from version to version and can't ever include parallel execution. – Panagiotis Kanavos Feb 27 '19 at 07:52
  • @Damien_The_Unbeliever Please copy your comment (link + quote) as an answer. – IndustryUser1942 Feb 28 '19 at 05:52
  • @PanagiotisKanavos Initially I wanted to aggregate/concatenate strings. For that I will use `STRING_AGG`. Thanks. – IndustryUser1942 Feb 28 '19 at 05:55

3 Answers3

2

It's in the wrong place in the documentation, so not surprising that you haven't found it:

Don't use a variable in a SELECT statement to concatenate values (that is, to compute aggregate values). Unexpected query results may occur. Because, all expressions in the SELECT list (including assignments) aren't necessarily run exactly once for each output row

It's better to look for different ways of doing string concatenation. If your version supports it, choose to use STRING_AGG. For earlier versions, Aaron Bertrand has provided a good set of options (hat tip to Panagiotis Kanavos for providing the link)

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
-1

Well, what I found is quite interesting: without ORDER BY query runs consistently and expectedly.

But when adding ORDER BY caluse, I get same results as you.

What I suggest is to use CTE with ordering - unfortunately ordering isn't allowed in CTE or subqueries, but workaround is to use TOP keyword, which allows us to order in such situations.

See script below:

;with cte as (
    select top 100 percent table_schema
    from information_schema.columns
    order by 1
)
-- This is more reliable
select @c1 = table_schema, @c2 += '+' from cte
Michał Turczyn
  • 32,028
  • 14
  • 47
  • 69
  • 3
    There's a reason this is called a `quirky update` and discouraged against. It works by accident and can *always* break even among minor patch versions, or even if eg a parallel operation changes how the results are processed. – Panagiotis Kanavos Feb 27 '19 at 08:52
-2

You should use Order by constraint_name instead of Order by 1, because you're assigning values into variable. Its not select statement.

Arulkumar
  • 12,966
  • 14
  • 47
  • 68
Thangadurai.B
  • 561
  • 1
  • 3
  • 18
  • 4
    Doesn't matter how you tweak it. As I've linked to in the comments, using variables for aggregation in a `SELECT` is *documented* to be inherently unreliable. – Damien_The_Unbeliever Feb 27 '19 at 07:25