Make SQL Server CLR aggregate similar to native aggregates

Question

I'm comparing my custom CLR aggregate vs AVG (SQL Server 2017). My queries are:

  SELECT groupId, Helpers.CustomCLR(value)
  FROM table
 group by groupId

  SELECT groupId, AVG(value)
  FROM table
 group by groupId

And CLR is

    [Serializable]
    [SqlUserDefinedAggregate(
        Format.Native, //use clr serialization to serialize the intermediate result  
        IsInvariantToNulls = true, //optimizer property  
        IsInvariantToDuplicates = false, //optimizer property  
        IsInvariantToOrder = true)
]
    [StructLayout(LayoutKind.Sequential)]
    public class CustomCLR
    {
        float a = 2;
        public void Init()
        {
        }

        public void Accumulate(SqlSingle value)
        {
        }

        public void Merge(CustomCLR other)
        {
        }

        public double? Terminate()
        {
            return a;
        }
    }

Execution plans are quite different, though. CLR query does row mode sorting and AVG query does batch mode hash match. How to make CLR aggregate behave like AVG one?

There doesn't seem to be any official documentation on this so I am going to ask directly. But I am first going to test on SQL Server 2019 CTP 2.2 since they just added support for Batch Mode on row store table (no longer only working with columnstore). So _maybe_ that changes things? We shall see ;-) — Solomon Rutzky, Feb 15 '19 at 16:15
@SolomonRutzky, It was how to make both queries to have similar plan. Additional info about Batch Mode for ColumnStore indexes would still be very useful, though. — user2820173, Feb 15 '19 at 16:25
@gotqn While it is usually true that built-in functionality is faster than SQLCLR, there are times when SQLCLR is faster. For example, when calculating a SHA-256 hash (assuming the input is `VARBINARY(8000)` and _not_ `VARBINARY(MAX)`): [What is a scalable way to simulate HASHBYTES using a SQL CLR scalar function?](https://dba.stackexchange.com/a/228966/30859). And to both of you: my testing so far indicates that SQLCLR cannot participate in Batch Mode, either as an Aggregate or even a deterministic UDF. I am writing to MS to get confirmation. — Solomon Rutzky, Feb 15 '19 at 17:31
@gotqn and user2820173: I did get confirmation today from Microsoft that SQLCLR UDAs definitely cannot participate in Batch Mode. However, with 2019 allowing Batch Mode in row store tables, there might be increased opportunity to push something like this through. — Solomon Rutzky, Feb 15 '19 at 22:26

Solomon Rutzky · Accepted Answer · 2019-02-15T15:32:59.227

There are definitely some unfortunate differences between built-in and SQLCLR User-Defined Aggregate functions (UDA). One of them should be that SQLCLR cannot do batch mode. I will see if I can find an authoritative reference for this.

Another difference is that SQLCLR UDAs do not support the HashAggregate operator, resulting in:

CLR Aggregate performs an expensive sort

I just tested again in SQL Server 2017 CU 12 and SQL Server 2019 CTP 2.2 and it is still an issue. Please see Bob Beauchemin's post (in that linked forum thread) on Thursday, December 9, 2010 for some suggested work-arounds.

Please also support Bob's enhancement request to allow SQLCLR UDAs to use OPTION(HASH GROUP):

Allow OPTION(HASH GROUP) with SQLCLR UDAs

Romain Ferraton · Answer 2 · 2023-03-15T20:21:01.700

I think Microsoft SQL Server Team may have change some little things on SQL Server 2022 (or may be it work already like this in 2019 ... I didn't test on 2019) but

on a 2022 Dev Edition, I found that you can speed up CLR UDF using some trick when you have to use the CLR UDF in a query that use an aggregation.

In my case i have a hash function that return an integer hash given a varbinary.

Code is often clearer than long explain :

 select 
[DT_PERIODE], 
[SQLHash].[dbo].[XF_HashMurmur2_32](cast([ID_ARTICLE] as varbinary(8000))), 
SUM( [VB_MM2_SOC])
FROM [dbo].[TEST_71_1M_12m]
GROUP BY 
[DT_PERIODE], 
[ID_ARTICLE];

This above query is 8 times faster that the following one

select 
 [DT_PERIODE], 
[SQLHash].[dbo].[XF_HashMurmur2_32](cast([ID_ARTICLE] as varbinary(8000))), 
SUM( [VB_MM2_SOC])
FROM [dbo].[TEST_71_1M_12m]
GROUP BY 
[DT_PERIODE], 
[SQLHash].[dbo].[XF_HashMurmur2_32](cast([ID_ARTICLE] as varbinary(8000)));

The CLR cannot use batch mode but in the first query the aggregation is done first in batch mode, only the last compute scalar node is in row mode. In the second query, the columnstore is read in batch mode but just after that the computing scalar node switch in row mode and the batch mode is switched back in the hast match aggregate node.

The elasped times :

Query 1 : CPU time = 172 ms, elapsed time = 168 ms. (97651 rows affected)
Query 2 : CPU time = 891 ms, elapsed time = 996 ms. (97651 rows affected)

Make SQL Server CLR aggregate similar to native aggregates

2 Answers2