I've always found the lack of a built-in aggregate function that just strings together values using some user defined separator, frustrating.
The closest thing I know of is the XML hack:
select s.string as [text()]
from strings s
for xml path('')
But in my opinion that's a very strange way of doing it and you still have to handle the case with trailing or leading separator junk. So, I thought I'd write a CLR aggregate:
select dbo.Fold(s.string, ', ')
from strings s
That's kind of nice. Except this thing doesn't perform very well when the number of rows reaches 5000. I don't really expect to run it against that much data but to my surprise the performance of the aggregate degrades rather rapidly as the data set grows. I didn't understand why at first, then I realized that the way the UDF is handled in SQL Server is the problem.
I'm using an UDF with a StringBuilder
class, because of this I have to implement IBinarySerialize
. But for reasons I do not understand SQL Server is calling the Write/Read pair between each iteration of Accumulate
. Obviously this creates a performance problem when the string gets big. Is there anyway to avoid this costly serialization?
FYI, the XML hack are orders of magnitudes faster.