Why is VALUES(CONVERT(XML,'...')) much slower than VALUES(@xml)?

Question

I would like to create a subquery that produces a list of numbers as a single-column result, something like MindLoggedOut did here but without the @x xml variable, so that it can be appended to a WHERE expression as a pure string (subquery) without sql parameters. The problem is that the replacement of the parameter (or variable) makes the query run 5000 times slower, and I don't understand why. What causes this big difference?

Example:

/* Create a minimalistic xml like <b><a>78</a><a>91</a>...</b> */
DECLARE @p_str VARCHAR(MAX) =
    '78 91 01 12 34 56 78 91 01 12 34 56 78 91 01 12 34 56';
DECLARE @p_xml XML = CONVERT(XML,
  '<b><a>'+REPLACE(@p_str,' ','</a><a>')+'</a></b>'
);

SELECT a.value('(child::text())[1]','INT')
FROM (VALUES (@p_xml)) AS t(x)
CROSS APPLY x.nodes('//a') AS x(a);

This returns one number per row and is quite fast (20x faster than the string-splitter approaches I was using so far, similar to these. I measured the 20x speed-up in terms of sql server CPU time, with @p_str containing 3000 numbers.)

Now if I inline the definition of @p_xml into the query:

SELECT a.value('(child::text())[1]','INT')
FROM (VALUES (CONVERT(XML,
  '<b><a>'+REPLACE(@p_str,' ','</a><a>')+'</a></b>'
))) AS t(x)
CROSS APPLY x.nodes('//a') AS x(a);

then it becames 5000x slower (when @p_str contains thousands of numbers.) Looking at the query plan I cannot find the reason for it.

Plan of the first query (…VALUES(@p_xml)…), and the second (…VALUES(CONVERT(XML,'...'))…)

Could somebody shed some light on it?

UPDATE

Clearly the plan of the first query doesn't include the cost of the @p_xml = CONVERT(XML, ...REPLACE(...)... ) assignment, but this cost is not the culprit that could explain the 46ms vs. 234sec difference between the execution time of the whole script (when @p_str is large). This difference is systematic (not random) and was in fact observed in SqlAzure (S1 tier).

Furthermore, when I rewrote the query: replacing CONVERT(XML,...) by a user-defined scalar function:

SELECT a.value('(child::text())[1]','INT')
FROM (VALUES (dbo.MyConvertToXmlFunc(
  '<b><a>'+REPLACE(@p_str,' ','</a><a>')+'</a></b>'
))) AS t(x)
CROSS APPLY x.nodes('//a') AS x(a);

where dbo.MyConvertToXmlFunc() is:

CREATE FUNCTION dbo.MyConvertToXmlFunc(@p_str NVARCHAR(MAX))
RETURNS XML BEGIN
  RETURN CONVERT(XML, @p_str);
END;

the difference disappeared (plan). So at least I have a workaround... but would like to understand it.

Are you able to post a picture of the sql execution plans between the 2 queries so we could see the differences? — John Odom, Aug 18 '15 at 21:29
I suspect it's because the first execution plan is measuring less; i.e. the assignment of `@p_xml` (thus related conversation from string to xml) doesn't show up in the execution plan, so you're not comparing like for like. — JohnLBevan, Aug 18 '15 at 21:29
Any differences in the _Actual_ plans? Any differences in estimated row counts? — Nick.Mc, Aug 19 '15 at 02:21

score 5 · Accepted Answer · edited Jun 20 '20 at 09:12

This is basically the same issue as described in this answer by Paul White.

I tried with a string of length 10,745 characters containing 3,582 items.

The execution plan with the string literal ends up performing the string replace and casting this entire string to XML twice for each item (so 7,164 times in total).

The problematic sqltses.dll!CEsExec::GeneralEval4 calls are highlighted in the traces below. The CPU time for the entire call stack was 22.38% (nearly maxing out a single core on a quad core). - 92% of that was taken with these two calls.

Within each call sqltses.dll!ConvertFromStringTypesAndXmlToXml and sqltses.dll!BhReplaceBhStrStr both take nearly equal time.

I have used the same colour coding for the plan below.

The bottom branch of the execution plan is executed once for each split item in the string.

The problematic table valued function in the bottom right is in its open method. The Parameter list for the function is

Scalar Operator([Expr1000]),

Scalar Operator((7)),

Scalar Operator(XML Reader with XPath filter.[id]),

Scalar Operator(getdescendantlimit(XML Reader with XPath filter.[id]))

For the Stream Aggregate the issue is in its getrow method.

[Expr1010] = Scalar Operator(MIN(
SELECT CASE
         WHEN [Expr1000] IS NULL
           THEN NULL
         ELSE
           CASE
             WHEN datalength([XML Reader with XPath filter].[value]) >= ( 128 )
               THEN CONVERT_IMPLICIT(int, [XML Reader with XPath filter].[lvalue], 0)
             ELSE CONVERT_IMPLICIT(int, [XML Reader with XPath filter].[value], 0)
           END
       END 
))

Both of these expressions refer to Expr1000 (though the stream aggregate only does so to check if it was NULL)

This is defined in the constant scan at the top right as below.

(Scalar Operator(CONVERT(xml,'<b><a>'+replace([@p_str],' '
,CONVERT_IMPLICIT(varchar(max),'</a><a>',0))+'</a></b>',0)))

It is clear from the trace that the issue is the same as in the previously linked answer and that this is getting repeatedly re-evaluated in the slow plan. When passing as a parameter the expensive calculation only happens once.

Edit: I just realised this is in fact almost exactly the same plan and issue as Paul White blogged about here - The only difference in my tests compared to those described there is that I found the string Replace and the XML conversion to be as bad as each other in the VARCHAR(MAX) case - and for the string replace to outweigh the conversion cost in the non max case.

Max

Non Max

(2000 character source string with 668 items. 6010 chars after replace)

In this test the replace was nearly double the CPU cost of the xml conversion. It seems to be implemented by using code from familiar TSQL functions CHARINDEX and STUFF with a large chunk of time taken up converting the string to unicode. I think this discepancy between my results and those reported by Paul is down to collation (switching to SQL_Latin1_General_CP1_CS_AS from Latin1_General_CS_AS reduces the cost of the string replace significantly)

Paul White's [blog article](http://sqlblog.com/blogs/paul_white/archive/2012/09/05/compute-scalars-expressions-and-execution-plan-performance.aspx) not only discusses the very same example/use case that I'm trying to achieve, but he suggests much better workarounds than I discovered. Thanks! — robert4, Aug 26 '15 at 15:54

Why is VALUES(CONVERT(XML,'...')) much slower than VALUES(@xml)?

1 Answers1

Max

Non Max