This is my XML. My goal is to wrap the data inside the Value node in CDATA on export, and then import it back into an XML-typed column with the CDATA removed.

<Custom>
     <Table>Shape</Table>
     <Column>CustomScreen</Column>
     <Value>Data</Value>
</Custom>

Right now I replace 'Data' inside the Value node with the XML from the table and then, I believe, put CDATA around it. ShapeInfo is of type XML, and CustomPanel is the first node of the [ShapeInfo] XML.

SET @OutputXML= replace(@OutputXML, 'Data', CAST((SELECT [ShapeInfo]      
                         FROM [Shape] WHERE [Shape_ID] = @ShapeID) as VARCHAR(MAX)))

SET @OutputXML= replace(@OutputXML, '<CustomPanel', '<![CDATA[<CustomPanel')

However the result looks something like this even though I expected it to only have CDATA around the information:

<Value>&lt;CustomPanel VisibilityIndicator=""&gt;&lText="No" Checked="False" Height="20" Width="50"/&gt;&lt;/Cell&gt;&lt;/Row&gt;&lt;/Table&gt;&lt;/CustomPanel&gt;</Value>

Then I use some dynamic SQL to update that column:

EXEC('UPDATE ['+ @tableName +  '] SET [' + @columnName + '] = ''' + @nodeValue + ''' WHERE Shape_ID = ''' + @ShapeID + '''')

I was told I might be able to use the following to remove CDATA but I didn't use it.

declare @x xml
set @x=N'<Value>&lt;CustomPanel....... all the current info ...=&quot;&quot;&gt;</Value>'

select @x.value('(/Value)[1]', 'nvarchar(max)')

select '<![CDATA[' + @x.value('(/Value)[1]', 'nvarchar(max)') + ']]>'

After checking the column again, it seems to contain the correct information. However, I never changed it back to XML from VARCHAR or removed the CDATA markers, even though they appear to be gone when I check the column. So what am I missing here? Is this a correct way to do it?

DRockClimber

2 Answers

If you need full control over generating XML, you can use FOR XML EXPLICIT:

DECLARE @xml xml = '<Custom>
     <Table>Shape</Table>
     <Column>CustomScreen</Column>
     <Value>Data</Value>
</Custom>';

WITH rawValues AS
(
    SELECT
        n.value('Table[1]', 'nvarchar(20)') [Table],
        n.value('Column[1]', 'nvarchar(20)') [Column],
        n.value('Value[1]', 'nvarchar(max)') [Value] -- max, so a long payload is not truncated
    FROM @xml.nodes('Custom') X(n)
)
SELECT 1 AS Tag,
       NULL AS Parent,
       [Table] AS [Custom!1!Table!ELEMENT],
       [Column] AS [Custom!1!Column!ELEMENT],
       [Value] AS [Custom!1!Value!CDATA]
FROM rawValues 
FOR XML EXPLICIT

It generates:

<Custom>
  <Table>Shape</Table>
  <Column>CustomScreen</Column>
  <Value><![CDATA[Data]]></Value>
</Custom>

If you need the reverse, use the CDATA-wrapped XML as the source and the ELEMENT directive instead of CDATA.
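
For the import direction from the question, a minimal sketch may help. It assumes the exported text arrives as a string holding the ShapeInfo XML in a CDATA section; the variable names (@exported, @doc, @payload, @shapeInfo) and the sample payload are hypothetical. Casting the string to XML already turns the CDATA section into ordinary escaped text, and .value() returns that text unescaped, so no manual CDATA stripping should be needed:

```sql
-- Hypothetical export text; the CDATA section holds the original ShapeInfo markup.
DECLARE @exported nvarchar(max) = N'<Custom>
  <Table>Shape</Table>
  <Column>CustomScreen</Column>
  <Value><![CDATA[<CustomPanel VisibilityIndicator=""><Row /></CustomPanel>]]></Value>
</Custom>';

-- Casting to xml parses the CDATA section; its content becomes a normal text node.
DECLARE @doc xml = CAST(@exported AS xml);

-- .value() returns the text node unescaped, i.e. the raw markup as a string.
DECLARE @payload nvarchar(max) =
    @doc.value('(/Custom/Value/text())[1]', 'nvarchar(max)');

-- Cast the string back to xml before writing it to an xml-typed column.
DECLARE @shapeInfo xml = CAST(@payload AS xml);

SELECT @payload AS PayloadText, @shapeInfo AS PayloadXml;
```

The final cast fails if the payload is not well-formed XML, which acts as a cheap validation step before the UPDATE.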

Paweł Dyl
  • I like this answer, +1 from my side, but - AFAIC - one should think twice about fiddling `CDATA` into XML. It is just not needed in most cases... – Shnugo Aug 19 '16 at 08:19
  • I ended up doing string concatenation even though I attempted doing it the way you wrote. – DRockClimber Aug 30 '16 at 23:39

If you really need the CDATA section within your XML, there are only two options:

  • string concatenation (very bad)
  • FOR XML EXPLICIT (in this case you've got the answer from Paweł)

But keep in mind that the CDATA section exists only as a convenience for input. Semantically, there is absolutely no difference between content enclosed in a CDATA section and the same content properly escaped. That is why Microsoft decided not even to support the CDATA syntax in the modern XML methods. It is just not needed.

Look at these examples:

--I start with a string containing the same content escaped and in CDATA

DECLARE @s VARCHAR(500)=
'<root>
<a>Normal Text</a>
<a>Text with forbidden character &amp; &lt;&gt;</a>
<a><![CDATA[Text with forbidden character & <>]]></a>
</root>';

--This string is cast to XML.

DECLARE @x XML=CAST(@s AS XML);

--This is the output. You can see that the CDATA section is gone and its content is escaped. CDATA will always be replaced by a valid escaped string:

SELECT @x;

<root>
  <a>Normal Text</a>
  <a>Text with forbidden character &amp; &lt;&gt;</a>
  <a>Text with forbidden character &amp; &lt;&gt;</a>
</root>

--The cast back to a string shows clearly that the XML has no CDATA internally any more

SELECT CAST(@x AS VARCHAR(500));

<root>
   <a>Normal Text</a>
   <a>Text with forbidden character &amp; &lt;&gt;</a>
   <a>Text with forbidden character &amp; &lt;&gt;</a>
</root>

--Reading the nodes one by one returns the correct content anyway

SELECT a.value('.','varchar(max)')
FROM @x.nodes('/root/a') AS A(a)

Normal Text
Text with forbidden character & <>
Text with forbidden character & <>

The only reason to use CDATA, and to insist that it appear in the XML's text representation (which is not the XML itself!), is a third-party or legacy requirement.

And keep in mind: if you use string concatenation, you can store the XML with a readable CDATA section in a string format only. Whenever you cast it to XML, the CDATA will be omitted. Using FOR XML EXPLICIT allows type-safe storage, but is very clumsy with deeper nesting. This might be OK for an external interface, but you should think twice about it...
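
To make the string-concatenation caveat concrete, here is a rough sketch under stated assumptions: the payload and the variable names (@shapeInfo, @payload, @export) are made up for illustration, and the REPLACE is only a defensive guard against a literal "]]>" in the data, which would otherwise end the CDATA section early.

```sql
-- Hypothetical payload standing in for the ShapeInfo column value.
DECLARE @shapeInfo xml = N'<CustomPanel VisibilityIndicator=""><Row /></CustomPanel>';

-- Serialize the xml value to a string so it can be concatenated.
DECLARE @payload nvarchar(max) = CAST(@shapeInfo AS nvarchar(max));

-- Defensive guard: split any literal "]]>" across two CDATA sections.
SET @payload = REPLACE(@payload, N']]>', N']]]]><![CDATA[>');

-- Build the export document as a string; the CDATA section exists only in this text.
DECLARE @export nvarchar(max) =
      N'<Custom><Table>Shape</Table><Column>CustomScreen</Column><Value><![CDATA['
    + @payload
    + N']]></Value></Custom>';

SELECT @export AS ExportText;          -- string: CDATA section still visible
SELECT CAST(@export AS xml) AS AsXml;  -- xml: CDATA replaced by escaped text
```

The export has to stay in an nvarchar variable or column for the CDATA markers to survive; the second SELECT shows what is lost as soon as the value passes through the xml type.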

Two links to related answers (by me :-) ):

Shnugo
  • @Schnugo Good information like always. I ended up doing the string concatenation. I have a question though. Isn't CData useful in cases where "sql injection" may happen? I was told it would help against people inserting random stuff. – DRockClimber Aug 30 '16 at 23:37