3

I need to put a rather large xml file into another xml file. I considered using CDATA for this:

http://www.w3.org/TR/2000/REC-xml-20001006#sec-cdata-sect
http://www.w3schools.com/xml/xml_cdata.asp

but since my xml might also contain CDATA this does not work unless I do some nasty workaround:

http://web-design.blogs.webucator.com/2010/11/20/nesting-cdata-blocks/

Are there better ways of transferring/encoding large nested xml files or is the xml format simply not meant to be used in this way?

u123

5 Answers

4

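You can replace each ]]> in the inner document with ]]]]><![CDATA[> (that is, replace the inner ]] with ]]]]><![CDATA[). The first ]] then ends the current CDATA section as plain text, and the > becomes the start of a new section.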

(based on http://web-design.blogs.webucator.com/2010/11/20/nesting-cdata-blocks/ )

Example: We have outer and inner docs and we want to put the inner inside the outer as CDATA.

<outer>
  <e1 name="abc"/>
  <innerDoc><![CDATA[
      <Doc1/>
    ]]></innerDoc>
</outer>

<inner>
  <innrer1/>
  <a><![CDATA[
      free text with << >
    ]]></a>
  <innrer2/>
</inner>

If we just copy and paste, we get XML that is not well-formed, because the inner ]]> closes the outer CDATA section early:

<outer>
  <e1 name="abc"/>
  <innerDoc><![CDATA[
      <inner>
        <innrer1/>
        <a><![CDATA[
            free text with << >
          ]]></a>
        <innrer2/>
      </inner>
    ]]></innerDoc>
</outer>

However, if we replace the inner ]] with ]]]]><![CDATA[ before embedding it, we fix the problem:

<outer>
  <e1 name="abc"/>
  <innerDoc><![CDATA[
      <inner>
        <innrer1/>
        <a><![CDATA[
            free text with << >
          ]]]]><![CDATA[></a>
        <innrer2/>
      </inner>
    ]]></innerDoc>
</outer>
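
The replacement above can be sketched as a small helper. This is a minimal sketch in plain JavaScript and not part of the original answer: `wrapInCdata` and `readCdata` are hypothetical names, and `readCdata` only imitates what an XML parser does when it concatenates adjacent CDATA sections.

```javascript
// Hypothetical helper: wrap arbitrary text in CDATA, splitting every "]]>"
// so that "]]" ends one section and ">" begins the next one.
function wrapInCdata(text) {
  const escaped = text.split("]]>").join("]]]]><![CDATA[>");
  return "<![CDATA[" + escaped + "]]>";
}

// Hypothetical inverse, for illustration only: strip the CDATA delimiters
// and join the section contents, as an XML parser would when reading the text.
function readCdata(markup) {
  return markup.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, "$1");
}

// Usage: the inner document from the example, flattened onto one line.
const cdataExample = "<inner><a><![CDATA[ free text with << > ]]></a></inner>";
console.log(wrapInCdata(cdataExample));
```

A real consumer would read the element text through its XML parser rather than a regular expression; the parser returns the concatenated section contents, which is the original inner document.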
Shai Ben-Yehuda
2

Yes: in your top-most document, give the wrapping element the data type bin.base64 instead of using a CDATA section. That way, even if the document you're wrapping contains a CDATA section, you're protected. As an added bonus, your application will also support binary files (images, spreadsheets, etc.).

Here's some code that does it, based on Microsoft ADO and MSXML.

function wrapBinaryFile( strFileName)
{
    var ado_stream = new ActiveXObject("ADODB.Stream");
    var xml = new ActiveXObject("Msxml2.DOMDocument.6.0"); // assumes MSXML 6.0 is installed
    xml.loadXML("<file/>");
    xml.documentElement.setAttribute( "name", strFileName );

    xml.documentElement.setAttribute("xmlns:dt","urn:schemas-microsoft-com:datatypes");

    xml.documentElement.dataType = "bin.base64";
    ado_stream.Type = 1; // 1=adTypeBinary
    ado_stream.Open();
    ado_stream.LoadFromFile( strFileName );
    xml.documentElement.nodeTypedValue = ado_stream.Read(-1); // -1=adReadAll
    ado_stream.Close();
    return xml;
}

And how to un-wrap it on the other end...

function unwrapBinaryFile(ndFile, strFileName )
{
    var ado_stream = new ActiveXObject("ADODB.Stream");
    ndFile.dataType = "bin.base64";

    ado_stream.Type = 1; // 1=adTypeBinary
    ado_stream.Open();
    ado_stream.Write( ndFile.nodeTypedValue );
    ado_stream.SaveToFile( strFileName, 2 ); // 2=adSaveCreateOverWrite
    ado_stream.Close(); 
}
William Walseth
  • Can you do something similar to this on the HttpWebRequest? I need to be able to nest the CDATA tags within the HttpWebRequest. Thanks! – ptn77 Oct 15 '15 at 16:12
  • Sure, create a bin.base64 XML document and then post via the HTTP Request. Remember your wrapping in the XML. What you do with the resulting XML (save to disk, post to server) is up to you. – William Walseth Oct 15 '15 at 16:44
  • How would you specify the contenttype on the HttpWebRequest so that it allows the nested CDATA? I have the following code which fails to show up at the server because of the nested CDATA in the stringData: HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create(Url); myReq.Method = "POST"; byte[] lbPostBuffer = System.Text.Encoding.GetEncoding(1252).GetBytes(stringData); myReq.ContentType = "text/xml; charset=utf-8"; myReq.Accept = "text/xml"; – ptn77 Oct 15 '15 at 16:54
  • See http://stackoverflow.com/questions/17535872/http-post-xml-data-in-c-sharp for a good example – William Walseth Oct 15 '15 at 20:39
  • Thanks for your help. I got around my problem by using the HttpClient with models and serialization with the Web api framework. – ptn77 Oct 15 '15 at 22:37
2

First XML:

<root>
    <data1 value="test1" />
    <data2>
        <value>test2</value>
    </data2>
</root>

Second XML:

<root2>
    <data3 value="test3" />
    <data4>
        <value>test4</value>
    </data4>
</root2>

You can include the content of the second XML in the first under a dedicated node:

<root>
    <data1 value="test1" />
    <data2>
        <value>test2</value>
    </data2>
    <dataFromSecondXML>
        <data3 value="test3" />
        <data4>
            <value>test4</value>
        </data4>
    </dataFromSecondXML>
</root>
TeChn4K
1

XML is hierarchic: why can't you nest the documents directly, without CDATA? Apart from DTD issues, any XML document can be copied as the content of an element in another document.
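
As a rough illustration of nesting directly, here is a minimal string-level sketch in plain JavaScript. It is not from the original answer: `stripXmlDeclaration` and `nestDocument` are hypothetical names, and the sketch assumes the inner document has no DOCTYPE, that the two documents have no conflicting namespace prefixes, and that both are already in the same character encoding. With a DOM API you would normally import the inner root element into the outer document instead of splicing strings.

```javascript
// Hypothetical helper: drop a leading XML declaration (<?xml ... ?>), which
// may only appear at the very start of a document and so cannot be nested.
function stripXmlDeclaration(xml) {
  return xml.replace(/^\uFEFF?\s*<\?xml\s[\s\S]*?\?>\s*/, "");
}

// Hypothetical helper: place the inner document inside a new wrapper element,
// inserted just before the closing tag of the outer root element.
// Assumes outerRootName is the exact (unprefixed) name of the outer root.
function nestDocument(outerXml, innerXml, wrapperName, outerRootName) {
  const closingTag = "</" + outerRootName + ">";
  const at = outerXml.lastIndexOf(closingTag);
  if (at === -1) {
    throw new Error("closing tag " + closingTag + " not found in outer document");
  }
  const wrapped =
    "<" + wrapperName + ">" + stripXmlDeclaration(innerXml) + "</" + wrapperName + ">";
  return outerXml.slice(0, at) + wrapped + outerXml.slice(at);
}
```

Any CDATA sections inside the inner document need no escaping here, because they remain ordinary markup of the combined document.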

Michael Kay
0

The short answer is that XML is not meant to be used in this way!

However, if you base64-encode the XML file to be packaged, the encoded result will not contain any characters which might be interpreted either as markup, or as entity references, and can safely be held as the contents of a text node.
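
A minimal sketch of that idea, assuming a Node.js environment (for the global `Buffer`) and UTF-8 text. The function names and the element name passed in are hypothetical and not taken from the answer above.

```javascript
// Hypothetical helper: Base64-encode the inner XML and place it in an element.
// The Base64 alphabet (A-Z, a-z, 0-9, +, /, =) contains no markup characters.
function embedAsBase64(innerXml, elementName) {
  const encoded = Buffer.from(innerXml, "utf8").toString("base64");
  return "<" + elementName + ">" + encoded + "</" + elementName + ">";
}

// Hypothetical inverse: find the element by plain string search and decode it.
// A real consumer would read the text node through an XML parser instead.
function extractFromBase64(wrappedXml, elementName) {
  const open = "<" + elementName + ">";
  const close = "</" + elementName + ">";
  const start = wrappedXml.indexOf(open);
  const end = wrappedXml.indexOf(close, start + open.length);
  if (start === -1 || end === -1) {
    throw new Error("element <" + elementName + "> not found");
  }
  const encoded = wrappedXml.slice(start + open.length, end);
  return Buffer.from(encoded, "base64").toString("utf8");
}
```

Base64 output is about one third larger than its input, which may matter when the embedded file is large.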

Max
  • If XML documents are not to be embedded in other XML documents, I'd guess XSLT and SOAP have something of a problem. – G_H Nov 14 '11 at 16:43
  • XML elements can certainly be embedded in other XML documents. However, an XML document SHOULD include an XML declaration, which can only appear at the start of the document and hence cannot be embedded without encoding it in some way. – Max Nov 14 '11 at 16:56
  • It's usually little work to get around that. Whether using SAX or DOM, or some other technology, using only the root node as an element or turning an element into a new document is trivially easy. Copying over the XML declaration could even prove harmful if the specified encoding is no longer appropriate. – G_H Nov 14 '11 at 16:59