It's not clear what you're trying to do, but this might help clear things up.
A <![CDATA[...]]> entry isn't a tag, it's a block, and is treated differently by the parser. When the block is encountered, the <![CDATA[ and ]]> are stripped off, so you'll only see the string inside. See "What does <![CDATA[]]> in XML mean?" for more information.
If you're trying to create a CDATA block in XML it can be done easily using:
doc = Nokogiri::XML(%(<string name="key"></string>))
doc.at('string') << Nokogiri::XML::CDATA.new(doc, "Hey I'm a tag with & and other characters")
doc.to_xml # => "<?xml version=\"1.0\"?>\n<string name=\"key\"><![CDATA[Hey I'm a tag with & and other characters]]></string>\n"
<< is just shorthand for add_child, which appends a child node.
Trying to use inner_html doesn't do what you want, as it creates a text node as a child:
doc = Nokogiri::XML(%(<string name="key"></string>))
doc.at('string').inner_html = "Hey I'm a tag with & and other characters"
doc.to_xml # => "<?xml version=\"1.0\"?>\n<string name=\"key\">Hey I'm a tag with &amp; and other characters</string>\n"
doc.at('string').children.first.text # => "Hey I'm a tag with & and other characters"
doc.at('string').children.first.class # => Nokogiri::XML::Text
Using inner_html causes the string to be HTML-encoded when the document is serialized, which is the alternative way of embedding text that could include tags. Without that encoding, or a CDATA block, an XML parser could get confused about what is text versus what is a real tag. I've written RSS aggregators, and having to deal with incorrectly encoded embedded HTML in a feed is a pain.