I am reading an XML file with XML::Simple (XMLin), doing some substitutions in some of its attributes, and then writing it to another file with XMLout. I noticed that some of the elements contained CDATA sections before, but after XMLout they no longer do.

Input example: <name><![CDATA[some text here]]></name>

Output : <name>some text here</name>

Is there an option to keep the CDATA sections? (I know what CDATA stands for and why it's used.)

ikegami
Adrian
  • 6
    Obligatory link: [Why is XML::Simple "Discouraged"?](http://stackoverflow.com/questions/33267765/why-is-xmlsimple-discouraged) – ThisSuitIsBlackNot Mar 19 '17 at 13:56
  • @ThisSuitIsBlackNot I understand their points, but I'd still like to keep it `Simple`. I have used it a lot in a very large area and it didn't dissapoint me so far. – Adrian Mar 19 '17 at 14:21
  • 1
    It is not simple when even the author himself is [recommending](https://metacpan.org/pod/XML::Simple#STATUS-OF-THIS-MODULE) not to use it. Who knows, you may be running into one of the unknown side-effects they speak of. – stevieb Mar 19 '17 at 15:07
  • 2
    @Adrian: If you *"know what CDATA stands for and why it's used"* then you will know that the plain and CDATA representation of the string are equivalent. No application that handles XML properly will care which is used. `XML::Simple` is *naïve* rather than being simple to use, and you are strongly recommended to avoid it. *"it didn't dissapoint me so far"* Perhaps it just did? – Borodin Mar 19 '17 at 17:16
  • 1
    See [Preserving CDATA Tags When Saving XML](http://www.perlmonks.org/?node_id=701522) on PerlMonks. – ThisSuitIsBlackNot Mar 19 '17 at 18:17
  • 1
    Possible duplicate of [Why is XML::Simple "Discouraged"?](http://stackoverflow.com/questions/33267765/why-is-xmlsimple-discouraged) – Shaishav Jogani Mar 19 '17 at 18:43
  • 1
    Re "*but I'd still like to keep it `Simple`*", It's the most complex parser to use. It's particularly awful at outputting XML. That said, the two XML documents you posted are 100% equivalent. – ikegami Mar 20 '17 at 15:47
  • As the author of XML::Simple you might think I'm here to defend the module - I'm not. Also, I can confirm it cannot do what you're asking with respect to CDATA sections. I personally use XML::LibXML and find it both simpler and more consistent. I have written a [tutorial on XML::LibXML](http://grantm.github.io/perl-libxml-by-example/) to supplement the reference documentation. – Grant McLean Mar 22 '17 at 00:12

2 Answers

The fact that the text was provided via a CDATA section is lost during parsing. Furthermore, XML::Simple never produces CDATA sections.

Note that the two XML documents you presented are 100% equivalent. But if you absolutely want to preserve the CDATA sections, I recommend switching to XML::LibXML[1].

$ perl -MXML::LibXML -e'
   my $xml = "<name><![CDATA[some text here]]></name>";
   XML::LibXML->new->parse_string($xml)->toFH(\*STDOUT);
'
<?xml version="1.0"?>
<name><![CDATA[some text here]]></name>

The conversion should be relatively simple since both XML::Simple and XML::LibXML provide functionally similar interfaces. For example,

  • XML::Simple: `my $val = $node->{attr};` / XML::LibXML: `my $val = $node->getAttribute('attr')`
  • XML::Simple: `$node->{attr} = $val;` / XML::LibXML: `$node->setAttribute('attr', $val)`
  • XML::Simple: `for (@$node)` / XML::LibXML: `for ($node->getChildren())`
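
As a rough sketch of how such a conversion might look for the read, substitute, write workflow in the question: the element names, the `lang` attribute and the substitution itself are made up for illustration, and `findnodes`, `childNodes`, `data` and `setData` are XML::LibXML methods that are not part of the mapping above. The idea is to edit text and CDATA nodes in place so that each node keeps its original type.

```perl
use strict;
use warnings;
use XML::LibXML qw(:libxml);   # also imports the XML_*_NODE constants

# Hypothetical input; element and attribute names are assumptions.
my $xml = <<'XML';
<items>
  <name lang="en"><![CDATA[some text here]]></name>
  <name lang="en">plain text here</name>
</items>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

for my $name ($doc->findnodes('//name')) {
    # Attribute access replaces the XML::Simple hash lookups.
    $name->setAttribute('lang', uc $name->getAttribute('lang'));

    # Change the character data of text and CDATA children in place,
    # so a CDATA section stays a CDATA section.
    for my $child ($name->childNodes) {
        my $type = $child->nodeType;
        next unless $type == XML_TEXT_NODE
                 || $type == XML_CDATA_SECTION_NODE;
        (my $text = $child->data) =~ s/text/content/;
        $child->setData($text);
    }
}

my $out = $doc->toString;
print $out;
```

With this approach the first `name` element should still be serialized with its `<![CDATA[...]]>` wrapper, while the second remains plain text.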

  1. I recommend switching no matter what. It'll make your life so much simpler!
ikegami

Look, I know you say in the comments that you want to keep it simple by using XML::Simple. But that's a misnomer. XML::Simple isn't simple - it's actually quite complicated. It's for "simple" XML.

And it's "discouraged" (even the module info says that), and if you have a read through, you'll see why.

But some truly excellent alternatives exist. I'd suggest taking a look at either XML::Twig - which has a gentler learning curve - or XML::LibXML, which is more fully featured. If you give us some example XML and what you've tried so far, we can give you an example that'll do what you want. It'll likely be simpler than what you've done so far, too.

Sobrique