NOTE: Problem solved but please read ikegami's response below. Terrifically informative, especially the link about avoiding XML::Simple.

I just started working with a corporation that makes extensive use of XML::Simple, and we are now having parsing issues.

Here's a sample XML file... ( note first part commented out )

<xyz:CostFee>
        <ec:OPA>25.00</ec:OPA>
        <ec:CTID>278421</ec:CTID>
        <xyz:CDEPSID>82</xyz:CDEPSID>
        <ec:IID>8765654</ec:IID>
</xyz:CostFee>

I am using this simple perl script ....

#!/usr/bin/perl

use XML::Simple;
use Data::Dumper;

my $content = XMLin('./data.xml');
print Dumper($content);

Running the script yields this.....

Undeclared prefix: xyz at /System/Library/Perl/Extras/5.18/XML/NamespaceSupport.pm line 298.
XML::Simple called at ./xml_test.pl line 6.

When I use this in the XML file...

<catalog>
        <part partnum="184324" desc="Desc 1" price="19.00" />
        <part partnum="765398" desc="Desc 2" price="18.00" />
        <part partnum="878998" desc="Desc 3" price="15.00"/>
</catalog>

It works just fine and Dumper happily dumps it out.....

Since this is a legacy program, replacing XML::Simple isn't desired (though honestly I don't think you can register a namespace in XML::Simple, but I am by no means an expert).

Can anyone point me in the right direction with a pointer or two? I am thinking that including namespace info as part of the XML content might be the way to go, something like......

<xsl:stylesheet version="1.0" xmlns:xsl="https://www.w3.org/1999/XSL/Transform">

Many thanks JW

Jane Wilkie

2 Answers

The element must carry, or be a descendant of an element that carries, the following attribute:

xmlns:xyz="..."

(As much as you should avoid XML::Simple, changing parsers isn't going to work if you have invalid XML.)

For example, changing

<doc>
   <xyz:CostFee>
      <ec:OPA>25.00</ec:OPA>
      <ec:CTID>278421</ec:CID>
      <xyz:CDEPSID>82</xyz:CDEPSID>
      <ec:IID>8765654</ec:IID>
   </xyz:CostFee>
</doc>

to

<doc xmlns:xyz="..." xmlns:ec="...">
   <xyz:CostFee>
      <ec:OPA>25.00</ec:OPA>
      <ec:CTID>278421</ec:CTID>
      <xyz:CDEPSID>82</xyz:CDEPSID>
      <ec:IID>8765654</ec:IID>
   </xyz:CostFee>
</doc>

allows the document to be parsed. (Note the addition of the prefix declarations, as well as the change of the closing tag from </ec:CID> to </ec:CTID>. Use the proper URNs instead of "...".)
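
As a minimal sketch of the corrected document being parsed, the script below feeds it to XMLin as a string. The `urn:example:xyz` and `urn:example:ec` values are hypothetical placeholders standing in for the real namespace names, which the question does not give.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Simple;
use Data::Dumper;

# The urn:example:* namespace names are hypothetical placeholders;
# substitute the ones your documents are actually defined with.
my $xml = <<'XML';
<doc xmlns:xyz="urn:example:xyz" xmlns:ec="urn:example:ec">
   <xyz:CostFee>
      <ec:OPA>25.00</ec:OPA>
      <ec:CTID>278421</ec:CTID>
      <xyz:CDEPSID>82</xyz:CDEPSID>
      <ec:IID>8765654</ec:IID>
   </xyz:CostFee>
</doc>
XML

# XMLin treats a string containing "<" and ">" as XML rather than a filename.
my $content = XMLin($xml);
print Dumper($content);
```

Without NSExpand, the hash keys are still the prefixed names as written in the document (for example `xyz:CostFee`), so code that reads them remains tied to the prefixes.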

ikegami
  • My solution was to use XML::Parser on the backend instead of XML::SAX, which solved my issue. The link you provided regarding avoiding XML::Simple was *SO* informative and I plan to use that as justification to get rid of it altogether. I thank you so much ikegami. JW – Jane Wilkie Jan 09 '20 at 20:08
  • The XML::Simple+XML::Parser combo is not aware of namespaces, so it doesn't make sense to use that combo with documents that use namespaces. You'd have to hardcode the prefixes rather than the namespaces, which is a big no-no. One is expected to know an element's namespace, not what prefix (if any) is used to indicate it. The identifier used for the prefix is meaningless and can differ between documents without affecting schema compliance. – ikegami Jan 09 '20 at 20:35
  • For example, documents that differ only in the prefix bound to the same namespace are equivalent, but XML::Simple+XML::Parser parses them differently. – ikegami Jan 09 '20 at 20:41
  • OK, so if XML::Simple+XML::SAX and XML::Simple+XML::Parser both exhibit namespace issues, what is XML::LibXML using to be so successful? Just trying to understand. – Jane Wilkie Jan 10 '20 at 15:32
  • The XML above was corrected the first time you pointed that out. In any case, this issue is closed and your input was invaluable. – Jane Wilkie Jan 10 '20 at 15:59
  • Moving this to chat for future discussion – Jane Wilkie Jan 10 '20 at 16:01
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/205761/discussion-between-jane-wilkie-and-ikegami). – Jane Wilkie Jan 10 '20 at 16:02
  • [Summary] XML::Simple+XML::SAX with `NSExpand => 1` doesn't exhibit namespace issues. Of course, you still have all the other problems of using XML::Simple. With that setting, you can use namespaces instead of prefixes, as you should. /// The problem you asked about wasn't a namespace issue; it was invalid XML. – ikegami Jan 10 '20 at 16:14
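
To illustrate the `NSExpand => 1` point from the comments, here is a rough sketch, assuming XML::SAX is installed (the option is SAX-only; with XML::Parser it has no effect). The namespace name `urn:example:fees` and the three small documents are hypothetical, made up for this illustration. With NSExpand, element keys take the form `{namespace}localname`, so the same lookup should work whichever prefix a document happens to use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Simple;

# Hypothetical namespace name, used only for this illustration.
my $ns = 'urn:example:fees';

# Three spellings of the same document: two different prefixes
# and a default namespace, all bound to the same namespace name.
my @docs = (
    qq{<a:doc xmlns:a="$ns"><a:OPA>25.00</a:OPA></a:doc>},
    qq{<b:doc xmlns:b="$ns"><b:OPA>25.00</b:OPA></b:doc>},
    qq{<doc xmlns="$ns"><OPA>25.00</OPA></doc>},
);

# Look elements up by namespace plus local name, never by prefix.
my $key = '{' . $ns . '}OPA';

my @values = map { XMLin($_, NSExpand => 1)->{$key} } @docs;
print defined $_ ? "$_\n" : "(not found)\n" for @values;
```

The lookup key is built from the namespace name alone, which is the practice the comments recommend over hardcoding prefixes.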

One-line fix: just set

$XML::Simple::PREFERRED_PARSER = "XML::Parser";

before calling XMLin.
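
A minimal sketch of that setting in context, assuming XML::Parser is installed, using the sample from the question passed in as a string. As noted in the comments on the other answer, this combination is not namespace-aware: the undeclared prefixes are simply kept as part of the element names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Simple;
use Data::Dumper;

# Force the XML::Parser backend before XMLin is called.
$XML::Simple::PREFERRED_PARSER = 'XML::Parser';

# The sample from the question, with no namespace declarations.
my $xml = <<'XML';
<xyz:CostFee>
        <ec:OPA>25.00</ec:OPA>
        <ec:CTID>278421</ec:CTID>
        <xyz:CDEPSID>82</xyz:CDEPSID>
        <ec:IID>8765654</ec:IID>
</xyz:CostFee>
XML

# Keys keep their prefixes verbatim (e.g. 'ec:OPA'), so any code
# reading them depends on the exact prefixes used in the document.
my $content = XMLin($xml);
print Dumper($content);
```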

Jane Wilkie