I have the following XML file, t.xml:

<?xml version="1.0"?>
<ArrayOfFiles xmlns="Our.Files" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
        <File>
                <DownloadCount>1</DownloadCount>
                <Id>11</Id>
        </File>
        <File>
                <DownloadCount>2</DownloadCount>
                <Id>22</Id>
        </File>
</ArrayOfFiles>

The xmlns declaration seems to be invalid, and xmlstarlet complains about it. For example, running:

xmlstarlet sel -t -v "//File/Id" t.xml

prints

t.xml:2.32: xmlns: URI Our.Files is not absolute
<ArrayOfFiles xmlns="Our.Files" xmlns:i="http://www.w3.org/2001/XMLSchema-instan

Probably for the same reason, I can't get the following Perl code to work either:

use 5.014;
use warnings;
use XML::LibXML;

my $dom = XML::LibXML->new->parse_file('t.xml');
my $res = $dom->findnodes('//File/Id');
say $_->textContent for $res->get_nodelist;

When I omit the xmlns declarations and parse this modified XML file instead:

<?xml version="1.0"?>
<ArrayOfFiles>
    <File>
        <DownloadCount>1</DownloadCount>
        <Id>11</Id>
    </File>
    <File>
        <DownloadCount>2</DownloadCount>
        <Id>22</Id>
    </File>
</ArrayOfFiles>

the above code does what I mean and prints:

11
22

The question is: how can I parse the original XML file? It is downloaded from an external site, so I have to deal with it as it is.

ikegami
kobame

1 Answer

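That's just a warning; the file is still parsed. The query returns nothing because the elements belong to the default namespace Our.Files, and an unprefixed name in XPath 1.0 matches only elements in no namespace. When working with XML namespaces, use XML::LibXML::XPathContext: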

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use XML::LibXML;
use XML::LibXML::XPathContext;


my $dom = 'XML::LibXML'->load_xml(location => shift);

my $xpc = 'XML::LibXML::XPathContext'->new($dom);
$xpc->registerNs(o => 'Our.Files');

my $res = $xpc->findnodes('//o:File/o:Id');
say $_->textContent for $res->get_nodelist;
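
As a further sketch of the same technique, the registered prefix can also be used in XPath expressions evaluated relative to each File node, for example to read Id and DownloadCount together. The helper name `file_records` and the inlined copy of t.xml are assumptions made for illustration; the prefix `o` is arbitrary, as above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature qw{ say };

use XML::LibXML;
use XML::LibXML::XPathContext;

# Same structure as t.xml, inlined so the sketch is self-contained.
my $xml = <<'XML';
<?xml version="1.0"?>
<ArrayOfFiles xmlns="Our.Files" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <File><DownloadCount>1</DownloadCount><Id>11</Id></File>
    <File><DownloadCount>2</DownloadCount><Id>22</Id></File>
</ArrayOfFiles>
XML

# Hypothetical helper: returns one [Id, DownloadCount] pair per File element.
sub file_records {
    my ($xml_string) = @_;

    my $dom = 'XML::LibXML'->load_xml(string => $xml_string);
    my $xpc = 'XML::LibXML::XPathContext'->new($dom);

    # Bind a prefix of our choosing to the document's default namespace.
    $xpc->registerNs(o => 'Our.Files');

    # The second argument to findvalue is the context node, so the
    # relative paths are evaluated inside each File element.
    return map {
        [ $xpc->findvalue('o:Id', $_), $xpc->findvalue('o:DownloadCount', $_) ]
    } $xpc->findnodes('//o:File');
}

say join "\t", @$_ for file_records($xml);
```

The prefix in the XPath expression does not have to match anything in the document; only the namespace URI passed to registerNs has to match the one in the xmlns declaration.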
choroba
  • YES!!! This is exactly what I need! I had read the docs and seen the namespace-related warnings (in several places), but honestly I don't understand how to use it yet. :) Thank you very, very much. – kobame Jun 15 '17 at 13:30
  • Hmm, I wonder why you use the quoted `'XML::LibXML'->load` instead of the plain `XML::LibXML->load`, but that should be another question... :) – kobame Jun 15 '17 at 13:32
  • @kobame See [Invoking Class Methods](https://perldoc.perl.org/perlobj.html#Invoking-Class-Methods) ... It forces the invocation of `new` in `XML::LibXML::XPathContext` even if there is a function `XML::LibXML::XPathContext` in scope ... I've never been bitten by this, so I never use it, but it does ensure correct behavior in that corner case. – Sinan Ünür Jun 15 '17 at 13:37
  • See also https://stackoverflow.com/a/23312776/1030675 and http://www.perlmonks.org/?node_id=1083985 linked from there for nasty details. – choroba Jun 15 '17 at 14:08
  • Here is an interesting link from the Perl-XML project: [Perl-XML FAQ](http://perl-xml.sourceforge.net/faq/). Section _8.7. Using XPath with Namespaces_ gives additional information regarding this "issue". – TonioGA Dec 19 '17 at 09:36
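
The corner case mentioned in the comments about quoted class names can be sketched as follows. The class names `Counter` and `Impostor` and the method `kind` are hypothetical, made up for this illustration; the behaviour shown is the one described in the "Invoking Class Methods" section of perlobj.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature qw{ say };

{
    package Counter;     # hypothetical class
    sub new  { return bless {}, shift }
    sub kind { return 'Counter' }
}
{
    package Impostor;    # hypothetical class
    sub new  { return bless {}, shift }
    sub kind { return 'Impostor' }
}

# A function in the current package that happens to share the class's name.
sub Counter { return 'Impostor' }

# With that function in scope, the bareword form is parsed as
# Counter()->new, so new is called on whatever the function returns.
my $bare = Counter->new;

# Quoting the class name makes it a plain string, so the method is
# looked up in the Counter package regardless of the function above.
my $quoted = 'Counter'->new;

say 'bareword: ', $bare->kind;
say 'quoted:   ', $quoted->kind;
```

Writing the class name as `Counter::->new` is another way to force the class-name interpretation.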