I am trying to get the value of the name attribute of any section elements in some XML data.

my $some_att = $fileLocation->findnodes("//section[/@name]");

Can someone please explain what is wrong with this syntax?

Please note that the variable $fileLocation here opens the file location for the XML I am working with.

Borodin
  • Are you using `XML::LibXML`? – Borodin Nov 02 '17 at 22:23
  • `//section[/@name]` means "`section` elements in the null namespace which have a root element with an attribute named `name` in the null namespace." On the other hand, `//section[@name]` means "`section` elements in the null namespace which have an attribute named `name` in the null namespace." – ikegami Nov 03 '17 at 19:47

2 Answers

I'm assuming that you're using XML::LibXML?

It's very important to explain what tools (library, language, operating system) you are using, as well as the errant behaviour you are seeing.

Your "Please note that the variable $fileLocation here opens the file location for the XML I am working with" is troubling. It doesn't make much sense (it's a variable and cannot open anything) and the identifier you have chosen implies that it is a path to the XML file. But to be able to call findnodes on it, it must be a DOM object, more specifically an instance of XML::LibXML::Node or a subclass.

Your code should look more like this

use strict;
use warnings;
use feature 'say';    # needed for say() below

use XML::LibXML;

my $xml_file = '/path/to/file.xml';

my $dom = XML::LibXML->load_xml(
    location => $xml_file
);

my @sections = $dom->findnodes('//section');

for my $section ( @sections ) {
    next unless $section->hasAttribute('name');
    say $section->getAttribute('name');
}

The result of the findnodes method in scalar context is not a single XML::LibXML::Node object, but instead an XML::LibXML::NodeList, which is overloaded so that it behaves similarly to a reference to an array.
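To make the difference between the two contexts concrete, here is a minimal sketch. The sample document is invented for illustration and is not the asker's data.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Hypothetical sample document, invented for illustration
my $dom = XML::LibXML->load_xml( string => <<'XML' );
<doc>
  <section name="intro"/>
  <section/>
  <section name="usage"/>
</doc>
XML

# Scalar context: a single XML::LibXML::NodeList object
my $nodelist = $dom->findnodes('//section');
say ref($nodelist);
say $nodelist->size;

# List context: a plain Perl list of node objects
my @nodes = $dom->findnodes('//section');
say scalar @nodes;
```

Both calls find the same three section elements; only the container differs.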

You don't say what errors you are getting, but from your "Can someone please explain what is wrong with this syntax?" I imagine that the module is rejecting your XPath expression?

In this statement

my $some_att = $fileLocation->findnodes("//section[/@name]");

the problem is with the predicate [/@name]. The intent was presumably to filter the section elements to include only those that have a name attribute. A path inside a predicate is evaluated relative to each section element, so it needs no leading slash; with the slash it becomes an absolute path starting at the document root, which has no attributes. It should be written as //section[@name]

But that will only find all section elements that have a name attribute. To select the attributes themselves you need to write //section/@name, something like this

my $section_names = $fileLocation->findnodes('//section/@name');

Then you will have an XML::LibXML::NodeList of XML::LibXML::Attr objects, and you can extract the list of their values using something similar to

my @section_names = map { $_->value } $section_names->get_nodelist;
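Put together, the two statements above might be used as in the following sketch. The sample XML is invented for illustration, and $dom stands in for whatever DOM object your $fileLocation actually holds.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Hypothetical sample document, invented for illustration
my $dom = XML::LibXML->load_xml( string => <<'XML' );
<doc>
  <section name="intro"/>
  <section/>
  <section name="usage"/>
</doc>
XML

# Scalar context: an XML::LibXML::NodeList of XML::LibXML::Attr objects
my $section_names = $dom->findnodes('//section/@name');

# Extract the string value of each attribute node
my @section_names = map { $_->value } $section_names->get_nodelist;

say for @section_names;
```

The section element without a name attribute contributes nothing to the result, so no hasAttribute check is needed with this approach.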

You may instead prefer to start with a list of all section elements using the XPath expression //section. That would give you a collection of XML::LibXML::Element objects, from which you can extract the name attribute using $elem->getAttribute('name')

Remember that you may work with arrays instead of XML::LibXML::NodeList objects if you prefer, by choosing list context instead of scalar context in the call to findnodes as described in mob's answer.

Borodin

I don't know Perl but I assume that findnodes() is designed to evaluate an XPath expression. Your expression

"//section[/@name]"

is syntactically correct, but semantically, it's nonsense. (As an aside, I wonder how people come up with such things? I can only imagine you're cutting and pasting from examples that you don't understand, without ever going back to the spec to see what it actually means).

Two main errors here.

  • Firstly, square brackets represent a predicate or filter: you're selecting the sections that satisfy some condition, but your prose requirement (a) says you want to retrieve names (not sections), and (b) says nothing about filtering the list.
  • Secondly, /@name is void. A '/' at the start of an expression selects the root (document) node, while @name selects an attribute. Document nodes don't have attributes, so this selects nothing.

The expression you want is //section/@name. (What you do with the names once you get them into Perl-space is outside my knowledge.)
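For readers who want to check this for themselves, a small Perl sketch using XML::LibXML could compare the three expressions discussed on this page. The sample document is invented for illustration, and the counts depend entirely on it.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Hypothetical sample document, invented for illustration
my $dom = XML::LibXML->load_xml( string => <<'XML' );
<doc>
  <section name="intro"/>
  <section/>
  <section name="usage"/>
</doc>
XML

my %count;
for my $xpath ( '//section[/@name]', '//section[@name]', '//section/@name' ) {
    # List context returns the matching nodes as a plain list
    my @nodes = $dom->findnodes($xpath);
    $count{$xpath} = @nodes;
    say "$xpath selects $count{$xpath} node(s)";
}
```

With this document, the first expression should select nothing, the second should select the two section elements that carry a name attribute, and the third should select the two name attributes themselves.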

Michael Kay
  • `XML::LibXML` is just a Perl wrapper for `libxml2`, which has [any number of language bindings and wrappers](http://xmlsoft.org/python.html). If you know the library in any language (and it's hard to get away from if you work with XML data) then the Perl ideas and classes should mirror what you already know pretty well. – Borodin Nov 03 '17 at 14:25
  • Thanks Michael. Of course you're right. Developing an XSLT engine does wonders for one's knowledge of XPath syntax! The author's *"Can someone please explain what is wrong with this syntax?"* led me to assume that they were seeing a syntax error reported, and I'm not confident enough with my knowledge of XPath to reject that. (By the way, I graduated in Reading, working with Algol 68 on punched cards on an ICL 1904S!) – Borodin Nov 03 '17 at 14:32