I have an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<!-- A comment -->
<a xmlns="http://www.tei-c.org/ns/1.0">
<w>word
</w>
<w>wording
</w>
</a>
</doc>
I would like to return nodes containing "word" but not "wording".
library(XML) # I have nothing against using library(xml2) or library(xml2r) instead
doc <- xmlParse("file.xml", encoding="UTF-8") # parsed document, used as `doc` in the queries below
x <- c(x="http://www.tei-c.org/ns/1.0")
# starts-with seems to find the words just fine
test1 <- getNodeSet(doc, "//x:w[starts-with(., 'word')]", x)
# but R doesn't seem to allow "matches" to be included
# in the xpath query, hence none of the following work:
test1 <- getNodeSet(doc, "//x:w[[matches(., 'word')]]", x)
test1 <- getNodeSet(doc, "//x:w[@*[matches(., 'word')]]", x)
test1 <- getNodeSet(doc, "//x:w[matches(., '^word$')]", x)
test1 <- getNodeSet(doc, "//x:w[@*[matches(., '^word$')]]", x)
Update: Whenever I use matches() in any of these combinations, I get the following error and an empty list as the result:
xmlXPathCompOpEval: function matches not found
XPath error : Unregistered function
XPath error : Invalid expression
XPath error : Stack usage error
Error in xpathApply.XMLInternalDocument(doc, path, fun, ..., namespaces = namespaces, :
error evaluating xpath expression //x:w[matches(., '^word$')]
If I instead query "//x:w[@*[contains(., '^word$')]]" without passing the namespace,
following the advice below, I get the following warning and an empty list as the result:
Warning message:
In xpathApply.XMLInternalDocument(doc, path, fun, ..., namespaces = namespaces, :
the XPath query has no namespace, but the target document has a default namespace.
This is often an error and may explain why you obtained no results
I imagine I am just using the wrong commands. What should I change to make it work? Thanks!
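For reference, here is a minimal sketch of what I am trying to achieve, under the assumption that the XML package only evaluates XPath 1.0 (via libxml2), where matches() is not available. It inlines the sample document as a string instead of reading file.xml, and the variable names (xml_text, ns, exact, regex_hits) are only illustrative. The first query uses normalize-space() to trim the trailing newline inside each <w> before comparing; the second does the regular-expression filtering in R rather than in XPath.

```r
library(XML)

# Sample document from above, inlined so the sketch is self-contained
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<doc>
<!-- A comment -->
<a xmlns="http://www.tei-c.org/ns/1.0">
<w>word
</w>
<w>wording
</w>
</a>
</doc>'

doc <- xmlParse(xml_text, asText = TRUE)

# Bind the document's default namespace to the prefix "x" for use in XPath
ns <- c(x = "http://www.tei-c.org/ns/1.0")

# Option 1: pure XPath 1.0. normalize-space() strips the surrounding
# whitespace, so only the element whose text is exactly "word" is kept.
exact <- getNodeSet(doc, "//x:w[normalize-space(.) = 'word']", ns)

# Option 2: select every <w>, then filter with a regular expression in R.
all_w <- getNodeSet(doc, "//x:w", ns)
txt <- sapply(all_w, xmlValue)
regex_hits <- all_w[grepl("^\\s*word\\s*$", txt)]
```

Both options should leave out the "wording" element. The second one is probably the closer substitute for matches(), since any R regular expression can be used in grepl().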