Given the XML below, I want to get back the same XML without the idx element or its content:

<title>test title
<no-translation>no translation</no-translation> end no translation
<idx> <i1>index</i1></idx>
end idx
</title>

When using the XPath expression .//title//text()[not(parent::i1)] or .//title//text()[not(ancestor::idx)], I get the text back, but split into an array of separate text nodes:

Text='test title
'
Text='no translation'
Text=' end no translation
'
Text=' '
Text='
end idx

However, what I want is not the text but the title element itself, without the idx element:

<title>test title
<no-translation>no translation</no-translation> end no translation
end idx
</title>

So far this seems to be impossible: applying the predicate to the element does not remove any content from the XML. Is there a way to use [not(ancestor::idx)] with './title'?

Tomalak
Ottertoss
  • XPath is a *selection* language. You can select what's there. You cannot create new XML. The output you want is new XML - it does not exist in the source document. You cannot use XPath to select that. – Tomalak Apr 04 '22 at 16:55
  • (...what you can do is use XPath to select XML nodes you want to e.g. remove from the document until the document is shaped the way you want it. But that requires a host language with the ability to modify XML documents, XPath alone is not enough.) – Tomalak Apr 04 '22 at 16:58
  • ...right, and the usual complement to XPath's selection capability is XSLT's construction/transformation capabilities, but other XPath host languages can be used as well. – kjhughes Apr 04 '22 at 16:59
  • After some more progress, I have now hit the same issue as https://stackoverflow.com/questions/9493732/difference-between-text-and-string: the result of the XPath 2.0 expression is a sequence of nodes, of which only the first is taken. But without 'text()', XPath seems to ignore (or not process) the '[not(ancestor::idx)]' predicate. – Ottertoss Apr 05 '22 at 11:25
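
To illustrate the point made in the comments (select the unwanted nodes, then let a host language remove them), here is a minimal sketch in Python using only the standard library's xml.etree.ElementTree. The host language is an assumption, since the question does not name one, and the helper strip_elements is a hypothetical name made up for this example. ElementTree stores the text that follows an element in that element's tail, so the sketch moves the tail onto the preceding sibling (or the parent's text) before removing the element; otherwise "end idx" would be lost along with idx.

```python
import xml.etree.ElementTree as ET

# The XML from the question.
SOURCE = """<title>test title
<no-translation>no translation</no-translation> end no translation
<idx> <i1>index</i1></idx>
end idx
</title>"""


def strip_elements(root, tag):
    """Remove every descendant named `tag`, keeping the text that follows it.

    Hypothetical helper for illustration; not part of any library.
    """
    for parent in list(root.iter()):
        previous = None  # last sibling that was kept
        for child in list(parent):
            if child.tag == tag:
                tail = child.tail or ""
                if previous is None:
                    # No kept sibling before it: the tail joins the parent's text.
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return root


title = strip_elements(ET.fromstring(SOURCE), "idx")
result = ET.tostring(title, encoding="unicode")
print(result)
```

The printed element keeps the no-translation child and the "end idx" text, with a blank line left where idx used to be. With a library that supports full XPath, the nodes to remove could be selected with an expression such as .//idx instead of the tag comparison used here.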

1 Answer

I did find an answer to this in the end with XPath:

string-join(.//title//text()[not(ancestor::idx)])

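I found this via the post "XPath Node to String".

So the difference comes down to the XPath version and how it returns nodes here: in XPath 2.0 the path yields a sequence of text nodes, and a caller that expects a single string takes only the first one, whereas string-join concatenates the whole sequence into one string. Note that the single-argument form of string-join comes from XPath 3.0; on a strict XPath 2.0 processor the separator has to be passed explicitly, as in string-join(.//title//text()[not(ancestor::idx)], '').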
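
For readers without an XPath 2.0 processor, the same result can be sketched in plain Python with the standard library's xml.etree.ElementTree. This is an illustration under assumptions, not what I actually ran: ElementTree has no text() node test or ancestor axis, so the hypothetical helper text_outside walks the tree and skips the content of idx by hand, then joins the pieces the way string-join does.

```python
import xml.etree.ElementTree as ET

# The XML from the question.
SOURCE = """<title>test title
<no-translation>no translation</no-translation> end no translation
<idx> <i1>index</i1></idx>
end idx
</title>"""


def text_outside(element, skipped_tag):
    """Concatenate all text under `element` that is not inside `skipped_tag`.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = [element.text or ""]
    for child in element:
        if child.tag != skipped_tag:
            parts.append(text_outside(child, skipped_tag))
        # The tail sits after the child's end tag, so it is outside the
        # skipped element and is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


joined = text_outside(ET.fromstring(SOURCE), "idx")
print(repr(joined))
```

The result is a single string rather than a list of text nodes, which is the property that made string-join work in my case.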

Ottertoss