I'm writing a script that reads an XML file and displays some of the text in it. A sample XML structure looks like this:

<documento fecha_actualizacion="20221027071750">
<metadatos>
[...]
</metadatos>
<analisis>
[...]
</analisis>
<texto>
<dl>
<dt>1. Poder adjudicador: </dt>
<dd>
[...]
</dd>
</dl>
</texto>
</documento>

I'm trying to get the HTML inside the 'texto' element as a string ('<dl><dt>1. Poder ad[...]</dt></dd>[...]'), but when I fetch it, it is shown as:

Array ( [0] => SimpleXMLElement Object ( [dl] => SimpleXMLElement Object ( [dt] => Array ( [0] => 1. Poder adjudicador: [1] => 2. Tip

grouped by element name (dl, dt, dd, etc.). I've tried every possible way of querying the 'texto' element ('//texto/text()', innerhtml, node(), nodeValue(), etc.), but it always returns the same result.

How can I get something like '<dl><dt>1. Poder ad[...]</dt></dd>[...]'?

Thank you!!

I have tried these selectors:

$texto = $xml->xpath('//texto/text()');
$texto = $xml->xpath('//texto/innerXml()');
$texto = $xml->xpath('//texto/node()');
$texto = $xml->xpath('//texto/nodevalue()');
  • The sample xml in your question is not well formed; please edit your question and fix so others can replicate the problem. – Jack Fleeting Nov 03 '22 at 11:55

1 Answer

You need to fetch the parent nodes (texto), iterate over their children, and serialize each child node as XML:

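// getXMLString() stands for whatever returns the XML source as a string
$documento = new SimpleXMLElement(getXMLString());

foreach ($documento->xpath('//texto') as $texto) {
  $result = '';
  // children() yields the element children of texto
  foreach ($texto->children() as $content) {
    $result .= $content->asXML();
  }

  var_dump($result);
}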

Output:

string(59) "<dl>
<dt>1. Poder adjudicador: </dt>
<dd>
[...]
</dd>
</dl>"

SimpleXML is an abstraction focused on element nodes, and it has limits. If the texto element can have non-element child nodes (text, comments, CDATA sections), they will not be included. In that case you need to use DOM:

$document = new DOMDocument();
$document->loadXML(getXMLString());
$xpath = new DOMXPath($document);

foreach ($xpath->evaluate('//texto') as $texto) {
  $result = '';
  // childNodes includes text and comment nodes, not only elements
  foreach ($texto->childNodes as $content) {
    $result .= $document->saveXML($content);
  }

  var_dump($result);
}
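
To make the difference between the two APIs concrete, here is a minimal sketch on mixed content. The XML string is a made-up sample (it is not from the question's file), chosen so that texto holds a text node, an element, and a comment:

```php
<?php
// Hypothetical mixed-content sample, not taken from the question's file.
$xml = '<documento><texto>Intro <b>bold</b> tail<!-- note --></texto></documento>';

// SimpleXML: children() yields element nodes only.
$sx = new SimpleXMLElement($xml);
$simple = '';
foreach ($sx->texto->children() as $child) {
    $simple .= $child->asXML();
}

// DOM: childNodes also yields text and comment nodes.
$dom = new DOMDocument();
$dom->loadXML($xml);
$texto = $dom->getElementsByTagName('texto')->item(0);
$full = '';
foreach ($texto->childNodes as $node) {
    $full .= $dom->saveXML($node);
}

echo $simple, "\n"; // <b>bold</b>
echo $full, "\n";   // Intro <b>bold</b> tail<!-- note -->
```

With SimpleXML the surrounding text and the comment are lost; with DOM the full inner XML survives.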

Additionally, DOMXPath::evaluate() supports full XPath 1.0, including expressions that return scalar values (strings, numbers, booleans) instead of node lists.
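
As a rough illustration of scalar results, the sketch below runs a few XPath 1.0 functions against a trimmed copy of the question's sample. The parts marked [...] in the question are left as literal placeholder text, and the variable names are only for illustration:

```php
<?php
// Trimmed version of the question's sample; "[...]" is literal placeholder text.
$xml = <<<'XML'
<documento fecha_actualizacion="20221027071750">
<texto>
<dl>
<dt>1. Poder adjudicador: </dt>
<dd>[...]</dd>
</dl>
</texto>
</documento>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

// count() yields a number (a PHP float), string() yields a PHP string.
$count = $xpath->evaluate('count(//texto//dt)');
$first = $xpath->evaluate('string(//texto/dl/dt[1])');
$fecha = $xpath->evaluate('string(/documento/@fecha_actualizacion)');

var_dump($count, $first, $fecha);
```

Note that string() returns only the text content of the first matching node, so it does not replace the saveXML() loop when you need the markup itself.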

ThW