I have several HTML paragraphs like this (always the same structure):
<p>
<!-- Gl. 1-4 -->
\( x(t) = x_0 · t^3 \)
[!equanchor? &id=`555`!]
</p>
I am successfully extracting the 555 with:
$xpath = new DomXPath($dom);
$paragraphs = $xpath->query('//p');
$equationids = [];
foreach ($paragraphs as $p) {
    $ptext = $p->nodeValue;
    // get equation id from anchor, e.g. [!equanchor? &id=`555`!];
    // only store an id when the pattern actually matched
    if (preg_match('/equanchor\?\s&id=`(\d+)`/', $ptext, $matches)) {
        $equationids[] = (int)$matches[1];
    }
}
Now I also need the text from the HTML comment, which is <!-- Gl. 1-4 -->, but I couldn't find out how to use the DOM parser (DomXPath) for this purpose. Unfortunately, neither $p->nodeValue nor $p->textContent contains the comment text.
This answer did not help me. I tried a "sub parser", but it failed to read $ptext or $p.
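For reference, here is a minimal, self-contained sketch of the setup together with one direction that might work: XPath has a comment() node test, and DOMXPath::query() accepts a context node as its second argument, so the comment could possibly be queried relative to each $p. The sample markup in $html, the $equations array and its 'id' / 'label' keys are assumptions made up for this illustration; they are not part of my real code.

```php
<?php
// Hypothetical sample input mirroring the paragraph structure shown above.
$html = <<<'HTML'
<p>
<!-- Gl. 1-4 -->
\( x(t) = x_0 · t^3 \)
[!equanchor? &id=`555`!]
</p>
HTML;

// Collect parser messages internally; depending on the libxml version,
// the bare "&" in "&id=" may otherwise raise a PHP warning.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$equations = [];
foreach ($xpath->query('//p') as $p) {
    // Skip paragraphs that do not contain an equanchor tag with a numeric id.
    if (!preg_match('/equanchor\?\s&id=`(\d+)`/', $p->textContent, $matches)) {
        continue;
    }

    // comment() selects comment nodes; "./" limits the search to direct
    // children of the context node passed as the second argument.
    $comments = $xpath->query('./comment()', $p);
    $label = $comments->length > 0 ? trim($comments->item(0)->nodeValue) : null;

    $equations[] = ['id' => (int) $matches[1], 'label' => $label];
}

print_r($equations);
```

For the sample paragraph, this should yield one entry with id 555 and label "Gl. 1-4". If the comment can be nested deeper inside the paragraph, './/comment()' would search all descendants instead of only direct children.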