I have a string like
<p begin="00:35:47.079" end="00:35:49.119" region="r8" style="s1">
<span style="s2" tts:backgroundColor="black">Hello I am a fireman. Good morning</span>
<br/>
<span style="s2" tts:backgroundColor="black">Why do you </span>
<span style="s9" tts:backgroundColor="black">insist on that?</span>
</p>
I am trying to output it like
Hello I am a fireman. Good morning
Why do you insist on that?
I've tried the following, which ultimately writes the result to a file:
$xmlObject = simplexml_load_string($delivery, 'SimpleXMLElement', LIBXML_NOCDATA);
$xmlArray = json_decode(json_encode((array) $xmlObject), TRUE);
foreach ($xmlArray['body']['div']['p'] as $p_tag) {
    if (!is_string($p_tag['span'])) {
        $multiLine = '';
        foreach ($p_tag['span'] as $line) {
            if (is_string($line)) {
                $multiLine .= $line . "\n";
            }
        }
        $p_tag['span'] = $multiLine;
    }
}
foreach ($toPrint as $line) {
    if (!isset($line['begin'])) {
        continue;
    }
    $endSpace = '';
    if (!$shrunk) {
        $endSpace = ' ';
    }
    fwrite($fileOpen, "\n\n" . $line['begin'] . ' --> ' . $line['end'] . $endSpace . "\n" . $line['content']);
}
I then print out each $p_tag line by line, but that of course produces
Hello I am a fireman. Good morning
Why do you
insist on that?
From here, I've also tried
$value = $Dom->documentElement->nodeValue;
$lines = explode("\n", $value);
$lines = array_map('trim', $lines); // remove leading and trailing whitespace
$lines = array_filter($lines); // remove empty elements
foreach ($lines as $line) {
    echo htmlentities($line);
}
But that produces something like
Hello I am a fireman.Good morningWhy do youinsist on that?
When I var_dump the $p_tag, it produces something like this
["span"]=>
array(3) {
[0]=>
string(34) "Hello I am a fireman. Good morning"
[1]=>
string(11) "Why do you "
[2]=>
string(15) "insist on that?"
}
["br"]=>
array(0) {
}
So the <br/> ends up out of order, and I can't rely on it when walking the XML object. The spans are grouped together and the breaks are stored separately, so in that form there is no way to put the line breaks back where they were in the original string.
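To illustrate the ordering problem, here is a minimal sketch that walks the same fragment with DOMDocument instead of converting it to an array. Unlike the array, the DOM child list keeps spans and breaks in document order. The xmlns:tts declaration on the <p> is an assumption added only so the fragment parses on its own; in a full TTML file it would normally be declared further up the tree.

```php
<?php
// Sample fragment from the question. The xmlns:tts attribute is an
// assumption so that the snippet is well-formed in isolation.
$xml = <<<XML
<p xmlns:tts="http://www.w3.org/ns/ttml#styling" begin="00:35:47.079" end="00:35:49.119">
<span tts:backgroundColor="black">Hello I am a fireman. Good morning</span>
<br/>
<span tts:backgroundColor="black">Why do you </span>
<span tts:backgroundColor="black">insist on that?</span>
</p>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Collect the element names in the order they appear in the document.
$order = [];
foreach ($dom->documentElement->childNodes as $child) {
    if ($child->nodeType === XML_ELEMENT_NODE) {
        $order[] = $child->nodeName;
    }
}

echo implode(', ', $order), "\n"; // span, br, span, span
```

Because the <br/> shows up between the first and second span here, its position can still be used, which is not the case once the children have been grouped by tag name.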
should be on new lines. It could be that three sentences are on one line, and five words within the same sentence are on separate lines. – Michael Millar Apr 07 '21 at 09:32
`, it only looks for spans. You need to modify it so it processes the line breaks too. – ADyson Apr 07 '21 at 09:33
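As a rough sketch of what processing the line breaks as well could look like, the snippet below iterates over the children of a <p> in document order, appends span text, and starts a new line only at a <br/>. The helper name captionText is hypothetical, the xmlns:tts declaration is an assumption added so the fragment parses standalone, and the sketch assumes all caption text lives inside direct child <span> elements.

```php
<?php
/**
 * Hypothetical helper: flattens one <p> element into caption text,
 * starting a new line only where a <br/> occurs.
 */
function captionText(DOMElement $p): string
{
    $text = '';
    foreach ($p->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip whitespace-only text nodes between the tags
        }
        if ($child->localName === 'br') {
            // Drop trailing spaces before the break, then start a new line.
            $text = rtrim($text) . "\n";
        } elseif ($child->localName === 'span') {
            $text .= $child->textContent;
        }
    }
    return trim($text);
}

// Sample fragment from the question (xmlns:tts added as an assumption).
$xml = <<<XML
<p xmlns:tts="http://www.w3.org/ns/ttml#styling" begin="00:35:47.079" end="00:35:49.119">
<span tts:backgroundColor="black">Hello I am a fireman. Good morning</span>
<br/>
<span tts:backgroundColor="black">Why do you </span>
<span tts:backgroundColor="black">insist on that?</span>
</p>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

echo captionText($dom->documentElement), "\n";
// Hello I am a fireman. Good morning
// Why do you insist on that?
```

In a full file the same function could presumably be called for every <p> found via getElementsByTagName('p'), with the begin and end attributes read through getAttribute().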