Similar to this issue, I'm using the Simple HTML DOM Library to load a block of HTML and parse the results. I want to return a modified version of that HTML. This works well enough for many things (like removing attributes from an element), but I'm having trouble removing nested elements altogether.
Imagine I have something like so:
<figure class="figure_class">
    <br />
    <p class="some_class">Stuff in here</p>
    <p></p>
</figure>
I want to select this figure element and then process all its children, removing the extra br and empty p tags. I'm doing the following:
/**
 * Given an HTML string, we'll clean all whitespace tags from inside the figure.
 */
function cleanCaptionTags($value) {
    $html = str_get_html($value);
    // Handle all Caption Classes.
    foreach ($html->find('figure[class*=figure_class]', 0)->children() as $my_tag) {
        if ($my_tag->tag == 'br') {
            // This gets selected.
            $my_tag = null;
        }
        elseif ($my_tag->tag == 'p' && $my_tag->class == null) {
            // This also gets selected.
            $my_tag = null;
        }
    }
    // Return the updated HTML.
    return $html->save();
}
When I do other element manipulations, this approach works. My selectors and loop work: both branches are reached. However, the children are never removed. I've tried tag = null, tag = '', and unset(tag). All give the same result. I also tried $html->load($html->save()), which was suggested in the linked post. None of these worked.
Any thoughts?
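One thing that may be relevant, offered as a sketch rather than a confirmed diagnosis: in PHP, assigning null to the variable of a by-value foreach only rebinds that variable and leaves the underlying tree alone. The example below shows this with PHP's built-in DOM extension (DOMDocument), not Simple HTML DOM, so it runs without extra libraries. The function name demoRemoveChildren and the wrapper div are made up for illustration. For Simple HTML DOM itself, my understanding is that its manual describes removing an element by setting its outertext to an empty string ($my_tag->outertext = ''), but that is not exercised here.

```php
<?php
// Sketch with PHP's built-in DOM extension (NOT Simple HTML DOM).
// demoRemoveChildren is a hypothetical helper name used only for this example.
function demoRemoveChildren(string $fragment): array
{
    // Suppress libxml warnings about HTML5 tags such as <figure>.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $fragment . '</div>');
    libxml_clear_errors();

    $figure = $doc->getElementsByTagName('figure')->item(0);

    // Attempt 1: rebind the loop variable. The document tree is untouched.
    foreach (iterator_to_array($figure->childNodes) as $child) {
        $child = null;
    }
    $afterRebind = $doc->saveHTML($figure);

    // Attempt 2: detach matching children from their parent node.
    // Iterating over a copied array avoids mutating the live node list mid-loop.
    foreach (iterator_to_array($figure->childNodes) as $child) {
        if (!($child instanceof DOMElement)) {
            continue; // skip text nodes
        }
        $isBr = $child->tagName === 'br';
        $isBareP = $child->tagName === 'p' && !$child->hasAttribute('class');
        if ($isBr || $isBareP) {
            $figure->removeChild($child);
        }
    }
    $afterRemove = $doc->saveHTML($figure);

    return [$afterRebind, $afterRemove];
}
```

Calling demoRemoveChildren() with the figure markup above returns two strings: the first still contains the br and the empty p, while the second keeps only the p that has a class.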