Hi, I would like to remove a parent element, identified by its id or class, together with all the HTML code it contains.

<?php
    $html = '<div class="m-interstitial"><div class="m-interstitial">
<div class="m-interstitial__ad" data-readmore-target="">
<div class="m-block-ad" data-tms-ad-type="box" data-tms-ad-status="idle" data-tms-ad-pos="1">
<div class="m-block-ad__label m-block-ad__label--report-enabled"><span class="m-block-ad__label__text">Advertising</span> <button class="m-block-ad__label__report-link" title="Report this ad" data-tms-ad-report=""> </button></div>
<div class="m-block-ad__content">&nbsp;</div>
</div>
</div>
<button class="m-interstitial__unlock-btn" data-readmore-unlocker=""> <span class="m-interstitial__unlock-btn__text">Read more</span>
</button></div>';


// I tried the code below, but it does not work.

//$remove = preg_replace('#<div class="m-interstitial">(.*?)</div>#', '', $html); 
$remove = preg_replace('#<div class="m-interstitial">(.*?)</div>#s', '', $html);
var_dump($remove); // I expect an empty string "", but that is not what I get.

My preg_replace does not work as I want. Any ideas?

Thank you

AlexD
  • Does this answer your question? [in PHP, how to remove specific class from html tag?](https://stackoverflow.com/questions/2108663/in-php-how-to-remove-specific-class-from-html-tag) – Jax-p Aug 12 '21 at 08:23
  • What is ` – brombeer Aug 12 '21 at 08:50
  • Sorry, I just edited the question. This is my first time using Stack Overflow. Please have a look when you have time to help me. Thank you – AlexD Aug 12 '21 at 10:19
  • Required reading: [RegEx match open tags except XHTML self-contained tags](https://stackoverflow.com/q/1732348/697154) – Yoshi Aug 12 '21 at 10:24
  • Whilst you can use regex to manipulate HTML, don't! Instead, use an HTML parser like [DOMDocument](https://www.php.net/manual/en/class.domdocument.php): https://3v4l.org/ZBmDZ – Lawrence Cherone Aug 12 '21 at 10:51
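
To make the comments above concrete, here is a minimal sketch of why the pattern from the question leaves markup behind. The `$html` string is a simplified stand-in for the question's markup (an assumption, not the original input). The lazy `(.*?)` stops at the first `</div>` it meets, which closes an inner div rather than the outer one.

```php
<?php
// Simplified stand-in for the markup in the question (an assumption).
$html = '<div class="m-interstitial"><div class="inner">ad</div><button>Read more</button></div>';

// Same pattern as in the question: the lazy (.*?) ends at the FIRST </div>,
// so only the opening tag and the first inner div are removed.
$remove = preg_replace('#<div class="m-interstitial">(.*?)</div>#s', '', $html);

echo $remove, PHP_EOL;
// leaves: <button>Read more</button></div>
```

A regular expression of this kind cannot count nested opening and closing tags, which is why the comments point to a DOM parser.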

2 Answers

Based on your code example, why don't you just set $html = ''; if that is what you want? If you have differing HTML, then use XPath to find matches:

<?php
$html = '<div class="m-interstitial">
    <div class="m-interstitial">
        <div class="m-interstitial__ad" data-readmore-target="">
            <div class="m-block-ad" data-tms-ad-type="box" data-tms-ad-status="idle" data-tms-ad-pos="1">
                <div class="m-block-ad__label m-block-ad__label--report-enabled"><span class="m-block-ad__label__text">Advertising</span> <button class="m-block-ad__label__report-link" title="Report this ad" data-tms-ad-report=""> </button></div>
                <div class="m-block-ad__content">&nbsp;</div>
            </div>
        </div>
        <button class="m-interstitial__unlock-btn" data-readmore-unlocker=""> <span class="m-interstitial__unlock-btn__text">Read more</span></button>
    </div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->validateOnParse = false;
$dom->strictErrorChecking = false;
$dom->formatOutput = false;
$dom->loadHTML('<?xml encoding="utf-8" ?>'.$html);
libxml_clear_errors();
libxml_use_internal_errors(false);

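$xpath = new DOMXPath($dom);
// Select the first (outermost) div whose class is exactly "m-interstitial".
$node = $xpath->query("(//div[@class='m-interstitial'])[1]")->item(0);
if ($node !== null) {
    // Detach the element, and everything inside it, from its parent.
    $node->parentNode->removeChild($node);
}
echo $dom->saveXML($dom->documentElement);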

I am not 100% sure if this is what you want to do, but this is how XPath/DOM would be used for it.

The result is an empty HTML document (since you are removing the parent, or root, element of your HTML):

<html><body/></html>
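
If the block is surrounded by other HTML, or the class attribute carries more than one class, the same idea might be generalized roughly as follows. This is only a sketch: `removeDivsByClass` is a hypothetical helper name, the `<html><body>` wrapper is an assumption used to get the fragment back without the implied document shell, and `$class` is assumed to be a plain class token without quotes.

```php
<?php
// Hypothetical helper (not part of the answer above): removes every <div>
// whose class list contains $class and returns the remaining fragment.
function removeDivsByClass(string $html, string $class): string
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Same UTF-8 prolog trick as in the answer; the wrapper is an assumption.
    $dom->loadHTML('<?xml encoding="utf-8" ?><html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Match the class as a whole token, so class="m-interstitial foo" also matches.
    // $class is assumed to contain no quote characters.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    // Copy the result into an array before modifying the tree.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$input = '<p>before</p><div class="m-interstitial x"><div>ad</div></div><p>after</p>';
echo removeDivsByClass($input, 'm-interstitial'), PHP_EOL;
```

With this input, the two paragraphs should survive while the interstitial block and everything inside it is dropped.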
Erik Pöhler
  • Because there can be other HTML code before and after this block. I replied to you. Thank you, great answer – AlexD Aug 12 '21 at 11:45
I did almost the same, but yours seems better:

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Find every div whose class is exactly "m-interstitial".
$nodes = $xpath->query('//div[@class="m-interstitial"]');
if ($nodes) {
    foreach ($nodes as $node) {
        // Clears the div's contents; the empty div itself stays in the output.
        $node->textContent = "";
    }
}
$html = $doc->saveHTML();

var_dump($html);
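
One difference between the two approaches may be worth spelling out: assigning an empty `textContent` clears the div but keeps the empty element, whereas `removeChild` drops the element itself. The sketch below compares both on a small illustrative input; `loadAndFind` and `bodyHtml` are hypothetical helpers, and the `<html><body>` wrapper is an assumption made to print only the fragment.

```php
<?php
// Illustrative input, not taken from the question.
$html = '<p>keep</p><div class="m-interstitial"><span>ad</span></div>';

// Hypothetical helper: parses the fragment and returns [document, matching divs].
function loadAndFind(string $html): array
{
    $doc = new DOMDocument();
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    $xpath = new DOMXPath($doc);
    $nodes = iterator_to_array($xpath->query('//div[@class="m-interstitial"]'));
    return [$doc, $nodes];
}

// Hypothetical helper: serializes only the children of <body>.
function bodyHtml(DOMDocument $doc): string
{
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Variant 1: empty the div, as in the snippet above.
[$doc, $nodes] = loadAndFind($html);
foreach ($nodes as $node) {
    $node->textContent = '';
}
$emptied = bodyHtml($doc);

// Variant 2: remove the div itself, as in the other answer.
[$doc, $nodes] = loadAndFind($html);
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}
$removed = bodyHtml($doc);

echo $emptied, PHP_EOL, $removed, PHP_EOL;
```

Which variant fits depends on whether an empty placeholder div is acceptable in the final page.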
AlexD
  • The extra checks and configs in my code snippet are meant to prevent some common problems with HTML5 (unlike XML or the outdated XHTML standard, HTML5 isn't strict but very "loose"). They also prevent DOMDocument from throwing errors if you pass invalid, incomplete, or erroneous HTML. For removing ` – Erik Pöhler Aug 13 '21 at 11:16