Let's use 3 string examples:

Example 1:

<div id="something">I have a really nice signature, it goes like this</div>

Example 2:

<div>I like balloons</div><div id="signature-xyz">Sent from my iPhone</div>

Example 3:

<div>I like balloons</div><div class="my_signature-xyz">Get iOS</div>

I'd like to remove the entire contents of the "signature" div in examples 2 and 3. Example 1 should not be affected. I don't know ahead of time what the div's exact class or ID will be, but I do know it will contain the string 'signature'.

I'm using the code below, which gets me halfway there.

$pm = "/signature/i";
if (preg_match($pm, $message, $matches) == 1) {
    $message = preg_split($pm, $message, 2)[0];
}

What should I do to achieve the above? Thanks

user6122500
  • Please **don't** attempt to use regex in PHP to parse your HTML content. Instead, your time would be better invested learning how to use an XML/HTML parser. Read the question [How Do You Parse and Process HTML/XML in PHP?](https://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) for more information. – Tim Biegeleisen Mar 12 '19 at 01:16
  • Another DOM library I have used in the past is PHPQuery, which lets you use jQuery-like syntax to access the DOM. PHP core also has DOMDocument and DOMXPath, which are both useful for this. HTML is a hierarchical (nested) language, so it's very difficult to parse with regex. – ArtisticPhoenix Mar 12 '19 at 01:19

1 Answer

You can build your code on the following sample:

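$dom = new DOMDocument();
$dom->loadHTML($inputHTML);
$xpathsearch = new DOMXPath($dom);
// Select every div that has no attribute whose value contains 'signature'
$nodes = $xpathsearch->query("//div[not(@*[contains(., 'signature')])]");

foreach ($nodes as $node) {
    // do your stuff
}

Where the XPath:

//div[not(@*[contains(., 'signature')])]

will allow you to extract all div nodes that have no attribute whose value contains the string signature. The inner predicate matters: in XPath 1.0, contains(@*,'signature') converts the attribute set to the string value of its first attribute only, so a div such as <div class="a" id="signature"> would slip through.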
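Since the question asks to remove the signature divs rather than select the others, the same idea can be inverted. The following is a minimal sketch, not a definitive implementation: the function name `removeSignatureDivs` is hypothetical, and the `translate()` call is an assumption added to mimic the case-insensitive `/i` flag from the question's regex.

```php
<?php
// Hypothetical helper: drops every <div> that has at least one attribute
// whose value contains "signature" (case-insensitively), then returns
// the remaining markup.
function removeSignatureDivs(string $html): string
{
    $dom = new DOMDocument();

    // Silence libxml warnings about fragments and unknown tags.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // translate() lowercases the letters of "SIGNATURE" so the match
    // ignores case, like /signature/i in the question.
    $query = "//div[@*[contains(translate(., 'SIGNATURE', 'signature'), 'signature')]]";

    // Copy the result into an array before modifying the tree.
    foreach (iterator_to_array($xpath->query($query)) as $div) {
        if ($div->parentNode !== null) {
            $div->parentNode->removeChild($div);
        }
    }

    // Serialize only the children of <body>, not the wrapper that
    // loadHTML() builds around the fragment.
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }
    return $out;
}

$message = '<div>I like balloons</div><div id="signature-xyz">Sent from my iPhone</div>';
echo removeSignatureDivs($message), PHP_EOL;
```

Because the test is applied to attribute values only, the div in example 1 is left alone even though its text contains the word "signature".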

Regex should never be used to parse HTML, XML, or JSON, where the structure can in theory be nested to unlimited depth. Ref: Regular Expression Vs. String Parsing

Allan