I need to remove a particular string using PHP. The string starts with <div class='event'>, followed by any text that contains $myVariable, followed by </div>. How do I remove all of this using preg_replace()? I worked out that it might be something like this:

preg_replace("<div class='event'>(.*)" . $myVariable . "(.*)</div>", "", $content);

But I can't get it to work.

Update:

I need to remove a div and everything inside it. The div contains an event name and a date, but I can only identify the div by the event's name, so the date has to match practically any string.

Wiktor Stribiżew
  • You won't get it to work with regex. Please clarify what you are trying to achieve. Remove/replace the contents of some `<div>` tag? Or just retrieve some value? Please also provide a sample HTML fragment. – Wiktor Stribiżew Sep 21 '15 at 19:34
  • What does `$myVariable` contain? Are you sure you need a regex here and not just a simple `str_replace()`? – gen_Eric Sep 21 '15 at 19:37
  • If you have nested divs, you're better off with an html parser – Mariano Sep 21 '15 at 19:46
  • I need to remove a div and everything inside it, the div contains an event name and date but I can only delete the div based on the events name and so the date needs to be defined as practically any string – user3743076 Sep 21 '15 at 19:56
  • Where is the event name? Please post real-life HTML code as an example to show the structure of the HTML document you need to remove the `<div>` from. – Wiktor Stribiżew Sep 21 '15 at 20:08
  • Seems like you are trying to parse html with regex http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – ford prefect Sep 21 '15 at 20:53

1 Answer

Let's imagine you have a <div> and inside it, there is some text node with a specific word you define with $myVariable.

The task is:

  1. Read the document in
  2. Initialize DOM
  3. Collect the <div> elements whose text content (nodeValue) contains the $myVariable text
  4. Remove those tags from the DOM
  5. Return updated DOM

The code for that algorithm is below (the DOM is initialized with an HTML string in the demo):

$html = "<<YOUR_HTML_STRING>>";       // Replace with your HTML string
$dom = new DOMDocument;               // Declaring the DOM
$dom->loadHTML($html);                // Initializing the DOM with an HTML string
$myVariable = "2015-09-12";           // Your dynamic variable

$xpath = new DOMXPath($dom);          // Initializing the DOMXpath
$divs = $xpath->query("//div[contains(.,'$myVariable')]"); // Collecting DIVs 
                                                           // having $myVariable
foreach($divs as $div) { 
   $div->parentNode->removeChild($div); // Removing the DIVs
}

echo $dom->saveHTML();                  //  Getting the updated DOM

See IDEONE demo
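
The query above matches every <div> whose text contains the value, which also includes any wrapper <div> around the events. Below is a minimal sketch of a narrower variant that only considers <div class='event'> elements, as described in the question. The function name removeEventDivs, the sample markup, and the event names are made up for illustration, and the scalar type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: name, signature and sample data are illustrative only.
function removeEventDivs(string $html, string $eventName): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Only <div class="event"> elements are candidates, so wrapper divs survive.
    // iterator_to_array() copies the node list before any node is removed.
    $candidates = iterator_to_array($xpath->query("//div[@class='event']"));

    foreach ($candidates as $div) {
        // The text comparison happens in PHP, not inside the XPath string,
        // so a quote character in $eventName cannot break the expression.
        if (strpos($div->textContent, $eventName) !== false && $div->parentNode !== null) {
            $div->parentNode->removeChild($div);
        }
    }

    return $dom->saveHTML();
}

// Assumed sample input: two events inside a wrapper div.
$content = "<div id='list'>"
         . "<div class='event'>Concert 2015-09-12</div>"
         . "<div class='event'>Lecture 2015-10-01</div>"
         . "</div>";

$result = removeEventDivs($content, 'Concert');
echo $result;
```

With this input, the "Concert" div should be dropped while the wrapper and the "Lecture" div remain. Note that @class='event' only matches when the attribute is exactly "event"; elements with several classes would need a different test.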

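Note that you can stop DOMDocument from adding the <!DOCTYPE> declaration (LIBXML_HTML_NODEFDTD) and the implied <html>/<body> wrapper elements (LIBXML_HTML_NOIMPLIED) by declaring and initializing the DOM as follows: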

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
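
To see what the two flags change, here is a small comparison sketch. The sample markup and the variable names are assumptions chosen for illustration; it loads the same fragment once with the defaults and once with the flags.

```php
<?php
// Assumed sample fragment with a single top-level element.
$html = "<div class='event'>Concert 2015-09-12</div>";

// Default behaviour: libxml adds a doctype plus <html> and <body> wrappers.
$plain = new DOMDocument();
$plain->loadHTML($html);
$withWrappers = $plain->saveHTML();

// With the flags: no default doctype and no implied wrapper elements.
$bare = new DOMDocument();
$bare->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$fragmentOnly = $bare->saveHTML();

echo $withWrappers, "\n---\n", $fragmentOnly;
```

One caveat: with LIBXML_HTML_NOIMPLIED, fragments that have several top-level elements are reported to come out rearranged, so wrapping the fragment in a single container element first is a commonly used workaround.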
Wiktor Stribiżew