0

I am trying to remove the following pattern from a string:

<div class="main_title">Content 1</div> 

where 'Content 1' may vary between strings.

The following does not seem to be working:

$output = preg_replace('<div class="main_title">.*</div>', " ", $output);

Am I missing something obvious?

George Cummins
Luke G
  • 2
    `Am I missing something obvious?` You're trying to parse HTML with regular expressions. –  May 28 '13 at 21:38
  • 2
    Do not parse HTML with a regular expression! http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Micha Wiedenmann May 28 '13 at 21:38
  • See [these](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-xml/3577662#3577662) [answers](http://stackoverflow.com/questions/3820666/grabbing-the-href-attribute-of-an-a-element/3820783#3820783) for a [better way](http://stackoverflow.com/questions/4979836/noob-question-about-domdocument-in-php/4983721#4983721). – George Cummins May 28 '13 at 21:41

2 Answers

3

The DOM method is probably superior because you don't have to worry about case sensitivity, whitespace, etc.

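// Parse the markup into a DOM tree
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every <div> whose class attribute is exactly "main_title"
foreach ($xpath->query('//div[@class="main_title"]') as $node) {
    $node->parentNode->removeChild($node);
}
// Note: saveHTML() returns a full document, including the doctype
// and the html/body wrappers that loadHTML() adds
$output = $dom->saveHTML();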
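
One detail of the XPath above: `@class="main_title"` matches only when the class attribute is exactly that string, so a div with `class="main_title big"` would be left in place. The following is a minimal sketch of one way to handle that, assuming the class name is a plain token with no quote characters. The helper name `removeDivsByClass` is made up for illustration and is not part of the original answer.

```php
<?php
// Hypothetical helper: removes every <div> that carries the given class,
// even when the class attribute lists other classes as well.
// Assumes $class is a simple token without quotes or whitespace.
function removeDivsByClass(string $html, string $class): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world markup is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Pad the normalized class list with spaces so that " main_title "
    // matches whole class tokens only (not "main_title_extra").
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    // Copy the result into an array before removing nodes from the tree.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Returns a full document (doctype, html and body wrappers included).
    return $dom->saveHTML();
}
```

Calling `removeDivsByClass($output, 'main_title')` would then return the document with the matching divs removed.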

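It's possible to do this with regex, especially if you can trust that your input follows a very specific format (no extra whitespace, no case discrepancies, etc.). Your main issue is the lack of PCRE delimiters: preg_replace expects the pattern to be wrapped in a delimiter character such as @ or #. The pattern below also uses a lazy .*? so that the match stops at the first </div> rather than the last one.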

$output = preg_replace('@<div class="main_title">.*?</div>@', '', $output);
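
To see why the lazy quantifier matters, here is a small sketch comparing greedy and lazy matching once the delimiters are in place. The input strings are made up for illustration; the `i` and `s` flags are standard PCRE modifiers (case-insensitive matching, and letting `.` match newlines), shown as an optional way to loosen the format assumptions.

```php
<?php
// Illustrative input (not from the question): two title divs on one line.
$input = '<div class="main_title">Content 1</div><p>Body</p>'
       . '<div class="main_title">Content 2</div>';

// Greedy: .* runs to the LAST </div> on the line, so the <p> is removed too.
$greedy = preg_replace('@<div class="main_title">.*</div>@', '', $input);
// $greedy === ''

// Lazy: .*? stops at the FIRST </div>, so each title div is removed separately.
$lazy = preg_replace('@<div class="main_title">.*?</div>@', '', $input);
// $lazy === '<p>Body</p>'

// With the i flag the tag name may be upper case; with the s flag
// the content may span several lines.
$multiline = "<DIV class=\"main_title\">Content\n1</DIV><p>Body</p>";
$flagged = preg_replace('@<div class="main_title">.*?</div>@is', '', $multiline);
// $flagged === '<p>Body</p>'
```

Even with these flags the regex still depends on the exact attribute order and quoting, which is why the DOM approach is the safer choice for arbitrary markup.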
Explosion Pills
1

As others say in the comments, don't use regular expressions to parse HTML; use SimpleXML or DOMDocument instead. If you still need a regex, you have to wrap the pattern in delimiters:

$output = preg_replace('#<div class="main_title">.*?</div>#', " ", $output);
m4t1t0