I'm trying to remove a certain HTML tag with preg_replace, but I can't get it to work. The pattern matches if I remove the line breaks from the string, but not when they are present.

The regex so far:

preg_replace("/<ol class=\"comment-list\">.*?<\/ol>/", "", $string);

The string in question:

<ol class="comment-list">
<time datetime="2016-03-25T15:27:34+00:00"></ol>

I'm using http://www.phpliveregex.com/ to test it.

Thanks a lot for your help!

saasmaster
  • 2
    [Do not use regular expression to match HTML tags](http://stackoverflow.com/a/1732454/3294262) – fusion3k Apr 02 '16 at 16:48
  • 4
    Add the `s` modifier so that the dot `.` also matches newlines: `...<\/ol>/s` – HamZa Apr 02 '16 at 16:51
  • 1
    What output do you want? – Santosh Ram Kunjir Apr 02 '16 at 17:05
  • 1
    @HamZa's comment is in fact the only useful bit of information on this page. Yes, it is good advice not to parse (X)HTML with regexp. But the question here was pretty straightforward and was just asking how to match newlines with preg_replace. – Gfra54 Oct 27 '16 at 08:28

2 Answers

As I said in the comments on this page, @HamZa's comment is in fact the only useful bit of information here: add the s modifier to your regexp so that the dot also matches newlines.

preg_replace("/<ol class=\"comment-list\">.*?<\/ol>/s", "", $string);

It is good advice not to parse (X)HTML with regexp. But the question here was pretty straightforward and was just asking how to match newlines with preg_replace. This is how you do it.
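To see what the modifier changes, here is a small sketch that runs both patterns against the string from the question. The variable names are mine and only illustrative.

```php
<?php
// Sample input from the question: the <ol> element spans two lines.
$string = "<ol class=\"comment-list\">\n<time datetime=\"2016-03-25T15:27:34+00:00\"></ol>";

// Without /s the dot does not match "\n", so the pattern finds nothing
// and preg_replace returns the input unchanged.
$withoutS = preg_replace('/<ol class="comment-list">.*?<\/ol>/', '', $string);

// With /s (PCRE_DOTALL) the dot also matches newlines, so the whole element is removed.
$withS = preg_replace('/<ol class="comment-list">.*?<\/ol>/s', '', $string);

var_dump($withoutS === $string); // bool(true)
var_dump($withS === '');         // bool(true)
```

Single-quoted PHP strings are used for the patterns so the double quotes inside them need no escaping.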

Gfra54

I know this is probably not the answer you want, but if you want to try it, this is how you can remove <ol> nodes using DOMDocument:

$dom = new DOMDocument();           // Init DOMDocument object
libxml_use_internal_errors( true ); // Collect libxml errors internally instead of emitting warnings
$dom->loadHTML( $html );            // Load HTML
$xpath = new DOMXPath( $dom );      // Init DOMXPath (useful for complex queries)

/* Search for all <ol> nodes with class “comment-list”: */
$nodes = $xpath->query( '//ol[@class="comment-list"]' );
/* Remove each node (iterating over a copy, so removals cannot affect the loop): */
foreach( iterator_to_array( $nodes ) as $node )
{
    $node->parentNode->removeChild( $node );
}

/* Output modified HTML: */
echo $dom->saveHTML();

Yes, this is 7 lines versus one, but it is the approach I recommend. Regular expressions are a great invention, but not for parsing HTML/XML.
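As a rough sketch of how the same idea could be wrapped for reuse: the helper name `removeCommentLists` and the sample markup are hypothetical, not from the question. It also swaps the exact `@class` comparison for a token match, on the assumption that the element might carry more than one class.

```php
<?php
// Hypothetical helper: name, signature and sample input are illustrative only.
function removeCommentLists(string $html): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the previous setting

    $xpath = new DOMXPath($dom);
    // Match "comment-list" as one token of the class attribute,
    // so class="comment-list extra" is matched as well.
    $query = '//ol[contains(concat(" ", normalize-space(@class), " "), " comment-list ")]';

    // Iterate over a copy of the result so removals cannot affect the loop.
    foreach (iterator_to_array($xpath->query($query)) as $node) {
        $node->parentNode->removeChild($node);
    }

    return $dom->saveHTML();
}

$html = "<div><p>Keep me</p><ol class=\"comment-list extra\">\n<li>Drop me</li>\n</ol></div>";
$result = removeCommentLists($html);
echo $result;
```

Note that saveHTML() returns a complete document here, since loadHTML() wraps the fragment in a doctype, <html> and <body>.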

fusion3k