I'm having trouble understanding regular expressions. Here is what I've got:

$pat = "/<[^>]*>/";

This pattern works well for removing HTML tags. But when it is used to remove <?php ?> blocks, it fails when -> appears inside the block.

For example:

<?php
  $obj->name;
  $obj->reset();
?>
some other things outside

Intended result

some other things outside

The actual result

  name;
  $obj->reset();
?>
some other things outside

So, how can I exclude the -> in my search?

Sufendy
  • Have you [tried a proper parser instead](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)? :) – deceze Oct 21 '11 at 02:28
  • You'll also have a problem with greater than, right shift, and strings containing '>'. Better use a parser ;) – hair raisin Oct 21 '11 at 02:28
  • hmm.. never heard of the parser yet. Will try that out. Thanks – Sufendy Oct 21 '11 at 02:30

1 Answer

You can either code in a special exception for the <?php ?> delimiters, so it takes precedence over the generic rule:

$pat = "/<[?].*?[?]>|<[^>]*>/";

Or you could use an assertion (?=...) and allow -> as an alternative. On second thought, that's probably too difficult.
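One possible reading of "allow -> as an alternative" is sketched below. It is an assumption about what was meant, not the answer's own pattern, and it uses a plain alternation rather than a lookahead: inside a tag, the two characters -> are consumed as a unit before falling back to [^>]. The variable names and the sample input (taken from the question) are for illustration only.

```php
<?php
// Sample input from the question, built line by line to avoid string interpolation.
$input = implode("\n", [
    '<?php',
    '  $obj->name;',
    '  $obj->reset();',
    '?' . '>',
    'some other things outside',
]);

// Hypothetical variant of the original pattern: inside a tag, try to match
// the two-character sequence "->" first, and otherwise any single character
// that is not ">". [^>] already matches newlines, so no modifier is needed.
$pat = '/<(?:->|[^>])*>/';

$stripped = preg_replace($pat, '', $input);

// trim() drops the newline left behind after the removed block.
echo trim($stripped), "\n";
```

This only sidesteps the -> case. A > from a comparison, a shift operator, or a string literal inside the PHP block would still end the match early, and HTML comments ending in --> may be over-matched when more tags follow, so a real parser is likely the safer route.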

mario
  • It doesn't seem to work... hmm, I think I'll just go for the parser. Thanks! – Sufendy Oct 21 '11 at 02:39
  • Lacks the `/s` flag. And [`strip_tags`](http://php.net/strip_tags) would have worked too, if that's what you actually wanted. – mario Oct 21 '11 at 02:43
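
To make the last comment concrete, here is a minimal sketch of the answer's pattern with the `s` modifier added, run against the sample from the question. Without `s`, `.` does not match newlines, so the `<[?].*?[?]>` branch cannot span a multi-line block and the generic `<[^>]*>` branch takes over. The variable names are illustrative.

```php
<?php
// Sample input from the question, built line by line to avoid string interpolation.
$input = implode("\n", [
    '<?php',
    '  $obj->name;',
    '  $obj->reset();',
    '?' . '>',
    'some other things outside',
]);

// Same alternation as in the answer, plus the "s" modifier so that "."
// also matches newlines and the lazy ".*?" can reach the closing delimiter.
$pat = '/<[?].*?[?]>|<[^>]*>/s';

$result = preg_replace($pat, '', $input);

// trim() drops the newline left behind after the removed block.
echo trim($result), "\n";
```

The lazy `.*?` stops at the first closing delimiter it finds, so a PHP string that itself contains that two-character sequence would still cut the match short.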