1

How would I replace all span tags (and whatever's inside them) that have the class pagenum pncolor with an empty line? str_replace wouldn't work because the name is different for all of them, so I assume I'd use preg_replace, but I'm not sure how that works.

<span class='pagenum pncolor'><a id='page_001' name='page_001'></a>001</span>
<p>Some text</p>

<span class='pagenum pncolor'><a id='page_130' name='page_130'></a>130</span>
<p>Some text</p>
<p>Some text</p>
<p>Some text</p>

<span class='pagenum pncolor'><a id='page_120' name='page_120'></a>120</span>
<p>Some text</p>

<span class='pagenum pncolor'><a id='page_100' name='page_100'></a>100</span>
<p>Some text</p>
usertest
  • Note that while a regex could probably work if this is a one-off case, for more general usage you might want to use an HTML parser instead of regex, and then something like XPath to quickly pull out exactly the items you want. – Amber Mar 09 '10 at 22:35
  • possible duplicate of [How to parse and process HTML with PHP?](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php) – PeeHaa Jan 16 '12 at 19:59
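
To illustrate the parser-based route Amber mentions, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The function name `removePageNumberSpans` and the wrapper `div` are assumptions made for this illustration, not something from the thread, and the sketch assumes ASCII input like the sample in the question.

```php
<?php
// Hypothetical helper (not from the thread): removes page-number spans via DOM + XPath.
function removePageNumberSpans(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be extracted again afterwards.
    $doc->loadHTML('<div id="wrapper">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the matches into an array first, then detach each span from its parent.
    $spans = iterator_to_array($xpath->query("//span[@class='pagenum pncolor']"));
    foreach ($spans as $span) {
        $span->parentNode->removeChild($span);
    }

    // Serialize only the children of the wrapper, not the html/body that loadHTML adds.
    $wrapper = $xpath->query("//div[@id='wrapper']")->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Note that the XPath test `@class='pagenum pncolor'` only matches that exact attribute value; a span whose classes appear in another order or with extra classes would need a different expression.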

3 Answers

2

Use this regexp: #<span class='pagenum pncolor'>.*?</span>#si
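
As a rough sketch of how that pattern might be applied with preg_replace (the variable names `$html` and `$clean` are assumptions for this example, and the input is one entry from the question's sample):

```php
<?php
// One entry from the sample HTML in the question.
$html = "<span class='pagenum pncolor'><a id='page_001' name='page_001'></a>001</span>\n"
      . "<p>Some text</p>\n";

// "#" is used as the delimiter so "/" in "</span>" needs no escaping.
// "s" lets "." match newlines, "i" makes the match case-insensitive,
// and ".*?" is lazy, so each match stops at the nearest closing </span>.
$clean = preg_replace("#<span class='pagenum pncolor'>.*?</span>#si", '', $html);

echo $clean; // the span is gone; the line it was on is now empty
```

preg_replace returns the modified string rather than changing its argument in place, so the result has to be assigned.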

Crozin
1

I'm going to mention the obligatory: You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML.

However, I'm guilty of using regexes in situations like this also... And if I were to do so, I'd use @andreas's answer.

Josh
0

Assuming that $text = {THE_HTML_STRING_YOU_POSTED_IN_YOUR_QUESTION};

you can try:

$text = preg_replace("/<span class='pagenum pncolor'>(.*)<\/span>/", '', $text);
Andreas
  • 2
    That would replace everything from the first `<span>` to the last `</span>`. Put a `?` after your `*` to fix that. – Chad Birch Mar 09 '10 at 22:36
  • You are right! Making the quantifier lazy will solve the problem. Please check http://www.regular-expressions.info/repeat.html, which states that laziness will result in more CPU cycles due to backtracking. – Andreas Mar 10 '10 at 19:35
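
The greedy versus lazy difference discussed in these comments can be seen in a small sketch. The input string and the variable names `$greedy` and `$lazy` are made up for this illustration; the two spans are placed on one line so that a greedy `.*` can run across both.

```php
<?php
// Hypothetical input: two page-number spans on a single line.
$html = "<span class='pagenum pncolor'>001</span><p>Keep me</p>"
      . "<span class='pagenum pncolor'>002</span><p>Keep me too</p>";

// Greedy: ".*" grabs as much as possible, so the match runs to the LAST </span>.
$greedy = preg_replace("/<span class='pagenum pncolor'>(.*)<\/span>/", '', $html);

// Lazy: ".*?" grabs as little as possible, so each match ends at the NEAREST </span>.
$lazy = preg_replace("/<span class='pagenum pncolor'>(.*?)<\/span>/", '', $html);

echo $greedy, "\n"; // <p>Keep me too</p>
echo $lazy, "\n";   // <p>Keep me</p><p>Keep me too</p>
```

With the greedy pattern the first paragraph is lost along with both spans, which is the problem Chad Birch describes.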