
I got some great help today with starting to understand preg_replace_callback with known values. But now I want to tackle unknown values.

$string = '<p id="keepthis"> text</p><div id="foo">text</div><div id="bar">more text</div><a id="red"href="page6.php">Page 6</a><a id="green"href="page7.php">Page 7</a>';

With that as my string, how would I go about using preg_replace_callback to remove all ids from the div and a tags while keeping the id in place for the p tag?

So, from my string:

<p id="keepthis"> text</p>
<div id="foo">text</div>
<div id="bar">more text</div>
<a id="red"href="page6.php">Page 6</a>
<a id="green"href="page7.php">Page 7</a>

to

<p id="keepthis"> text</p>
<div>text</div>
<div>more text</div>
<a href="page6.php">Page 6</a>
<a href="page7.php">Page 7</a>
Fred Turner
  • _“how would I go about using preg_replace_callback to […]”_ - ideally, [you wouldn’t](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) … – CBroe Feb 07 '14 at 20:50
  • so I should just stick with preg_replace or str_replace in this case? – Fred Turner Feb 07 '14 at 20:55
  • If you don't know the future of $string, it is best to use an HTML parser – danronmoon Feb 07 '14 at 21:05
  • [A quick fiddle](http://regex101.com/r/aV9vK6). I'm too tired to post an extensive answer, so I've put some comments in it. Use `preg_replace()` with the `x` modifier and remove the `g` modifier. Some advice: **1)** If you're not sure about the input you're getting, or your regex skills aren't good, just use [a parser](http://stackoverflow.com/q/3577641). **2)** This question got quite a few upvotes, I don't see why, but you really should post your attempts. **3)** [Learn regex](http://regex.learncodethehardway.org/book) or visit the [regex chatroom](http://chat.stackoverflow.com/rooms/25767)! – HamZa Feb 07 '14 at 21:22

2 Answers


There's no need for a callback.

$string = preg_replace('/(?<=<div|<a)( *id="[^"]+")/', ' ', $string);


If you do want to use preg_replace_callback, though:

echo preg_replace_callback(
    '/(?<=<div|<a)( *id="[^"]+")/',
    function ($match) {
        // Replace the matched id attribute (and its leading spaces) with one space
        return " ";
    },
    $string
);

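A callback becomes more useful once the decision depends on the tag itself. Note that the patterns above leave a stray space behind for tags whose only attribute was the id (`<div id="foo">` becomes `<div >`). The following is a minimal sketch that produces the exact output from the question. The function name `removeIds` and its `$tags` parameter are hypothetical, and it assumes double-quoted id values and no `>` characters inside attribute values.

```php
<?php
// Hypothetical helper (not from the original answers): strips id="..." from
// the opening tags listed in $tags and leaves every other tag untouched.
function removeIds(string $html, array $tags = ['div', 'a']): string
{
    return preg_replace_callback(
        '/<(\w+)\b([^>]*)>/',                  // opening tag: name + raw attribute text
        function (array $m) use ($tags): string {
            if (!in_array(strtolower($m[1]), $tags, true)) {
                return $m[0];                  // e.g. <p id="keepthis"> stays as-is
            }
            // Remove a standalone id attribute; the lookbehind skips data-id etc.
            $attrs = preg_replace('/\s*(?<![\w-])id="[^"]*"\s*/', ' ', $m[2]);
            $attrs = trim($attrs);
            return '<' . $m[1] . ($attrs === '' ? '' : ' ' . $attrs) . '>';
        },
        $html
    );
}

$string = '<p id="keepthis"> text</p><div id="foo">text</div><div id="bar">more text</div><a id="red"href="page6.php">Page 6</a><a id="green"href="page7.php">Page 7</a>';

echo removeIds($string);
```

Because the tag name is captured separately, changing which tags lose their id only means changing the `$tags` list, not the pattern.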

revo

For your example, the following should work:

$result = preg_replace('/(<(a|div)[^>]*\s+)id="[^"]*"\s*/', '\1', $string);

Though in general it's better to avoid parsing HTML with regular expressions and to use a proper parser instead (for example, load the HTML into a DOMDocument and use the removeAttribute method, as in this answer). That way you can handle variations in markup and malformed HTML much better.
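The parser approach could look roughly like the sketch below. The function name `stripIdsWithDom` and its `$tags` parameter are hypothetical, and wrapping the fragment in a `<body>` element is just one way (an assumption here) to get the fragment back out without the document wrapper that `loadHTML` adds.

```php
<?php
// Hypothetical helper (not from the original answers): removes the id
// attribute from the given tags using PHP's DOM extension.
function stripIdsWithDom(string $html, array $tags = ['div', 'a']): string
{
    // Collect parser warnings internally instead of emitting them; the sample
    // markup has no space between attributes in the <a> tags.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($tags as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $node->removeAttribute('id');
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$string = '<p id="keepthis"> text</p><div id="foo">text</div><div id="bar">more text</div><a id="red"href="page6.php">Page 6</a><a id="green"href="page7.php">Page 7</a>';

echo stripIdsWithDom($string);
```

Unlike the regex versions, this does not care about attribute order, quoting style, or whitespace, since the parser normalizes all of that before the attribute is removed.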

Botond Balázs