This has been bugging me all day. Take this simple, invalid HTML:

<p clneck="something">my neck hurts</p>

Now I would like to use preg_replace to replace neck with head.

Of course, a simple

preg_replace("/neck/", "head")

would give me

<p clhead="something">my head hurts</p>

I guess you get the point: only the text should change, not the attribute name.

I tried the built-in DOMDocument, but it failed twice: it's not built for HTML5, and it still failed on some heavily nested tags.

chris85
Juergen Schulze
  • How did it fail? What did you try? Why did `ass` become `head` with that regex? These also aren't multibyte characters.. – chris85 Mar 24 '16 at 19:12
  • First of all, check how to use [preg_replace](http://www.tutorialspoint.com/php/php_preg_replace.htm). You are missing the third parameter. @chris85: I edited the question, so it's neck now :D –  Mar 24 '16 at 19:14
  • @noob aha, makes a bit more sense now. I modified the `class` attribute to match the regex behavior. Still unclear on the multibyte issue and the OP's usage. – chris85 Mar 24 '16 at 19:20
  • Why don't you use a DOM parser? http://simplehtmldom.sourceforge.net/ – Claudio King Mar 25 '16 at 09:12

1 Answer

This can be done with backreferences such as $1.

Simple example just handling p tags:

$input = '<p clneck="something">my neck hurts</p>';
$output = preg_replace('/(<p\s+[^>]+>[^<]*)neck([^<]*<\/p>)/i', '$1head$2', $input);
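
A quick check of what this pattern does and does not cover may help. The sketch below reuses the same regex; the variable names and sample strings are made up for illustration. Because `[^<]*` is greedy, only the last "neck" inside an element is replaced, and a `<p>` without attributes is not matched at all, since the pattern requires `\s+` after the tag name.

```php
<?php
// Same pattern as above, applied to two hypothetical inputs.
$pattern = '/(<p\s+[^>]+>[^<]*)neck([^<]*<\/p>)/i';

// Two occurrences in one element: only the last one is replaced.
$twice = '<p class="a">neck and neck</p>';
$r1 = preg_replace($pattern, '$1head$2', $twice);
echo $r1, "\n"; // <p class="a">neck and head</p>

// No attributes on the tag: the pattern does not match, input is unchanged.
$bare = '<p>my neck hurts</p>';
$r2 = preg_replace($pattern, '$1head$2', $bare);
echo $r2, "\n"; // <p>my neck hurts</p>
```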

Handling all tags is a little more complicated, because the pattern itself also needs a backreference (\\2) so that the closing tag matches the opening one:

$input = '<p clneck="something">my neck hurts</p><div idneck="foo">my neck hurts, too</div>';
$output = preg_replace('/(<(\w+)(\s+[^>]+)>[^<]*)neck([^<]*<\/\\2>)/i', '$1head$4', $input);
echo $output;
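
If the backreference approach turns out to be too narrow (repeated occurrences, tags without attributes), one possible alternative is to split the input into tags and text and replace only in the text parts. This is a minimal sketch, not a full HTML parser: `replaceOutsideTags` is a hypothetical helper name, and it assumes that no attribute value, comment, or script block contains a literal `>`.

```php
<?php
// Hypothetical helper (not part of the original answer): replaces $search
// with $replace only in the text between tags, never inside a tag.
function replaceOutsideTags(string $html, string $search, string $replace): string
{
    // Split on tags and keep them in the result via PREG_SPLIT_DELIM_CAPTURE.
    // The resulting array alternates: text, tag, text, tag, ...
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // Even indexes hold text, odd indexes hold the captured tags.
        if ($i % 2 === 0) {
            $parts[$i] = str_replace($search, $replace, $part);
        }
    }

    return implode('', $parts);
}

$input = '<p clneck="something">my neck hurts</p><b>neck and neck</b>';
echo replaceOutsideTags($input, 'neck', 'head');
// <p clneck="something">my head hurts</p><b>head and head</b>
```

Splitting first keeps the replacement itself a plain `str_replace`, so every occurrence in the text is handled and the tag names and attributes are left untouched.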
maxhb