I want to replace only each occurrence of

<span class="google-src-text" style="direction: ltr; text-align: left">any character</span>

with a space, line by line, in this source: http://persianfox.ir/html.html. My PHP code is:

$content = file_get_contents('path/to/html.html');
$content = str_replace('>', ">\n", $content);

echo preg_replace('/<span class="google-src-text" style="direction: ltr; text-align: left">.*.<\/span>/', ' ', $content);

But this code replaces everything from the first <span class="google-src-text" style="direction: ltr; text-align: left"> to the last </span>.

Mokhtarabadi

2 Answers

This one works as long as the "any character" part contains no HTML.

/<span class="google-src-text" style="direction: ltr; text-align: left">([^<]{1,})<\/span>/
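
A minimal sketch of how this pattern could be used with preg_replace. The sample string in $content is made up for illustration and is not taken from the linked page.

```php
<?php
// Hypothetical sample input: one "source" span followed by an ordinary span.
$content = '<p><span class="google-src-text" style="direction: ltr; text-align: left">Hello</span>'
         . '<span>translated</span></p>';

// [^<]{1,} matches one or more characters that are not "<",
// so a match can never run past the first tag it meets.
$pattern = '/<span class="google-src-text" style="direction: ltr; text-align: left">([^<]{1,})<\/span>/';

$result = preg_replace($pattern, ' ', $content);

echo $result, "\n"; // prints: <p> <span>translated</span></p>
```

Because the character class excludes "<", a span that contains any tag (or is empty) is simply left untouched.
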
Andresch Serj

The * quantifier is greedy by default. You need to make it lazy by adding a question mark after it, like so:

preg_replace('/<span class="google-src-text" style="direction: ltr; text-align: left">.*?<\/span>/', ' ', $content);
//                                                               Note the question mark ^

This will match only up to the first </span>. Note that if you have a nested span inside, the match stops at the inner </span> and will not reach the end of the outer span.
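
To make the difference concrete, here is a small sketch comparing the greedy and lazy patterns, plus the nested-span case. The input strings are invented for illustration.

```php
<?php
$open = '<span class="google-src-text" style="direction: ltr; text-align: left">';

// Hypothetical input: two source spans on one line with other markup between them.
$content = $open . 'one</span><b>keep</b>' . $open . 'two</span>';

$greedyPattern = '/<span class="google-src-text" style="direction: ltr; text-align: left">.*<\/span>/';
$lazyPattern   = '/<span class="google-src-text" style="direction: ltr; text-align: left">.*?<\/span>/';

// Greedy: .* runs to the LAST </span> on the line, swallowing <b>keep</b> too.
$greedy = preg_replace($greedyPattern, ' ', $content);

// Lazy: .*? stops at the FIRST </span>, so each span is replaced separately.
$lazy = preg_replace($lazyPattern, ' ', $content);

// Nested span: the lazy match ends at the inner </span>, leaving a stray tail.
$nested = preg_replace($lazyPattern, ' ', $open . 'a<span>b</span>c</span>');

var_dump($greedy); // string(1) " "
var_dump($lazy);   // string(13) " <b>keep</b> "
var_dump($nested); // string(9) " c</span>"
```

The last result shows the limitation: the leftover "c</span>" is no longer balanced markup.
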

That's why you shouldn't parse HTML with regex and should instead use a proper HTML DOM parser.
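
As a rough sketch of the DOM approach, assuming PHP's bundled DOM extension (DOMDocument and DOMXPath) is available, something like the following could work. The sample markup and the choice to select spans by their class attribute are my assumptions, not something taken from the linked page.

```php
<?php
// Hypothetical sample: a source span (with a nested span) next to a translated span.
$html = '<div><span class="google-src-text" style="direction: ltr; text-align: left">'
      . 'source <span>nested</span> text</span><span>translated</span></div>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them for imperfect markup.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Copy the matches into an array so the loop does not depend on the
// node list while the tree is being modified.
$spans = iterator_to_array($xpath->query('//span[@class="google-src-text"]'));

foreach ($spans as $span) {
    // Replace the whole element, nested children included, with a single space.
    $span->parentNode->replaceChild($doc->createTextNode(' '), $span);
}

// Serialize only the <div>, since loadHTML wraps fragments in <html><body>.
$div = $doc->getElementsByTagName('div')->item(0);
$result = $doc->saveHTML($div);

echo $result, "\n";
```

Unlike the regex, this removes the outer span together with anything nested in it. One caveat to check for a real page: loadHTML guesses the encoding from the document itself, so non-ASCII text may need the charset declared in the markup.
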

Madara's Ghost