
I need to do two things:

  1. Remove all hyperlinks that point to mydomain.com and keep all hyperlinks that point to any other domain.

  2. For each link that remains, take the text between the <a> and </a> tags and add it to the tag as its id attribute.

1. About 1st task:

I have this:

$str = 'I have been searching <a href="http://www.google.com">Google</a> for all the valuable information. I have also tried <a href="http://www.yahoo.com">Yahoo</a> and I finally, ended up finding it at
<font size="1">My Site <a style="color:#0000ff;font-family:Arial,Helvetica,sans-serif" href="http://www.mydomain.com/go.php?offer=fine&amp;pid=10" target="_blank" >My Link</a></font>. So you can visit <a href="http://www.mydomain.com/go.php?offer=ok" target="_blank">My Link</a>'; 

I want this:

$str = 'I have been searching <a href="http://www.google.com">Google</a> for all the valuable information. I have also tried <a href="http://www.yahoo.com">Yahoo</a> and I finally, ended up finding it at . So you can visit '; 

What I tried:

I tried the following preg_replace but it removes all the links. I just want it to remove all links from mydomain.com and retain everything else as it is.

$pattern = "/<a[^>]*>(.*)<\/a>/iU";
$final_str = preg_replace($pattern, "$1", $str);

2. About 2nd task:

Finally, I want to end up with this:

$str = 'I have been searching <a href="http://www.google.com" id="Google">Google</a> for all the valuable information. I have also tried <a href="http://www.yahoo.com" id="Yahoo">Yahoo</a> and I finally, ended up finding it at . So you can visit '; 
Devner

1 Answer


This should do the trick in 2 steps:

<?php

$str = 'I have been searching <a href="http://www.google.com">Google</a> for all the valuable information. I have also tried <a href="http://www.yahoo.com">Yahoo</a> and I finally, ended up finding it at <font size="1">My Site <a style="color:#0000ff;font-family:Arial,Helvetica,sans-serif" href="http://www.mydomain.com/go.php?offer=fine&amp;pid=10" target="_blank" >My Link</a></font>. So you can visit <a href="http://www.mydomain.com/go.php?offer=ok" target="_blank">My Link</a>';

// Step 1: remove links that point to www.mydomain.com (dots escaped so they match literally)
$pattern1 = '|<a [^>]*href="http://www\.mydomain\.com[^"]*"[^>]*>.*</a>|iU';
$str = preg_replace($pattern1, '', $str);

// Step 2: copy each remaining link's text into an id attribute
$pattern2 = '|(<a [^>]+)>(.*)</a>|iU';
$str = preg_replace($pattern2, '$1 id="$2">$2</a>', $str);

Note that this leaves the <font size="1">My Site </font> wrapper in place. Let me know if you also need to get rid of that part.
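
If you need this in more than one place, the two steps can be wrapped in a function. The following is only a sketch under a few assumptions: the function name cleanLinks and its parameters are made up for illustration, the optional "www." and "https" handling is my own addition, and the link text is escaped with htmlspecialchars before it goes into the id attribute so that quotes in the text cannot break the markup.

```php
<?php

// Hypothetical helper (not part of the original answer): removes links that
// point at $domain, then adds id="<link text>" to every remaining link.
function cleanLinks(string $html, string $domain): string
{
    // Escape the domain so its dots are matched literally.
    $host = preg_quote($domain, '~');

    // Step 1: drop whole <a>...</a> elements whose href points at the domain.
    // Assumption: http or https, with or without a leading "www.".
    $remove = '~<a\s[^>]*href="https?://(?:www\.)?' . $host . '[/"?][^>]*>.*?</a>~is';
    $html = preg_replace($remove, '', $html);

    // Step 2: append an id attribute built from the link text.
    return preg_replace_callback(
        '~(<a\s[^>]+)>(.*?)</a>~is',
        function (array $m): string {
            // Strip nested tags and escape quotes before using the text as an id.
            $id = htmlspecialchars(trim(strip_tags($m[2])), ENT_QUOTES);
            return $m[1] . ' id="' . $id . '">' . $m[2] . '</a>';
        },
        $html
    );
}
```

Calling cleanLinks($str, 'mydomain.com') on the string above should behave like the two preg_replace calls, so the <font size="1">My Site </font> wrapper is still left behind. For anything beyond simple, well-formed markup like this, a real HTML parser such as DOMDocument is usually the safer choice.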

Geo