
I want to remove hyperlinks in php

Here is the html

<a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a>

I want to remove everything shown above if clickansave.net is found in the href. I need a solution that uses preg_replace and not DOM, for the following reason:

I know the exact structure of the HTML to be deleted, and there is only one occurrence on the page. DOM would be overkill in this case.

I tried the following

First I started by removing the image and the closing `</a>` tag:

$input = preg_replace('#<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"><\/a>#s', '' , $input,1);

From there I thought of this regex, which of course is not working:

$input = preg_replace('#<a.*?<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"><\/a>#s', '' , $input,1);
Andy Lester
user2650277
  • Overkill in terms of what? Would you be able to implement it using DOM? – zerkms Apr 11 '14 at 05:36
  • What have you tried? You seem to be just asking for someone to write the code for you. – Devon Bessemer Apr 11 '14 at 05:36
  • @Devon I just added what I have tried above – user2650277 Apr 11 '14 at 06:00
  • What is the `img` tag doing in your so-called regular expression? – zerkms Apr 11 '14 at 07:35
  • @zerkms I want to remove the whole HTML that I posted, not just the `href` – user2650277 Apr 11 '14 at 15:14
  • **Don't use regular expressions to parse HTML. Use a proper HTML parsing module.** You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See [this SO thread](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged. – Andy Lester Apr 11 '14 at 15:17
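
For comparison with the regex answers, here is a minimal sketch of the parser-based approach this comment recommends, using PHP's bundled DOMDocument and DOMXPath classes. The sample `$html` wrapper paragraph and the variable names are assumptions made for illustration; only the anchor markup comes from the question.

```php
<?php
// Hypothetical sample input: the anchor from the question inside a paragraph.
$html = '<p>Before <a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> after</p>';

$doc = new DOMDocument();

// Collect parser warnings instead of printing them for sloppy real-world HTML.
libxml_use_internal_errors(true);
// The two flags stop libxml from adding a doctype and <html>/<body> wrappers.
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Copy the node list into an array first, because removing nodes while
// iterating a live DOMNodeList can skip elements.
$links = iterator_to_array($xpath->query('//a[contains(@href, "clickansave.net")]'));
foreach ($links as $link) {
    $link->parentNode->removeChild($link);
}

$output = $doc->saveHTML();
print $output;
```

The XPath test looks only at the `href` attribute, so attribute order, quoting style and line breaks inside the tag do not affect the match.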

2 Answers


What about something like this?

$string = 'This is a string <a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> of text.  There are <A HREF="http://www.google.com">Lots Of Links</A> to find and replace.';

$string = preg_replace('~<a.*?</a>~i', 'NO_LINK_HERE', $string);

print $string;

This will output the following:

This is a string NO_LINK_HERE of text.  There are NO_LINK_HERE to find and replace.

EDIT:

Sorry, I hadn't noticed the requirement to replace only clickansave.net URLs. Use this preg_replace instead to do that.

$string = preg_replace('~<a.*?clickansave\.net.*?</a>~i', 'NO_LINK_HERE', $string);

print $string;

That will give you this output:

This is a string NO_LINK_HERE of text.  There are <A HREF="http://www.google.com">Lots Of Links</A> to find and replace.
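
One caveat worth illustrating: the lazy `.*?` after `<a` can still run across other links, so if an unrelated link appears earlier on the same line, the match starts at that earlier `<a`. The sketch below contrasts that with a tighter pattern that keeps the domain test inside a single opening tag. The sample string and the names `$loose`, `$tight` and `$pattern` are assumptions made for illustration, and the tighter pattern assumes a double-quoted `href`.

```php
<?php
// Hypothetical input: an unrelated link placed BEFORE the clickansave.net link.
$string = 'See <a href="http://www.google.com">Google</a> first, then '
    . '<a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank">'
    . '<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> here.';

// Loose pattern: matching starts at the first <a, so the Google link is removed too.
$loose = preg_replace('~<a.*?clickansave\.net.*?</a>~i', '', $string);

// Tighter pattern:
//   [^>]*  cannot leave the opening tag, so href may sit anywhere inside it
//   [^"]*  cannot leave the quoted href value
//   s flag lets .*? span line breaks between the opening tag and </a>
$pattern = '~<a\b[^>]*\bhref="[^"]*clickansave\.net[^"]*"[^>]*>.*?</a>~is';

// Limit of 1, since only one occurrence is expected on the page.
$tight = preg_replace($pattern, '', $string, 1);

print $loose . "\n";
print $tight . "\n";
```

With this input the first call also deletes the Google link, while the second removes only the clickansave.net anchor and its image.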
Quixrick

Suppose this is your string, with a hyperlink wrapping an image or text, whose href contains the domain example.net:

$string = '<a href="http://www.example.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a>';

If you want to remove the hyperlink when it contains example.net, while keeping what it wraps, use

$pattern = '~(<a href="[^"]*example\.net[^"]*" [^>]*>)\s*(.+)\s*(</a>)$~';
//                          1                              2       3

$result = preg_replace($pattern, '$2', $string);

Now $result will contain the image or text that was between the anchor (<a ..) tags.

If you want to remove any hyperlink, use

$pattern = '~(<a href="[^"]*" [^>]*>)\s*(.+)\s*(</a>)$~';
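
Both patterns above expect `href` to be the first attribute and the link to end the string (because of the `$` anchor), so they would not match the markup in the question, where `rel="nofollow"` comes first. A minimal sketch of a more tolerant variant follows; the sample string and variable names are assumptions made for illustration, and a double-quoted `href` is still assumed.

```php
<?php
// Hypothetical input: href is not the first attribute and text follows the link.
$string = 'Get it: <a rel="nofollow" href="http://www.example.net/download/somethingelse" title="Download Now" target="_blank">'
    . '<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> today.';

// [^>]* on both sides of href allows any attribute order within the tag;
// (.*?) lazily captures the content up to the nearest closing </a>.
$pattern = '~<a\b[^>]*\bhref="[^"]*example\.net[^"]*"[^>]*>(.*?)</a>~is';

// Replace the whole anchor with group 1, i.e. keep only what it wrapped.
$result = preg_replace($pattern, '$1', $string);

print $result;
```

Here only one capture group is needed, since the opening and closing tags are discarded rather than reused.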
Vitthal