Removing urls with preg_replace

Question

I want to remove hyperlinks in PHP.

Here is the HTML:

<a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a>

I want to remove everything shown above if clickansave.net is found in the href. I need a solution that uses preg_replace rather than DOM, for the following reason:

I know the exact structure of the HTML to be deleted, and there is only one occurrence on the page. DOM would be overkill in this case.

I tried the following

First I started by removing the `<img>` tag and the closing `</a>`:

$input = preg_replace('#<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"><\/a>#s', '' , $input,1);

From there I thought of this regex, which of course does not work:

$input = preg_replace('#<a.*?<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"><\/a>#s', '' , $input,1);

Overkill in terms of what? Would you be able to implement it using DOM? — zerkms, Apr 11 '14 at 05:36
What have you tried? You seem to be just asking for someone to write the code for you. — Devon Bessemer, Apr 11 '14 at 05:36
What is the `img` tag doing in your so-called regular expression? — zerkms, Apr 11 '14 at 07:35
@zerkms I want to remove the whole HTML that I posted, not just the `href` — user2650277, Apr 11 '14 at 15:14
**Don't use regular expressions to parse HTML. Use a proper HTML parsing module.** You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See [this SO thread](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged. — Andy Lester, Apr 11 '14 at 15:17
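
For comparison, the parser-based route suggested in the comments could look roughly like this. It is a sketch under assumptions, not the asker's code: the sample `$html` and the names `$doc`, `$anchors`, and `$clean` are invented here, and it relies on PHP's bundled DOM extension (`DOMDocument`).

```php
<?php
// Hypothetical sample input; only the clickansave.net link should be removed.
$html = '<p>Intro <a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> and <a href="http://www.google.com">a normal link</a>.</p>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them for non-strict HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Copy the live node list into an array before removing nodes from the document.
$anchors = iterator_to_array($doc->getElementsByTagName('a'));
foreach ($anchors as $a) {
    if (strpos($a->getAttribute('href'), 'clickansave.net') !== false) {
        // Removing the <a> element also removes the <img> nested inside it.
        $a->parentNode->removeChild($a);
    }
}

$clean = $doc->saveHTML();
print $clean;
```

Note that `saveHTML()` serializes a whole document, so the output is wrapped in a doctype and `<html><body>` elements that the input fragment did not contain.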

Quixrick · Accepted Answer · 2014-04-11T15:23:17.107

What about something like this?

$string = 'This is a string <a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> of text.  There are <A HREF="http://www.google.com">Lots Of Links</A> to find and replace.';

$string = preg_replace('~<a.*?</a>~i', 'NO_LINK_HERE', $string);

print $string;

This will output the following:

This is a string NO_LINK_HERE of text.  There are NO_LINK_HERE to find and replace.

EDIT:

Sorry, I hadn't noticed the requirement to only replace clickansave.net URLs. Use this preg_replace instead:

$string = preg_replace('~<a.*?clickansave\.net.*?</a>~i', 'NO_LINK_HERE', $string);

print $string;

That will give you this output:

This is a string NO_LINK_HERE of text.  There are <A HREF="http://www.google.com">Lots Of Links</A> to find and replace.
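
One caveat with a pattern of the shape `<a.*?clickansave\.net.*?</a>`: the lazy `.*?` still begins at the first `<a` it finds, so an unrelated link that appears before the clickansave.net one would be swallowed along with it. The following is a minimal sketch, not a definitive fix, that confines the domain test to the opening tag's `href` value and deletes the match outright, as the question asks. The sample text and the names `$text`, `$pattern`, and `$cleaned` are made up for illustration, and it assumes the `href` is double-quoted as in the question's HTML.

```php
<?php
// Hypothetical input: an unrelated link comes BEFORE the one to remove.
$text = 'See <a href="http://www.google.com">Google</a> or '
      . '<a rel="nofollow" href="http://www.clickansave.net/download/somethingelse" title="Download Now" target="_blank">'
      . '<img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a> here.';

// <a\b     : the tag name "a" only (not <abbr>, <article>, ...)
// [^>]*    : stay inside the opening tag while looking for href
// [^"]*    : stay inside the quoted href value
// .*?</a>  : lazily consume the link body up to the first closing tag
// i = case-insensitive, s = let "." match newlines as well
$pattern = '~<a\b[^>]*\bhref="[^"]*clickansave\.net[^"]*"[^>]*>.*?</a>~is';

// An empty replacement deletes the block; limit 1 because one occurrence is expected.
$cleaned = preg_replace($pattern, '', $text, 1);

print $cleaned;
```

On this input, the Google link survives and only the clickansave.net block is removed. The looser pattern would start matching at the first `<a` and take the Google link with it.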

Vitthal · Answer 2 · 2014-04-11T10:35:13.517

Suppose this is your string, with a hyperlink around an image or text, whose URL contains the domain example.net:

$string = '<a href="http://www.example.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a>';

If you want to remove the hyperlink only when it contains example.net, use:

$pattern = '~(<a href="[^"]*example\.net[^"]*" [^>]*>)\s*(.+)\s*(</a>)$~';
//                          1                              2       3

$result = preg_replace($pattern, '$2', $string);

Now $result will contain the image or text that was between the anchor tags (<a ...> and </a>).
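
To make the effect of the `$2` replacement concrete, here is a small sketch using the string and pattern from this answer (with the dot in `example\.net` escaped). The variable `$removed` is added purely for illustration. It contrasts keeping the captured inner content with passing an empty replacement, which drops the whole anchor.

```php
<?php
$string = '<a href="http://www.example.net/download/somethingelse" title="Download Now" target="_blank"><img src="http://banners.coolmirage.com/download_bt3.png" border="0" alt="Download"></a>';

$pattern = '~(<a href="[^"]*example\.net[^"]*" [^>]*>)\s*(.+)\s*(</a>)$~';

// '$2' keeps only what group 2 captured: the content between the tags.
$result = preg_replace($pattern, '$2', $string);
print $result . "\n";

// An empty replacement removes the anchor together with its content.
$removed = preg_replace($pattern, '', $string);
var_dump($removed);
```

Because the pattern expects `href` as the first attribute and ends with `$`, it only matches when the link is shaped like the sample and sits at the end of the string.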

If you want to remove any hyperlink, use

$pattern = '~(<a href="[^"]*" [^>]*>)\s*(.+)\s*(</a>)$~';

I want to remove the **whole** HTML that I have posted, not just the hyperlink — user2650277, Apr 11 '14 at 15:13
