Hi, I have the following HTML and want to pull out all the links that do not point to http://dont-match.co.uk. The URLs to be matched are all different, while the ones not to be matched are all the same, so I'm thinking along the lines of a negative match, i.e. match everything that is not http://dont-match.co.uk.

<a href="http://match-this-url.com/">link text</a> some 
text <a href="http://match-this-diff-url.com/">link text</a> more 
text <a href="http://dont-match.co.uk/">link text</a> 
text <a href="http://match-this-different-url.com/">link text</a> 
text <a href="http://dont-match.co.uk/">link text</a>

This is what I have so far:

/(<a href="http:\/\/[dont-match.co.uk]\/[^\"]*">([\d\D]*?)<\/a>)/
hakre

1 Answer

Use a negative lookahead, `(?!...)`, where `...` is the expression that must not match at that position:

preg_match_all('/(<a href="http:\/\/(?!dont-match\.co\.uk).*?\/[^"]*">(.*?)<\/a>)/', $str, $matches);
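
To see the lookahead at work on the markup from the question, here is a minimal, self-contained sketch. It is a variant of the line above, not the exact same pattern: it assumes `~` as the delimiter (so the slashes need no escaping), captures the URL in its own group, and uses `[^"]*` instead of `.*?\/` so the URL part cannot run past the closing quote of the `href` attribute. The variable names are only illustrative.

```php
<?php
// Sample markup copied from the question.
$str = '<a href="http://match-this-url.com/">link text</a> some '
     . 'text <a href="http://match-this-diff-url.com/">link text</a> more '
     . 'text <a href="http://dont-match.co.uk/">link text</a> '
     . 'text <a href="http://match-this-different-url.com/">link text</a> '
     . 'text <a href="http://dont-match.co.uk/">link text</a>';

// Assumed variant of the pattern above:
//   (?!dont-match\.co\.uk)  rejects the unwanted host right after "http://"
//   [^"]*                   keeps the URL inside the href attribute's quotes
$pattern = '~<a href="(http://(?!dont-match\.co\.uk)[^"]*)">(.*?)</a>~';

preg_match_all($pattern, $str, $matches);

// $matches[1] holds the captured URLs, $matches[2] the link texts.
$urls = $matches[1];
print_r($urls);
```

With the sample input, `$urls` should contain the three `match-this-...` URLs and neither of the `dont-match.co.uk` links. For markup that is less regular than this (other attributes, single quotes, different attribute order), an HTML parser is likely a safer choice than a regex.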
netcoder
  • FYI, you can use other characters than `/` as the regex delimiter. For example, `'~(<a href="http://(?!dont-match\.co\.uk).*?/[^"]*">(.*?)</a>)~'` works just as well, and now you don't have to escape the slashes in the regex. – Alan Moore Jan 29 '11 at 04:42
  • Looks like this answer is wrong. See the follow up (duplicate) question by the OP: [how do i stop this reg exp matching all this](http://stackoverflow.com/questions/4837313/how-do-i-stop-this-reg-exp-matching-all-this) (Not saying it's trivial, you might find this answer useful: http://stackoverflow.com/a/13994665/367456) – hakre Dec 24 '12 at 16:21
  • @netcoder, do you see some reasoning why the OP opened a second question that looks like duplicate later claiming your regex here does not work? Not saying that it doesn't work in context of this question here, I was just wondering about the coincidence of the duplicate question. – hakre Dec 25 '12 at 02:53
  • Oh yes, was even answered. Too late for me. Thanks for the feedback and sorry for the interruption. – hakre Dec 25 '12 at 02:59