I want to use a regex to match a specified domain in my post content and then remove it. This is my regex, but I can't get it to work:
/<a[^>]+href[^(*domain.com)>]+>(.*)<\/a>/

I want to remove the a tag whenever the specified domain occurs, even if it appears as domain.com?a=something or in any other format.
Is that possible?

  • Now, repeat after me: HTML cannot be parsed using regex. HTML cannot be parsed using regex. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – slugonamission Dec 11 '12 at 11:22
  • use an [html parser](http://en.wikipedia.org/wiki/Comparison_of_HTML_parsers) –  Dec 11 '12 at 13:04

1 Answer

[^(*domain.com)>]

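This won’t do what you expect. Square brackets define a character class, which matches a single character, so this matches any one character that is not one of the following: ( ) > * . a c d i m n o. It does not exclude the string domain.com as a whole.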
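
To see this in practice, here is a minimal sketch in Python (the question does not say which language is used, so Python is an assumption here; its `re` module treats this character class the same way). The helper name `excluded` and the test string are made up for illustration.

```python
import re

# The class from the question: a negated set of single characters,
# not a negation of the word "domain.com".
char_class = re.compile(r"[^(*domain.com)>]")

def excluded(ch: str) -> bool:
    """Return True if the class refuses to match this single character."""
    return char_class.fullmatch(ch) is None

# Try a handful of characters and collect the ones the class rejects.
rejected = sorted(ch for ch in "()>*.abcdeimnoxyz/" if excluded(ch))
print("".join(rejected))
```

Characters such as b, e or / are still matched, which is why the original pattern happily runs across URLs that contain domain.com.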

What you would want to do is match exactly domain.com.

/<a[^>]+href=[^>]+domain\.com[^>]*>(.*?)<\/a>/
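
As a rough sketch of how a pattern along these lines behaves, here is a Python version (Python is an assumption, as are the names `DOMAIN`, `link_re`, `strip_links` and the sample markup). It escapes the dot, uses a lazy `(.*?)` so one match does not swallow several links, and can optionally keep the link text. Note that it would also match hosts such as notdomain.com, since it only looks for the substring.

```python
import re

# Hypothetical domain to strip; re.escape makes "." match a literal dot.
DOMAIN = "domain.com"

link_re = re.compile(
    r"<a\b[^>]+href=[^>]*" + re.escape(DOMAIN) + r"[^>]*>(.*?)</a>",
    re.IGNORECASE | re.DOTALL,
)

def strip_links(html: str, keep_text: bool = False) -> str:
    """Remove matching <a> elements; optionally keep their inner text."""
    return link_re.sub(r"\1" if keep_text else "", html)

sample = (
    '<p>See <a href="http://domain.com?a=something">this</a> and '
    '<a href="http://example.org/">that</a>.</p>'
)
print(strip_links(sample))
print(strip_links(sample, keep_text=True))
```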

But then again, don’t use regular expressions to parse HTML.
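
For comparison, a parser-based approach could look roughly like the following. This is only a sketch using Python's standard `html.parser` and `urllib.parse` modules (again, Python is an assumption; `LinkStripper` and `remove_links` are hypothetical names). It compares the actual host name of each href, so notdomain.com is left alone. It drops comments and doctype declarations, and it ignores hrefs without a scheme, so treat it as a starting point rather than a finished tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkStripper(HTMLParser):
    """Re-emit HTML while dropping <a> elements that point at one domain."""

    def __init__(self, domain: str):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.domain = domain
        self.out = []
        self.skip_depth = 0  # > 0 while inside a link being removed

    def _is_target(self, attrs) -> bool:
        href = dict(attrs).get("href") or ""
        host = urlparse(href).hostname or ""
        return host == self.domain or host.endswith("." + self.domain)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag == "a":
                self.skip_depth += 1
            return
        if tag == "a" and self._is_target(attrs):
            self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied verbatim.
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag == "a":
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

def remove_links(html: str, domain: str) -> str:
    parser = LinkStripper(domain)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

sample_html = (
    '<p>See <a href="http://domain.com?a=something">this</a> and '
    '<a href="http://notdomain.com/">that</a> &amp; more.</p>'
)
print(remove_links(sample_html, "domain.com"))
```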

poke