With a regex, I need to find and replace all the email addresses in a fully rendered HTML page, because I want to spam-protect all of them. To be precise, I want all addresses except those in form elements (because if validation of a user input fails, I still want to display the entered email address and not a replaced one).

Finding or writing a regex that simply searches for email addresses is not a problem. The problem is excluding the ones in form elements. Does anyone have a suggestion for how to solve this? Is it possible with a regex?

Some examples: I want to match "...My content, mail@mail.com, more content......" But I don't want to match: "...Your mail:mail@mail.com..."

I know it would be better to parse the HTML and simply skip form elements, but performance matters and, as I said before, this task is performed every time the website is called...

Thanks for your help!

Ben

1 Answer

It's probably impossible. To start with, see: RegEx match open tags except XHTML self-contained tags. Second, regex doesn't do a very good job of "not". (Some regex engines support it, some don't, but all are slow at it.) Perhaps someone who is much better at regex than I am might be able to help you, but I suspect that doing this is impossible.
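
That said, if you can live with some assumptions, one workaround that avoids "not" altogether is to let the pattern match the form controls as well, and only rewrite the matches that are not form controls. The following is only a rough sketch in Python, not a robust solution: it assumes the controls are well-formed `<input>` and `<textarea>` tags with no `>` inside attribute values, the email pattern is deliberately simplistic, and `obfuscate` and `protect` are hypothetical names made up for this illustration.

```python
import re

# Assumption: form controls are well-formed and contain no ">" inside
# attribute values. The "skip" branch consumes whole controls, so the
# "mail" branch never gets to see addresses inside them.
PATTERN = re.compile(
    r"""
    (?P<skip>
        <input\b[^>]*>                        # single-tag controls
      | <textarea\b[^>]*>.*?</textarea\s*>    # textarea including its content
    )
    | (?P<mail>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)  # simplistic address pattern
    """,
    re.IGNORECASE | re.DOTALL | re.VERBOSE,
)


def obfuscate(address: str) -> str:
    # Hypothetical obfuscation; swap in whatever protection you use.
    return address.replace("@", " [at] ").replace(".", " [dot] ")


def protect(html: str) -> str:
    def repl(match: re.Match) -> str:
        if match.group("skip") is not None:
            # Form control: put it back unchanged.
            return match.group("skip")
        return obfuscate(match.group("mail"))

    return PATTERN.sub(repl, html)


page = 'My content, mail@mail.com, more <input type="text" value="mail@mail.com">'
print(protect(page))
```

This is a single pass over the page, so it should be cheaper than building a full DOM, but it will still break on markup that violates the assumptions above.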

Ariel