I'm currently matching email addresses with the regex below (and turning them into mailto links):

/([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,64})/

However, I do not want to link anything that is already a mailto link or that sits inside an existing link (e.g. index.php?email=example@example.com); otherwise the markup gets mangled like this:

<a href="mailto:<a href="mailto:example@example.com">example@example.com</a>"><a href="mailto:example@example.com">example@example.com</a></a>

Update

Here's an example of PHP I'm using to find the email addresses (sorry, didn't think it was required):

$input = "example@example.com<br><br><a href='mailto:example@example.com'>test email</a><br><br><a href='mailto:example@example.com'>example@example.com</a>";

$output = preg_replace("/([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,64})/i", "<a href='mailto:$1'>$1</a>", $input);

//output
<a href="mailto:example@example.com">example@example.com</a><br><br><a mailto:example@example.com'="" href="mailto:&lt;a href=">example@example.com</a>'&gt;test email<br><br><a mailto:example@example.com'="" href="mailto:&lt;a href=">example@example.com</a>'&gt;<a href="mailto:example@example.com">example@example.com</a>

Update 2

I also have a further regex question (if possible): I use the regex below to make all links open in a new window. However, I do not want mailto links to open in a new window. Is it possible to exclude mailto links?

$output = preg_replace("/<(a)([^>]+)>/i", "<\\1 target=\"_blank\"\\2>", str_replace('target="_blank"', '', $output));
MrJ
  • You'll need to provide the code snippet where you actually perform linking. Simple regex pattern is not enough. – Nikola Malešević Sep 03 '12 at 12:47
  • didn't think it was required (as regex is regex), now included – MrJ Sep 03 '12 at 12:51
  • Not every regex is a regex. Different languages allow different functionalities. For example, lookbehinds work in .NET, but not in JavaScript. That's why we need to know what language you are using to process regex. – Nikola Malešević Sep 07 '12 at 11:14

1 Answer

How about this regex?

/((?<!mailto:|=|[a-zA-Z0-9._%+-])[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,64}(?![a-zA-Z]|<\/[aA]>))/
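
As a rough sketch of how this could be wired up in PHP: the first function applies the pattern above (retyped in plain ASCII) with `preg_replace`, using the sample input from the question. The second function is only a guess at one way to handle "Update 2": it uses a negative lookahead to skip anchors whose `href` starts with `mailto:`. The function names `linkifyEmails` and `addBlankTarget` are made up for this example, and the second pattern assumes the `href` value is quoted.

```php
<?php
// Hypothetical helper: wrap bare email addresses in mailto links.
// Lookbehind: skip matches preceded by "mailto:", "=", or another local-part character.
// Lookahead: skip matches followed by a letter or by a closing </a> tag.
function linkifyEmails($html)
{
    $pattern = '/((?<!mailto:|=|[a-zA-Z0-9._%+-])[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,64}(?![a-zA-Z]|<\/[aA]>))/';
    return preg_replace($pattern, "<a href='mailto:$1'>$1</a>", $html);
}

// Hypothetical helper for "Update 2": add target="_blank" to every <a> tag
// except those whose href starts with mailto: (assumes a quoted href).
function addBlankTarget($html)
{
    // Drop any existing target first, as in the question's code.
    $html = str_replace('target="_blank"', '', $html);
    $pattern = '/<a\b(?![^>]*href=[\'"]mailto:)([^>]*)>/i';
    return preg_replace($pattern, '<a target="_blank"$1>', $html);
}

// Sample input from the question.
$input = "example@example.com<br><br><a href='mailto:example@example.com'>test email</a><br><br><a href='mailto:example@example.com'>example@example.com</a>";

$linked = linkifyEmails($input);
echo $linked, PHP_EOL;

$withTargets = addBlankTarget('<a href="http://example.com">site</a> ' . $linked);
echo $withTargets, PHP_EOL;
```

With the question's input, only the first, bare address should end up wrapped; the two existing mailto links should be left as they were. Note that the lookbehind on `=` also skips any address that directly follows an equals sign in ordinary text, which may or may not be what you want.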
Nikola Malešević
  • I've given this a try but it does not match anything, in fact, it seems to remove everything for some reason `$data = "example@example.com"; echo preg_replace("/((?<!(?:mailto:|=)[a-zA-Z0-9._%+-]*)[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,64})/", "$1", $data);` As for UK addresses, it does match them so not sure where that came from . . . – MrJ Sep 04 '12 at 15:28
  • Sorry for that. I've updated the answer with another version of regex. Please try it out. – Nikola Malešević Sep 04 '12 at 22:30
  • +1, thank you! It worked when I checked it with http://gskinner.com/RegExr/ but in JavaScript I get an error when trying: new RegExp(/((?<!mailto:|=|[a-zA-Z0-9._%+-])[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,64}(?![a-zA-Z]|<\/[aA]>))/, "g"); Error: SyntaxError: Invalid regular expression: (regexp): Invalid group. What could it be? – ianaz Sep 07 '12 at 09:53
  • @ianaz OP is using `preg_replace` which supports **lookbehind** (`(?<!lookup_text)`). JavaScript does not support this functionality. – Nikola Malešević Sep 07 '12 at 11:00
  • Could you please help me to transform that in a js alternative? – ianaz Sep 07 '12 at 11:08
  • @ianaz If you have a specific problem, please create a new question where people can answer it. If you are the same person as MrJ, please log-in to that account and update this question. I'll be glad to try and work it out. – Nikola Malešević Sep 07 '12 at 11:12