How can I use regex to strip the email address from a mailto tag?

Question

I need to parse the email address out of a mailto tag. I am looking for a way to do this with a regex in C#.

Input:

<mailto:abc@xyz.com>

Expected output:

abc@xyz.com

Similar http://stackoverflow.com/questions/1376149/regexp-for-extracting-a-mailto-address — PiLHA, Jul 04 '13 at 13:10
Because I have a long HTML string, and the HTML contains lots of tags of this kind. — Manish Sharma, Jul 04 '13 at 13:11

score 2 · Accepted Answer · edited May 23 '17 at 12:16


In general, it's a very bad idea to use regular expressions for HTML parsing. Instead, take a look at the Html Agility Pack. For the specific input you provided, you may use:

(?<=\<mailto:).*(?=\>)

Here's a code sample:

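using System;
using System.Text.RegularExpressions;

var emailTag = "<mailto:abc@xyz.com>";
// The look-behind and look-ahead keep "<mailto:" and ">" out of the match.
var emailValue = Regex.Match(emailTag, @"(?<=\<mailto:).*(?=\>)").Value;
Console.WriteLine(emailValue); // prints abc@xyz.com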
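
Note that .* in the pattern above is greedy, so on a single line containing several tags it would run from the first <mailto: to the last >. A minimal sketch for the long-HTML case mentioned in the comments, assuming every tag has the exact form <mailto:address>. The helper name ExtractMailtoAddresses and the sample string are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns every address found in <mailto:...> tags.
// [^>]* stops at the first '>', so each tag is matched separately.
static List<string> ExtractMailtoAddresses(string html)
{
    var addresses = new List<string>();
    foreach (Match m in Regex.Matches(html, @"(?<=<mailto:)[^>]*(?=>)"))
    {
        addresses.Add(m.Value);
    }
    return addresses;
}

// Made-up sample input with two tags.
var html = "<p>Contact <mailto:abc@xyz.com> or <mailto:def@xyz.com></p>";
foreach (var address in ExtractMailtoAddresses(html))
{
    Console.WriteLine(address);
}
```

For this sample the loop prints abc@xyz.com and then def@xyz.com. For real HTML documents, a parser such as the Html Agility Pack is still likely to be the more robust choice.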

edited May 23 '17 at 12:16

Community

answered Jul 04 '13 at 13:13

Alex Filipovici

score 1 · Answer 2 · answered Jul 04 '13 at 13:13


A simple regex to capture everything inside a mailto tag would be

<mailto:(.*?)>
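
With this pattern the address is not the whole match but capture group 1. A short sketch of reading it in C#, using the tag from the question as input (the variable names are only illustrative):

```csharp
using System;
using System.Text.RegularExpressions;

var tag = "<mailto:abc@xyz.com>";

// The lazy (.*?) stops at the first '>'. The address is in group 1,
// while match.Value would still include "<mailto:" and ">".
var match = Regex.Match(tag, @"<mailto:(.*?)>");
var email = match.Success ? match.Groups[1].Value : string.Empty;

Console.WriteLine(email); // prints abc@xyz.com
```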

answered Jul 04 '13 at 13:13

Droxx

score 0 · Answer 3 · answered Jul 04 '13 at 13:37

You could use:

[\w\d]+\@[\w\d]+\.com

[\w\d] <---- Matches any single word character. \w matches any letter, digit, or underscore; \d matches any digit, so it is redundant here and \w alone would do.

+ <---- One or more of the previous item; in this case [\w\d]+ matches one or more letters, digits, or underscores.

\@ <---- Simply matches the @ symbol. @ is not a special character in a regex, so the backslash is optional, but it is harmless.

[\w\d]+ <---- Same again.

\. <---- Matches a literal dot. An unescaped . is a special character that matches any character, so it needs to be escaped with a \.

In your example:
[\w\d]+=abc
\@=@
[\w\d]+=xyz
\.=.
com=com

If you want to match special characters as well as letters and digits, just replace [\w\d]+ with [\S]+ (make sure the S is capital).

[\S]+ <---- Matches anything that is not whitespace.

You will need variations of the pattern to cover other domains such as .co.uk and .org.
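
One possible variation is sketched below. It assumes an address is a run of word characters, dots, plus signs, or hyphens, followed by @ and a domain of dot-separated labels. This is a loose approximation rather than a full email validator, and the sample tags are made up:

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed pattern: look-behind for "<mailto:", a local part, '@',
// then a domain made of two or more dot-separated labels.
var pattern = @"(?<=<mailto:)[\w.+-]+@[\w-]+(?:\.[\w-]+)+";

// Made-up sample tags covering .com, .co.uk and .org.
var samples = new[]
{
    "<mailto:abc@xyz.com>",
    "<mailto:john.doe@example.co.uk>",
    "<mailto:info@example.org>"
};

foreach (var sample in samples)
{
    Console.WriteLine(Regex.Match(sample, pattern).Value);
}
```

Because the domain part accepts any number of labels, the same pattern covers .com, .co.uk, and .org without listing each suffix.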

Sorry you will need a look behind also, so prefix this regex with (?<=\ — Srb1313711, Jul 04 '13 at 13:43
