What is the easiest way to find all links and e-mail addresses in a string and collect them into a list or array? I was using preg_match in PHP, but in C# it looks like it works quite differently.

Deniz Dogan
Semas

2 Answers

1

Assuming that you already have a working regular expression, you can use the Regex class, like this:

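// Requires: using System.Text.RegularExpressions;
// Matches http(s) URLs with a path, or simple e-mail-like tokens (a demonstration pattern, not a validator).
static readonly Regex linkFinder = new Regex(@"https?://[a-z0-9.]+/\S+|\S+@\S+\.\S+", RegexOptions.IgnoreCase);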

foreach (Match match in linkFinder.Matches(someString)) {
    // Do things with each match...
    string url = match.Value;    // the matched text
    int position = match.Index;  // zero-based offset of the match in someString
}
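
Since the question asks for the results as a list, here is a minimal sketch of collecting the matches into a `List<string>`. The class name `LinkExtractor` and the method name `FindAll` are made up for illustration, and the pattern is only a demonstration: it needs a path after the host and will happily swallow trailing punctuation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LinkExtractor
{
    // Illustrative pattern only: URLs with a path, or anything shaped like x@y.z.
    private static readonly Regex LinkFinder = new Regex(
        @"https?://[a-z0-9.]+/\S+|\S+@\S+\.\S+",
        RegexOptions.IgnoreCase);

    // Returns every match, in order of appearance, as a list of strings.
    public static List<string> FindAll(string input)
    {
        return LinkFinder.Matches(input)
                         .Cast<Match>()          // works whether or not MatchCollection is generic
                         .Select(m => m.Value)   // keep only the matched text
                         .ToList();
    }
}
```

Calling `LinkExtractor.FindAll(someString)` gives you a list you can index or pass around; call `.ToArray()` on it if you really need an array.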
SLaks
  • @serhio: `\S+` should match all that. I'm primarily trying to demonstrate how to use the regex. – SLaks Jun 09 '10 at 14:16
-1

This should work for links:

https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?

Source

This should work for email addresses:

[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}

Source
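
A rough sketch of how the two patterns above could be wired up in C#, assuming you want one `Regex` per pattern. The class name `PatternDemo` and the field names are hypothetical. The e-mail pattern only lists `A-Z`, so it needs `RegexOptions.IgnoreCase` to find lower-case addresses, and its `{2,4}` limit means longer top-level domains will not be matched in full.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container for the two patterns quoted above (patterns unchanged).
public static class PatternDemo
{
    public static readonly Regex UrlPattern = new Regex(
        @"https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?");

    // IgnoreCase is required here because the character classes are upper-case only.
    public static readonly Regex EmailPattern = new Regex(
        @"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}",
        RegexOptions.IgnoreCase);
}
```

You would then loop over `PatternDemo.EmailPattern.Matches(text)` and `PatternDemo.UrlPattern.Matches(text)` and read `Match.Value` from each result.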

npinti
  • -1: There are top level domains that "email regex" will fail to match (e.g. .museum TLD). And the domain should be lower case, so in fact it won't match any. Regex is the WRONG TOOL to find email addresses. – Richard Jun 09 '10 at 14:11
  • @Richard: Regexes are not the "wrong tool" to find emails. They are **exactly the right tool**. They are the **wrong** tool to **parse** and **validate**, but finding strings is THE purpose of a regex. – John Gietzen Jun 09 '10 at 14:16
  • @John: for any short regex there will be valid email addresses it fails to find. (E.g. with the one in the Q, many O'Reillys will be disappointed.) – Richard Jun 10 '10 at 10:53