
I am using the following regex to check whether a text contains a URL; however, it seems to miss some URLs, such as:

  • youtube.be/8P0BxJO
  • youtube.com/watch?v=VrmlFL

It also misses some bit.ly links (but not all).

Match m = Regex.Match(nc[i].InnerText, 
   @"(http(s)?://)?([\w-]+\.)+[\w-]+(/\S\w[\w- ;,./?%&=]\S*)?");

if (m.Success)
{
    MessageBox.Show(nc[i].InnerText);
}

Any ideas on how to fix it?

Christian Klauser
user1213488

1 Answer


See this related question; the first answer should help you out. The suggestion there both finds links and replaces them, so just take the part you need. This and this article take different approaches that should get you more or less the same result.

Another (perhaps more reliable) non-regex approach would be to tokenize the string by splitting on spaces and punctuation, and then check each token to see whether it is a valid URI using Uri.IsWellFormedUriString (which only accepts well-formed URIs, as this question points out).
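A minimal sketch of that tokenizing idea, under a few assumptions of mine: the UrlFinder class and FindUrls method are hypothetical names, tokens are split on whitespace only (splitting on every punctuation character would cut URLs apart), surrounding punctuation is trimmed per token, and scheme-less tokens such as youtube.com/watch?v=VrmlFL get "http://" prepended before the check, since Uri.IsWellFormedUriString with UriKind.Absolute expects a scheme.

```csharp
using System;
using System.Collections.Generic;

static class UrlFinder
{
    // Hypothetical helper: returns the tokens of 'text' that pass the URI check.
    public static List<string> FindUrls(string text)
    {
        var found = new List<string>();

        // A null separator array makes Split break on any whitespace.
        string[] tokens = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        foreach (string raw in tokens)
        {
            // Strip punctuation that commonly surrounds a link in running text.
            string token = raw.Trim('.', ',', ';', ':', '!', '?', '(', ')', '"', '\'');

            // Cheap pre-filter: a host name needs at least one dot.
            if (token.Length == 0 || !token.Contains("."))
                continue;

            // Assumption: a token without a scheme is treated as an http link.
            string candidate = token.Contains("://") ? token : "http://" + token;

            if (Uri.IsWellFormedUriString(candidate, UriKind.Absolute))
                found.Add(token);
        }

        return found;
    }

    static void Main()
    {
        string sample = "Watch youtube.com/watch?v=VrmlFL and then https://bit.ly/abc123.";
        foreach (string url in FindUrls(sample))
            Console.WriteLine(url);
    }
}
```

Be aware that this over-matches: anything shaped like name.ext (for example a file name such as notes.txt) also passes once "http://" is prepended, so you may want an extra filter, such as a list of known top-level domains, depending on your input.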

Community
Tyson