8

Possible Duplicate:
regex for URL including query string

I have a text message such as:

Hey! try this http://www.test.com/test.aspx?id=53

We need to extract the link from the text. We are using the following code:

List<string> list = new List<string>();
Regex urlRx = new Regex(
    @"(?<url>(http:|https:[/][/]|www.)([a-z]|[A-Z]|[0-9]|[/.]|[~])*)",
    RegexOptions.IgnoreCase);

MatchCollection matches = urlRx.Matches(message);
foreach (Match match in matches)
{
   list.Add(match.Value);
}
return list;

It returns the URL, but not the complete one. The output of the code is:

http://www.test.com/test.aspx

But we need the complete URL:

http://www.test.com/test.aspx?id=53

Please suggest how to resolve this issue. Thanks in advance.

PrateekSaluja
  • Have a look at this [Stack Overflow](http://stackoverflow.com/questions/2343177/regex-for-url-including-query-string) question; I believe it will solve your problem. – Bibhu Feb 03 '12 at 07:24
  • Check out [this page](http://daringfireball.net/2010/07/improved_regex_for_matching_urls) for a complete regex for finding a URL hidden within regular text. If you need something simpler, I think it's commented well enough that you should be able to adapt it to your particular case. – Ken Wayne VanderLinde Feb 03 '12 at 07:22

3 Answers

17

Try this regex; it also returns the query string:

(http|ftp|https)://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?

You can test it on gskinner
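
As a rough check of the pattern above against the message from the question, here is a minimal sketch. The class and method names (`UrlExtractor`, `Extract`) are hypothetical and only serve the illustration; the pattern itself is copied unchanged from this answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class UrlExtractor
{
    // The pattern from the answer above, unchanged.
    private static readonly Regex UrlRx = new Regex(
        @"(http|ftp|https)://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?",
        RegexOptions.IgnoreCase);

    // Returns every substring of the message that the pattern matches.
    public static List<string> Extract(string message)
    {
        var urls = new List<string>();
        foreach (Match match in UrlRx.Matches(message))
        {
            urls.Add(match.Value);
        }
        return urls;
    }
}
```

Calling `UrlExtractor.Extract("Hey! try this http://www.test.com/test.aspx?id=53")` should return the full URL, query string included. Note that the last character class contains `.` and `,`, so punctuation that directly follows a URL in a sentence would be captured as well.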

Amar Palsapure
8
public List<string> GetLinks(string message)
{
    List<string> list = new List<string>();
    // The dot after "www" is escaped so it matches a literal '.' rather than any character.
    Regex urlRx = new Regex(@"((https?|ftp|file)\://|www\.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);

    MatchCollection matches = urlRx.Matches(message);
    foreach (Match match in matches)
    {
        list.Add(match.Value);
    }
    return list;
}

var list = GetLinks("Hey yo check this: http://www.google.com/?q=stackoverflow and this: http://www.mysite.com/?id=10&author=me");

It will find the following types of links:

http:// ...
https:// ...
file:// ...
www. ...
papaiatis
1

If you use these URLs later in your code (extracting a part, the query string, etc.), please consider using the Uri class combined with the HttpUtility helper.

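Uri uri;
string strUrl = "http://www.test.com/test.aspx?id=53";
// UriKind.Absolute: PathAndQuery throws InvalidOperationException on a relative Uri,
// and RelativeOrAbsolute accepts almost any string as a relative Uri.
bool isUri = Uri.TryCreate(strUrl, UriKind.Absolute, out uri);
if (isUri)
{
    Console.WriteLine(uri.PathAndQuery);
}
else
{
    Console.WriteLine("invalid");
}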

It could help you with these operations.
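
To show how the HttpUtility part might fit in, here is a minimal sketch that reads a single query-string value from a URL. The names `QueryReader` and `GetQueryValue` are hypothetical, and the sketch assumes the `System.Web` namespace is available (on .NET Framework this needs a reference to System.Web.dll).

```csharp
using System;
using System.Web;

// Hypothetical helper for illustration; not part of the original answer.
public static class QueryReader
{
    // Returns the value of the given query-string key,
    // or null if the URL is not absolute or the key is missing.
    public static string GetQueryValue(string url, string key)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
        {
            return null;
        }

        // uri.Query includes the leading '?'; ParseQueryString accepts it.
        return HttpUtility.ParseQueryString(uri.Query)[key];
    }
}
```

For the URL from the question, `QueryReader.GetQueryValue("http://www.test.com/test.aspx?id=53", "id")` should give `"53"`.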

Rafał Warzycha