2

How can I extract a valid URL from a string like this one?

h*tps://www.google.com/url?q=h*tp://www.site.net/file.doc&sa=U&ei=_YeOUc&ved=0CB&usg=AFQjCN-5OX

I want to extract this part, which is my valid URL: h*tp://www.site.net/file.doc

Ash Burlaczenko
Naourass Derouichi

4 Answers

5

Add a reference to the System.Web.dll assembly and use the static methods of the HttpUtility class. Example:

using System;
using System.Web;

class MainClass
{
    public static void Main(string[] args)
    {
        Uri uri = new Uri("https://www.google.com/url?q=http://www.site.net/file.doc&sa=U&ei=_YeOUc&ved=0CB&usg=AFQjCN-5OX");
        // Parse the query string and read the (URL-decoded) value of the "q" parameter.
        Uri doc = new Uri(HttpUtility.ParseQueryString(uri.Query).Get("q"));
        Console.WriteLine(doc);
    }
}
Denis
1

I don't know what your other strings can look like, but if your 'valid URL' is between the first = and the first &, you could use:

(?<==).*?(?=&)

It looks for the first = and matches everything after it, up to (but not including) the next &.

Tested here.
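
For reference, a minimal sketch of how this pattern could be applied in C# with System.Text.RegularExpressions. The class and method names (RegexExample, ExtractFirstValue) are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

class RegexExample
{
    // Hypothetical helper: returns the text between the first '=' and the
    // next '&', or null when the pattern does not match.
    public static string ExtractFirstValue(string input)
    {
        Match m = Regex.Match(input, "(?<==).*?(?=&)");
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        string input = "https://www.google.com/url?q=http://www.site.net/file.doc&sa=U&ei=_YeOUc&ved=0CB&usg=AFQjCN-5OX";
        Console.WriteLine(ExtractFirstValue(input)); // http://www.site.net/file.doc
    }
}
```

Note that the lookahead requires a following &, so this sketch returns null if the value is the last parameter in the string.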

Jerry
1

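You can use the Split method:

    string txt = "https://www.google.com/url?q=http://www.site.net/file.doc&sa=U&ei=_YeOUc&ved=0CB&usg=AFQjCN-5OX";

    // Take everything after "?q=" and before the next '&'.
    string url = txt.Split(new[] { "?q=" }, StringSplitOptions.None)[1].Split('&')[0];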
Virus
0

In this particular case, with the string you posted, you can do this:

string input = "your URL";
// The wanted part starts at index 29 and is 28 characters long.
string newString = input.Substring(29, 28);

But if the length of the initial part of the URL changes, or the length of the part you want to extract changes, this will not work.
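
One way to avoid hard-coded positions is to compute them with IndexOf. The following is only a sketch, assuming the target always follows "?q=" and ends at the next & (or at the end of the string); the names IndexOfExample and ExtractTarget are hypothetical.

```csharp
using System;

class IndexOfExample
{
    // Hypothetical helper: returns the value that follows "?q=" up to the
    // next '&' (or the end of the string), or null if "?q=" is absent.
    public static string ExtractTarget(string input)
    {
        const string marker = "?q=";
        int start = input.IndexOf(marker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null;
        }
        start += marker.Length;

        int end = input.IndexOf('&', start);
        if (end < 0)
        {
            end = input.Length;
        }
        return input.Substring(start, end - start);
    }

    static void Main()
    {
        string input = "https://www.google.com/url?q=http://www.site.net/file.doc&sa=U&ei=_YeOUc&ved=0CB&usg=AFQjCN-5OX";
        Console.WriteLine(ExtractTarget(input)); // http://www.site.net/file.doc
    }
}
```

For the string in the question this computes a start index of 29 and a length of 28, without either number appearing in the code.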

FeliceM