43

Using the WebClient class I can get the title of a website easily enough:

WebClient x = new WebClient();    
string source = x.DownloadString(s);
string title = Regex.Match(source, 
    @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
    RegexOptions.IgnoreCase).Groups["Title"].Value;

I want to store the URL and the page title. However when following a link such as:

http://tinyurl.com/dbysxp

I'm clearly going to want to get the Url I'm redirected to.

QUESTIONS

Is there a way to do this using the WebClient class?

How would I do it using HttpResponse and HttpRequest?

casperOne
Matthew Rathbone

8 Answers

72

If I understand the question, it's much easier than people are saying - if you want to let WebClient do all the nuts and bolts of the request (including the redirection), but then get the actual response URI at the end, you can subclass WebClient like this:

class MyWebClient : WebClient
{
    Uri _responseUri;

    public Uri ResponseUri
    {
        get { return _responseUri; }
    }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        _responseUri = response.ResponseUri;
        return response;
    }
}

Just use MyWebClient everywhere you would have used WebClient. After you've made whatever WebClient call you needed to do, then you can just use ResponseUri to get the actual redirected URI. You'd need to add a similar override for GetWebResponse(WebRequest request, IAsyncResult result) too, if you were using the async stuff.
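For the async case mentioned above, a minimal sketch of the same idea with both overrides might look like this. The class name RedirectTrackingWebClient is made up for illustration, and the null checks are only defensive; the GetWebResponse overloads themselves are the protected virtual methods WebClient exposes.

```csharp
using System;
using System.Net;

// Hypothetical name; same approach as MyWebClient, plus the async overload.
class RedirectTrackingWebClient : WebClient
{
    // URI of the last response received (after redirects); null before any request.
    public Uri ResponseUri { get; private set; }

    // Called for synchronous operations such as DownloadString.
    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        if (response != null)
            ResponseUri = response.ResponseUri;
        return response;
    }

    // Called for asynchronous operations such as DownloadStringAsync.
    protected override WebResponse GetWebResponse(WebRequest request, IAsyncResult result)
    {
        WebResponse response = base.GetWebResponse(request, result);
        if (response != null)
            ResponseUri = response.ResponseUri;
        return response;
    }
}
```

After something like client.DownloadString("http://tinyurl.com/dbysxp"), client.ResponseUri should then hold the address you ended up at.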

Will Dean
17

I know this is already an answered question, but this works pretty well for me:

 HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://tinyurl.com/dbysxp");
 request.AllowAutoRedirect = false;
 HttpWebResponse response = (HttpWebResponse)request.GetResponse();
 string redirUrl = response.Headers["Location"];
 response.Close();

 //Show the redirected url
 MessageBox.Show("You're being redirected to: "+redirUrl);

Cheers.! ;)

WhySoSerious
  • 2
    It's essential to finally call `response.Close()` (or to use a `using` statement). See https://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.getresponse(v=vs.110).aspx for details. Otherwise you may run out of connections or get a timeout when executing this code multiple times. – JimiLoe Apr 15 '15 at 11:36
  • Thanks for the addition, Jimi! – WhySoSerious May 29 '15 at 20:03
  • 1
    Isn't AllowAutoRedirect false by default? https://msdn.microsoft.com/en-us/library/system.web.services.protocols.httpwebclientprotocol.allowautoredirect%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396 – Mick Dec 08 '16 at 00:57
  • Hello Mick! The link you posted is for WebClient's AllowAutoRedirect. I searched for HttpWebRequest and found that AllowAutoRedirect is true by default! lol! I don't know why! But in this example I'm instantiating WebRequest not WebClient https://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.allowautoredirect(v=vs.110).aspx Cheers – WhySoSerious Dec 09 '16 at 01:14
  • 1
    This is a good idea, as long as the first redirect does not go to subsequent redirects - which is not unthinkable. – Menno van den Heuvel Sep 24 '20 at 10:08
6

With an HttpWebRequest, you would set the AllowAutoRedirect property to false. With that set, any response with a status code in the 300-399 range is returned to you instead of being followed automatically.

You can then get the new url from the response headers and then create a new HttpWebRequest instance to the new url.

With the WebClient class, I doubt you can change it out-of-the-box so that it does not allow redirects. What you could do is derive a class from the WebClient class and then override the GetWebRequest and the GetWebResponse methods to alter the WebRequest/WebResponse instances that the base implementation returns; if it is an HttpWebRequest, then set the AllowAutoRedirect property to false. On the response, if the status code is in the range of 300-399, then issue a new request.

However, I don't know that you can issue a new request from within the GetWebRequest/GetWebResponse methods, so it might be better to just have a loop that executes with HttpWebRequest/HttpWebResponse until all the redirects are followed.
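The loop described above could be sketched roughly like this. RedirectResolver, NextHop, Resolve and the maxHops limit are all names and choices made up for this example, and it assumes every hop is an http/https URL (otherwise the cast to HttpWebRequest fails) and that 3xx responses come back from GetResponse without throwing when AllowAutoRedirect is false.

```csharp
using System;
using System.Net;

static class RedirectResolver
{
    // Returns the redirect target for a 3xx response, or null if there is none.
    // new Uri(baseUri, string) also resolves a relative Location value.
    public static Uri NextHop(Uri current, int statusCode, string location)
    {
        bool isRedirect = statusCode >= 300 && statusCode <= 399;
        if (!isRedirect || string.IsNullOrEmpty(location))
            return null;
        return new Uri(current, location);
    }

    // Follows redirects one hop at a time and returns the last URI reached.
    // maxHops is an assumed safety limit against redirect loops.
    public static Uri Resolve(Uri start, int maxHops = 10)
    {
        Uri current = start;
        for (int hop = 0; hop < maxHops; hop++)
        {
            var request = (HttpWebRequest)WebRequest.Create(current);
            request.AllowAutoRedirect = false;

            // Dispose each response so its connection is released.
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                Uri next = NextHop(current, (int)response.StatusCode,
                                   response.Headers["Location"]);
                if (next == null)
                    return current;
                current = next;
            }
        }
        return current; // gave up after maxHops redirects
    }
}
```

Keeping the redirect decision in a separate method that does no I/O means it can be checked without touching the network; Resolve just wires it to HttpWebRequest.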

casperOne
3

I got the Uri for the redirected page and the page contents.

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(strUrl);
request.AllowAutoRedirect = true;

HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();

strLastRedirect = response.ResponseUri.ToString();

StreamReader reader = new StreamReader(dataStream);              
string strResponse = reader.ReadToEnd();

response.Close();
Christophe Debove
Stephan Unrau
2

In case you are only interested in the redirect URI you can use this code:

public static string GetRedirectUrl(string url)
{
     HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
     request.AllowAutoRedirect = false;  // return the 3xx response instead of following it

     // Dispose the response so the underlying connection is released.
     using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
     {
         return response.Headers["Location"];
     }
}

The method will return

  • null - in case of no redirect
  • the value of the Location header, which may be an absolute or a relative URL - in case of a redirect

Please note: The using statement (or a final response.Close()) is essential. See MSDN Library for details. Otherwise you may run out of connections or get a timeout when executing this code multiple times.

JimiLoe
0

HttpWebRequest.AllowAutoRedirect can be set to false. Then you'd have to manually handle HTTP status codes in the 300 range.

// Create a new HttpWebRequest Object to the mentioned URL.
HttpWebRequest myHttpWebRequest=(HttpWebRequest)WebRequest.Create("http://www.contoso.com");    
myHttpWebRequest.MaximumAutomaticRedirections=1;
myHttpWebRequest.AllowAutoRedirect=true;
HttpWebResponse myHttpWebResponse=(HttpWebResponse)myHttpWebRequest.GetResponse();  
Shea
-1

The WebClient class has an option to follow redirects. Set that option and you should be fine.

Albert
-1

Ok this is really hackish, but the key is to use the HttpWebRequest and then set the AllowAutoRedirect property to true.

Here's a VERY hacked together example

        HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://tinyurl.com/dbysxp");
        req.Method = "GET";
        req.AllowAutoRedirect = true;
        WebResponse response = req.GetResponse();

        Stream responseStream = response.GetResponseStream();

        // Content-Length header is not trustable, but makes a good hint.
        // Responses longer than int size will throw an exception here!
        int length = (int)response.ContentLength;

        const int bufSizeMax = 65536; // max read buffer size conserves memory
        const int bufSizeMin = 8192;  // min size prevents numerous small reads

        // Use Content-Length if between bufSizeMax and bufSizeMin
        int bufSize = bufSizeMin;
        if (length > bufSize)
            bufSize = length > bufSizeMax ? bufSizeMax : length;

        StringBuilder sb;
        // Allocate buffer and StringBuilder for reading response
        byte[] buf = new byte[bufSize];
        sb = new StringBuilder(bufSize);

        // Read response stream until end
        while ((length = responseStream.Read(buf, 0, buf.Length)) != 0)
            sb.Append(Encoding.UTF8.GetString(buf, 0, length));

        string source = sb.ToString();
        string title = Regex.Match(source,
            @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
            RegexOptions.IgnoreCase).Groups["Title"].Value;


danswain