74

I want to get the size of an http:/.../file before I download it. The file can be a webpage, image, or a media file. Can this be done with HTTP headers? How do I download just the file HTTP header?

Dan Beaulieu

5 Answers

103

Yes, assuming the HTTP server you're talking to supports/allows this:

public long GetFileSize(string url)
{
    long result = -1;

    System.Net.WebRequest req = System.Net.WebRequest.Create(url);
    req.Method = "HEAD";
    using (System.Net.WebResponse resp = req.GetResponse())
    {
        if (long.TryParse(resp.Headers.Get("Content-Length"), out long ContentLength))
        {
            result = ContentLength;
        }
    }

    return result;
}

If the server does not allow the HEAD method, or its reply omits the Content-Length header, the only way to determine the size of the content on the server is to download it. Since that is not a particularly reliable fallback, most servers do include this information.
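The same HEAD approach can also be sketched with System.Net.Http.HttpClient. This is a minimal sketch, not a definitive implementation: the class name `RemoteFileSize` and the method `TryGetSizeAsync` are hypothetical, and it assumes the server reports Content-Length on HEAD responses. It returns null instead of -1 when the size is unavailable.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RemoteFileSize
{
    // Hypothetical helper: returns the advertised size in bytes, or null when
    // the server rejects the HEAD request or sends no Content-Length header.
    public static async Task<long?> TryGetSizeAsync(HttpClient client, string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, url))
        using (var response = await client.SendAsync(request))
        {
            if (!response.IsSuccessStatusCode)
                return null; // e.g. 403 or 405 when HEAD is not allowed

            // Null when the Content-Length header is absent
            return response.Content.Headers.ContentLength;
        }
    }
}
```

A caller would typically reuse one HttpClient instance: `long? size = await RemoteFileSize.TryGetSizeAsync(client, "http://example.com/");`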

Fidel
mdb
  • 12
    If you use `using` it automatically disposes it. http://msdn.microsoft.com/en-us/library/yh598w02(v=vs.110).aspx – justderb Apr 16 '13 at 20:12
  • 3
    Another note, if you are using this for extremely large files `int` is not enough, you'll need to use `long ContentLength;` and `long.TryParse(xxx)` to support more than a 2.14GB size return value. – Preston Oct 16 '15 at 04:27
  • Won't http compression being enabled throw off the actual file size? – Justin Jul 17 '16 at 23:06
  • I used this method to get the size of this link: `http://ipv4.download.thinkbroadband.com/200MB.zip` but got a 403 error! Why? – Behzad Apr 21 '20 at 11:31
31

Can this be done with HTTP headers?

Yes, this is the way to go. If the server provides the size, it is in the Content-Length header. Note, however, that servers are not required to send it.

Downloading only the header can be done using a HEAD request instead of GET. Maybe the following code helps:

HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://example.com/");
req.Method = "HEAD";
long len;
using(HttpWebResponse resp = (HttpWebResponse)(req.GetResponse()))
{
    len = resp.ContentLength;
}

Notice the property for the content length on the HttpWebResponse object – no need to parse the Content-Length header manually.

IAmJersh
Konrad Rudolph
  • Won't `resp.ContentLength` above give you the length of the HEAD response, and not the length of the file you were interested in getting the sizeof ? – Adam Nofsinger Apr 19 '11 at 12:47
  • 1
    @Adam No. The documentation says: “The ContentLength property contains the value of the Content-Length header returned with the response.” – Konrad Rudolph Apr 19 '11 at 12:57
  • Make sure you call resp.Close() or else you can encounter timeout errors when making multiple requests at a time (my third request was timing out in a foreach loop which was solved by closing each response) – Eric Smith Mar 25 '13 at 20:56
  • 3
    @Eric In fact you should use a `Using` block here, or implement the disposable pattern to manage the lifetime of the resource explicitly. Manually calling `Close` is not enough unless you ensure that it *always* happens, even in the case of error. – Konrad Rudolph Mar 25 '13 at 20:59
  • @KonradRudolph You're absolutely right. Calling Close() fixed my bug while I was testing this, but a using block is the correct way to do it. Derp. – Eric Smith Mar 26 '13 at 00:59
  • @KonradRudolph, FYI, `ContentLength` returns a `long`. Not a big deal but just in case you want to fix it. – gunr2171 May 02 '13 at 00:14
4

Note that not every server accepts HTTP HEAD requests. One alternative approach to get the file size is to make an HTTP GET call to the server requesting only a portion of the file to keep the response small and retrieve the file size from the metadata that is returned as part of the response content header.

The standard System.Net.Http.HttpClient can be used to accomplish this. The partial content is requested by setting a byte range on the request message header as:

    request.Headers.Range = new RangeHeaderValue(startByte, endByte);

The server responds with a message containing the requested range as well as the entire file size. This information is returned in the response content headers (response.Content.Headers) under the key "Content-Range".

Here's an example of the content range in the response message content header:

    {
       "Key": "Content-Range",
       "Value": [
         "bytes 0-15/2328372"
       ]
    }

In this example the header value implies the response contains bytes 0 to 15 (i.e., 16 bytes total) and the file is 2,328,372 bytes in its entirety.
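As an illustration of how that value breaks down, the typed `ContentRangeHeaderValue` class in System.Net.Http.Headers can parse it without a regular expression. This is a small sketch; the class name `ContentRangeDemo` and the method `TotalLength` are hypothetical names chosen for the example.

```csharp
using System;
using System.Net.Http.Headers;

public static class ContentRangeDemo
{
    // Returns the total size after the slash, or null if the value cannot be
    // parsed or the server reported the total as unknown ("*").
    public static long? TotalLength(string headerValue)
    {
        ContentRangeHeaderValue range;
        if (!ContentRangeHeaderValue.TryParse(headerValue, out range))
            return null;

        return range.Length;
    }

    public static void Main()
    {
        Console.WriteLine(TotalLength("bytes 0-15/2328372")); // prints 2328372
    }
}
```

On a real response, the same typed value should be available as `response.Content.Headers.ContentRange`, which is null when the header is missing.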

Here's a sample implementation of this method:

using System.Linq;            // for Single()
using System.Threading.Tasks; // for Task<long>

public static class HttpClientExtensions
{
    public static async Task<long> GetContentSizeAsync(this System.Net.Http.HttpClient client, string url)
    {
        using (var request = new System.Net.Http.HttpRequestMessage(System.Net.Http.HttpMethod.Get, url))
        {
            // To keep the response as small as possible, request the byte range [0,0] (i.e., only the first byte)
            request.Headers.Range = new System.Net.Http.Headers.RangeHeaderValue(from: 0, to: 0);

            // ResponseHeadersRead returns as soon as the headers arrive, so the whole file is not
            // buffered if the server ignores the Range header and answers 200 with the full body
            using (var response = await client.SendAsync(request, System.Net.Http.HttpCompletionOption.ResponseHeadersRead))
            {
                response.EnsureSuccessStatusCode();

                if (response.StatusCode != System.Net.HttpStatusCode.PartialContent)
                    throw new System.Net.WebException($"expected partial content response ({System.Net.HttpStatusCode.PartialContent}), instead received: {response.StatusCode}");

                // The header looks like "bytes 0-0/2328372"; the total size is the number after the slash
                var contentRange = response.Content.Headers.GetValues(@"Content-Range").Single();
                var lengthString = System.Text.RegularExpressions.Regex.Match(contentRange, @"(?<=^bytes\s[0-9]+\-[0-9]+/)[0-9]+$").Value;
                return long.Parse(lengthString);
            }
        }
    }
}
Daria
1
using (WebClient webClient = new WebClient())
using (webClient.OpenRead("http://stackoverflow.com/robots.txt")) // sends a GET; the response headers are available once the stream is open
{
    long totalSizeBytes = Convert.ToInt64(webClient.ResponseHeaders["Content-Length"]);
    Console.WriteLine(totalSizeBytes);
}
Umut D.
  • 2
    This is a great solution, especially if you're already using WebClient to download the file and just want to add checking the file length first. – ScottFoster1000 Oct 22 '18 at 19:34
0
    HttpClient client = new HttpClient(
        new HttpClientHandler() {
            Proxy = null, UseProxy = false
        } // removes the delay in getting a response from the server, if you do not use a proxy
    );

    public async Task<long?> GetContentSizeAsync(string url) {
        // ResponseHeadersRead completes once the headers arrive, so the body is not buffered before the call returns
        using (HttpResponseMessage response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
            return response.Content.Headers.ContentLength;
    }
Ilya