I'm trying to download a portion of a file in C# using an HttpWebRequest, and it works, but only to a degree. My method works fine with text-based files (e.g. .txt, .php, .html), but not with binary files such as .jpg or .png. That is a problem, because the download should work regardless of file type: the code only downloads the file and never opens it, so the type should be irrelevant.

For example, when I used the method on a .jpg, the result had extra data at the beginning of the file (possibly the HTTP response header?) and was roughly 200 KB larger than the actual file.

I'm using the following method to download the files. I've set the URL to the correct one (yes, I have octuple-checked it), and I've set threads to 1, so a single sector covers the entire file. This works for text-based files but not for other file types:

    public static string DownloadSector(string fileurl, int sector)
    {
        string result = string.Empty;
        HttpWebRequest request;
        request = WebRequest.Create(fileurl) as HttpWebRequest;

        // request only the byte range that belongs to this sector
        request.AddRange(sectorSize*sector, ((sector + 1) * sectorSize) - 1);

        Console.WriteLine("Range: " + (sectorSize * sector) + " - " + (((sector + 1) * sectorSize) - 1));

        // the following code is alternative, you may implement the function after your needs
        using (WebResponse response = request.GetResponse())
        {
            Console.WriteLine("Content length:\t" + response.ContentLength);
            Console.WriteLine("Content type:\t" + response.ContentType);
            using (StreamReader sr = new StreamReader(response.GetResponseStream()))
            {
                result = sr.ReadToEnd();
            }
        }
        return result;
    }

So, any idea what the problem is and how to fix this?

Joseph Caruso
  • Check what the response says. – Blorgbeard Aug 25 '16 at 23:12
  • What is "sector?" – Robert Harvey Aug 25 '16 at 23:12
  • @RobertHarvey "sector" is just the portion of the file to download. So if there's a 10mb file, and you're using 2 threads, each thread downloads a sector where sectorSize = filesize/threads, aka 2 sectors of sectorSize 5mb – Joseph Caruso Aug 25 '16 at 23:14
  • Hmm... It would be interesting to see if that yields any tangible speed improvements. I doubt it. – Robert Harvey Aug 25 '16 at 23:17
  • @RobertHarvey back in the old days, I used FlashGet download manager on my 56k modem connection. It did multiplexed http downloads, and did indeed make a significant difference in speed. I was able to max out my download bandwidth using multiple download threads to the same server, where it wouldn't get close to that with a standard browser download. I don't know how much difference it would make nowadays. – Blorgbeard Aug 26 '16 at 04:02
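For readers following the "sector" discussion in the comments above, here is a minimal sketch of how the inclusive byte range for each sector could be computed when a file is split across several threads. The class name `SectorMath`, the method `GetRange`, and the choice to let the last sector absorb any remainder are assumptions made for illustration; the question only states that sectorSize = filesize/threads.

```csharp
using System;

public static class SectorMath
{
    // Returns the inclusive byte range [From, To] for a zero-based sector index.
    // Assumption: the last sector absorbs the remainder when fileSize is not
    // evenly divisible by the number of sectors.
    public static (long From, long To) GetRange(long fileSize, int sectors, int index)
    {
        if (fileSize <= 0) throw new ArgumentOutOfRangeException(nameof(fileSize));
        if (sectors <= 0) throw new ArgumentOutOfRangeException(nameof(sectors));
        if (index < 0 || index >= sectors) throw new ArgumentOutOfRangeException(nameof(index));

        long sectorSize = fileSize / sectors;
        long from = sectorSize * index;
        long to = (index == sectors - 1) ? fileSize - 1 : from + sectorSize - 1;
        return (from, to);
    }

    public static void Main()
    {
        // A 10,000,000-byte file split across 2 threads, as in the comment above.
        for (int i = 0; i < 2; i++)
        {
            var (from, to) = GetRange(10_000_000, 2, i);
            Console.WriteLine($"Sector {i}: bytes {from}-{to}");
        }
    }
}
```

HTTP byte ranges are inclusive at both ends, which is why the upper bound is one less than the start of the next sector, matching the `- 1` in the question's `AddRange` call.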

1 Answer

The HTTP body of an image response is binary data, but your code reads it through a StreamReader, which decodes the bytes as text and hands you a string. Arbitrary binary data does not survive that conversion intact.

In some setups (depending on your content negotiation, i.e. your Accept and Accept-Encoding headers) the body may instead arrive as a Base64 string.

If that is the case here, change this

return result;

to this

return Convert.FromBase64String(result);

(and change your return type to byte[]).

If that doesn't work, visually inspect your request and response headers and check for compression such as gzip or deflate... see also this answer.
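As a rough sketch of the compression check suggested above: `HttpWebRequest` has an `AutomaticDecompression` property, and `HttpWebResponse` exposes the `Content-Encoding` value through `ContentEncoding`. The class and method names below (`CompressionCheck`, `CreateRequest`, `DescribeResponse`) and the example URL are hypothetical, and whether the server compresses this particular file at all is an assumption you would need to verify against the real headers.

```csharp
using System;
using System.Net;

public static class CompressionCheck
{
    // Builds a request that asks the framework to undo gzip/deflate
    // transparently, so the response stream yields the decoded bytes.
    public static HttpWebRequest CreateRequest(string fileUrl)
    {
        var request = (HttpWebRequest)WebRequest.Create(fileUrl);
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Formats the headers worth inspecting when the downloaded size looks wrong.
    public static string DescribeResponse(HttpWebResponse response)
    {
        return "Status: " + (int)response.StatusCode
             + ", Content-Type: " + response.ContentType
             + ", Content-Encoding: " + response.ContentEncoding
             + ", Content-Length: " + response.ContentLength;
    }

    public static void Main()
    {
        // Creating the request does not contact the server; only GetResponse() does.
        HttpWebRequest request = CreateRequest("http://example.com/picture.jpg");
        Console.WriteLine(request.AutomaticDecompression);
    }
}
```

To inspect a live response, you would call `DescribeResponse((HttpWebResponse)request.GetResponse())` inside a `using` block for the response.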

John Wu
  • Well, it threw an exception complaining it contained a non base-64 character. – Joseph Caruso Aug 25 '16 at 23:18
  • Check for a `content-encoding` header. Also, can you paste a snip of the response so we can all see it? – John Wu Aug 25 '16 at 23:20
  • There's not really any way to post a "snip" of the response seeing as it's a long bunch of nonsense that makes the image, but here's the first bit of it and part of the file itself: ���� JFIF �� ;CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 95 �� – Joseph Caruso Aug 25 '16 at 23:25
  • Here's the same data using a hex-editor: ef bf bd ef bf bd ef bf bd ef bf bd 00 10 4a 46 49 46 00 01 01 00 00 01 00 01 00 00 ef bf bd ef bf bd 00 3b 43 52 45 41 54 4f 52 3a 20 67 64 2d 6a 70 65 67 20 76 31 2e 30 20 28 75 73 69 6e 67 20 49 4a 47 20 4a 50 45 47 20 76 36 32 29 2c 20 71 75 61 6c 69 74 79 20 3d 20 39 35 0a ef Nothing really sensible even in the text portion, nor when I saved the response stream as a text file. – Joseph Caruso Aug 25 '16 at 23:26
  • If it's coming back as binary then you will need to use a `BinaryReader`. Check out [this](http://stackoverflow.com/questions/2368115/how-to-use-httpwebrequest-to-pull-image-from-website-to-local-file) article for sample code. – John Wu Aug 25 '16 at 23:30
  • @JohnWu Yep, took some modification to the function to support the information found on the BinaryReader for downloading it, but it works perfectly now! Thank you very much! :) – Joseph Caruso Aug 25 '16 at 23:51
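The asker did not post the final code, so the following is only a sketch of what a byte-oriented version of the download might look like, not their actual fix. It copies the response stream straight into a byte array rather than using the `BinaryReader` mentioned in the comments, which should be equivalent for this purpose. The names `BinaryDownload` and `ReadAllBytes`, and passing `sectorSize` as a parameter, are assumptions. `Main` runs offline and feeds the first bytes of a JPEG/JFIF header through both paths, which should reproduce the `ef bf bd` pattern from the hex dump in the comments: each byte that is not valid UTF-8 is replaced by U+FFFD, which encodes back to three bytes. That also explains why the saved file was larger than the original.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class BinaryDownload
{
    // Copies a stream into a byte array without any text decoding.
    public static byte[] ReadAllBytes(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    // Hypothetical byte[] variant of the question's DownloadSector.
    // sectorSize is passed in explicitly here; in the question it is a field.
    public static byte[] DownloadSector(string fileUrl, int sector, long sectorSize)
    {
        var request = (HttpWebRequest)WebRequest.Create(fileUrl);
        request.AddRange(sectorSize * sector, (sector + 1) * sectorSize - 1);

        using (WebResponse response = request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            return ReadAllBytes(body);
        }
    }

    public static void Main()
    {
        // First bytes of a JPEG/JFIF file: FF D8 FF E0 00 10 'J' 'F' 'I' 'F'.
        byte[] jpegStart = { 0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x4A, 0x46, 0x49, 0x46 };

        // Text path, as in the question: decode as text, then encode again on save.
        string asText;
        using (var reader = new StreamReader(new MemoryStream(jpegStart)))
        {
            asText = reader.ReadToEnd();
        }
        byte[] corrupted = Encoding.UTF8.GetBytes(asText);
        Console.WriteLine("Via StreamReader: " + BitConverter.ToString(corrupted));

        // Binary path: the bytes come back unchanged.
        byte[] intact = ReadAllBytes(new MemoryStream(jpegStart));
        Console.WriteLine("Via byte copy:    " + BitConverter.ToString(intact));
    }
}
```

The returned array can be written to disk with `File.WriteAllBytes`, or written at the sector's offset in a shared `FileStream` when several sectors are downloaded in parallel.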