
I tried to read a 2+ GB file in two ways. The first:

var file = File.ReadAllBytes(filepath);

throws an exception saying the file is over 2 GB.

The second:

var file = ReadAllBytes(filepath);

public byte[] ReadAllBytes(string fileName)
{
    byte[] buffer = null;

    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
        // throws here for files over 2 GB: a .NET array cannot normally exceed ~2^31 elements
        buffer = new byte[fs.Length];
        // note: (int)fs.Length silently overflows for files larger than int.MaxValue bytes
        fs.Read(buffer, 0, (int)fs.Length);
    }

    return buffer;
}

Exception: "Array dimensions exceeded supported range." My goal is to send the file in the body of an HTTP request (using the WebClient class).

Any example of how to read large files?

Thanks

Aa Yy
  • Why are you trying to put a 2 GB file into memory? You are already using `FileStream`, which you can use to stream chunks instead of the full file. – Vidmantas Blazevicius Mar 13 '18 at 12:35
  • It depends on the intent. What are you going to do with the file afterwards? The general idea is to read a small buffer, process it, and read the next chunk. – Anton Sizikov Mar 13 '18 at 12:36
  • @VidmantasBlazevicius can you post an example? @AntonSizikov I need to post the file in a request body, using WebClient. – Aa Yy Mar 13 '18 at 12:37
  • @AaYy Here is another question about how to read a file in chunks rather than in full, so I'd just be reposting the same code: https://stackoverflow.com/questions/6865890/how-can-i-read-stream-a-file-without-loading-the-entire-file-into-memory On another note, if you are going to be posting a 2 GB file in a request body then you might have bigger problems here. – Vidmantas Blazevicius Mar 13 '18 at 12:39
  • Possible duplicate of: https://stackoverflow.com/a/26954016/84206 – AaronLS Mar 13 '18 at 12:39
  • No sane HTTP server is going to accept a >2 GB body in an HTTP request. Even those that do will likely hit request timeout issues. Unless you already know your server is set up for this extraordinary event, your approach itself may be flawed. – Jeroen Mostert Mar 13 '18 at 12:39
  • @AaYy Requests can be streamed. Stream in the file and stream out the request. Almost all solutions that handle large files do it in some way similar to that. Only load in what you need, process portions, and then move through the file in chunks. – AaronLS Mar 13 '18 at 12:41
  • If you _really_ need a 2GB array (which you don't here) then check out https://learn.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/gcallowverylargeobjects-element . – mjwills Mar 13 '18 at 12:44
  • Just do `fs.CopyTo(httpRequestStreamHere)`, no need to read it all to memory. Or even better, since you are using WebClient: `webClient.UploadFile("url", fileName)` – Evk Mar 13 '18 at 13:02

2 Answers

You can try this:

public void ProcessLargeFile(string fileName)
{
    int bufferSize = 100 * 1024 * 1024; // 100MB
    byte[] buffer = new byte[bufferSize];
    int bytesRead = 0;

    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
        while ((bytesRead = fs.Read(buffer, 0, bufferSize)) > 0)
        {
            if (bytesRead < bufferSize)
            {
                // note: only the first 'bytesRead' bytes of 'buffer' are valid for this chunk
            }

            // 'buffer' now holds the current portion of the file; process it here
        }
    }
}

This lets you process the file in 100 MB portions; change the buffer size to whatever you need.
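
Since the goal here is to send the file as the body of an HTTP request, you do not actually have to buffer it at all: as Evk and AaronLS suggest in the comments, you can copy the stream straight into the request. A minimal sketch, assuming `using System.IO;` and `using System.Net;` (the URL, the POST method, and any headers are placeholders, and your server must actually accept bodies this large):

public void UploadLargeFile(string fileName, string url)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "POST";
    request.AllowWriteStreamBuffering = false; // don't buffer the whole body in memory
    request.SendChunked = true;                // stream without a precomputed Content-Length

    using (FileStream fs = File.OpenRead(fileName))
    using (Stream requestStream = request.GetRequestStream())
    {
        fs.CopyTo(requestStream); // copies in small internal chunks, never the whole file at once
    }

    using (var response = (HttpWebResponse)request.GetResponse())
    {
        // inspect the server's response here
    }
}

Or, staying with WebClient, `webClient.UploadFile(url, fileName)` does this in one call, as Evk points out in the comments above.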

Alexey Klipilin

You are running into a rather old limit: the 2 GiB limit on user-mode virtual address space. You can lift it somewhat with the right compiler/manifest switch and/or by running the same program as x64. You can read about it in detail here: https://msdn.microsoft.com/en-us/library/windows/desktop/aa366778.aspx?f=255&MSPPError=-2147217396#memory_limits
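
Note that even as x64, .NET will refuse single arrays over 2 GiB unless you also opt in via gcAllowVeryLargeObjects (see mjwills' comment below). A minimal app.config sketch for a .NET Framework app:

<configuration>
  <runtime>
    <!-- lifts the 2 GB total-size cap on arrays in 64-bit processes;
         per-dimension element-count limits still apply -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>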

At certain sizes, files or query result sets cannot be fully loaded into memory (due to the limit), or are better not fully loaded before processing starts (due to performance). For those cases, enumerators are helpful. Compare File.ReadLines with File.ReadAllLines: thanks to using an enumerator, ReadLines only needs to keep one line in memory at a time, the current one. Lines it has already yielded can be dropped, and lines still ahead may or may not have been read yet. At no point does it need the full file loaded into memory.
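
A quick sketch of the difference for a text file (ProcessLine is a hypothetical per-line handler, not a framework API):

foreach (string line in File.ReadLines(fileName))
{
    ProcessLine(line); // only the current line is held in memory
}

// by contrast, this materializes every line of the file as one array:
string[] allLines = File.ReadAllLines(fileName);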

Unfortunately, File.ReadAllBytes does not appear to have an enumerator variant. The separate BinaryReader class, however, does offer such a capability in the form of ReadBytes.
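
A minimal sketch of what such an enumerator could look like, built on BinaryReader.ReadBytes (ReadChunks and the 1 MB chunk size are my own choices, not a framework API; assumes `using System.IO;` and `using System.Collections.Generic;`):

public IEnumerable<byte[]> ReadChunks(string fileName, int chunkSize = 1024 * 1024)
{
    using (var reader = new BinaryReader(File.OpenRead(fileName)))
    {
        while (true)
        {
            // ReadBytes returns fewer bytes than requested near the end of the file,
            // and an empty array once the stream is exhausted
            byte[] chunk = reader.ReadBytes(chunkSize);
            if (chunk.Length == 0)
                yield break;

            yield return chunk;
        }
    }
}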

Christopher
  • 64-bit doesn't allow > 2GB arrays unless you have https://learn.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/gcallowverylargeobjects-element enabled. – mjwills Mar 13 '18 at 12:45
  • @mjwills: Thanks. I totally forgot that additional limitation. – Christopher Mar 13 '18 at 12:49