
I am reading files into an array; here is the relevant code; a new DiskReader is created for each file and path is determined using OpenFileDialog.

    class DiskReader
    {
        // from variables section:
        long MAX_STREAM_SIZE = 300 * 1024 * 1024; // 300 MB

        FileStream fs;
        public Byte[] fileData;

...

    // Get file size, check it is within the allowed size (MAX_STREAM_SIZE), then start the process, including the progress bar.

    using (fs = File.OpenRead(path))
    {
        if (fs.Length < MAX_STREAM_SIZE)
        {
            long NumBytes = (fs.Length < MAX_STREAM_SIZE ? fs.Length : MAX_STREAM_SIZE);
            updateValues[0] = (NumBytes / 1024 / 1024).ToString("#,###.0");
            result = LoadData(NumBytes);
        }
        else
        {
            // Need for something to handle big files
        }

        if (result)
        {
            mainForm.ShowProgress(true);
            bw.RunWorkerAsync();
        }
    }

...

    bool LoadData(long NumBytes)
    {
        try
        {
            fileData = new Byte[NumBytes];
            fs.Read(fileData, 0, fileData.Length); 
            return true;
        }
        catch (Exception e)
        {
            return false;
        }
    }

The first time I run this, it works fine. The second time I run it, sometimes it works fine, but most times it throws a System.OutOfMemoryException at

[Edit:

"first time I run this" was a bad choice of words, I meant when I start the programme and open a file is fine, I get the problem when I try to open a different file without exiting the programme. When I open the second file, I am setting the DiskReader to a new instance which means the fileData array is also a new instance. I hope that makes it clearer.]

  fileData = new Byte[NumBytes]; 

There is no obvious pattern to it running and throwing an exception.

I don't think it's relevant, but although the maximum file size is set to 300 MB, the files I am using to test this are between 49 and 64 MB.

Any suggestions on what is going wrong here and how I can correct it?

mharran
  • What do you mean by "first time I run this"? Is it a separate program and is it fully closed and completely restarted before you call again this code? Because the exception clearly states that your memory is full. You probably keep old arrays in memory. It doesn't look like related to file streams and disk. – Al Kepp Oct 13 '17 at 13:29
  • How many files do you open, or how many program instances do you launch at the same time? System.OutOfMemoryException really means that you are out of memory. There is no problem in this part of the code. Post what you left out. – Stas Petrov Oct 13 '17 at 13:31
  • It is possible that if you create & delete large arrays in a short space of time they are not being garbage collected quickly enough - forcing garbage collection can solve this. Here are a few links giving different views on this: https://social.msdn.microsoft.com/Forums/vstudio/en-US/52a7eb17-ac05-470c-b063-a78427cd4406/gccollect-solves-outofmemoryexception?forum=clr https://stackoverflow.com/questions/14724107/why-does-the-c-sharp-garbage-collector-not-keep-trying-to-free-memory-until-a-re https://stackoverflow.com/questions/4688348/call-gc-collect-before-throwing-outofmemoryexception – PaulF Oct 13 '17 at 13:46
  • If it hurts when you do that then **don't do that**. It is a terrible practice to read large files into an array. Stream large files. – Eric Lippert Oct 13 '17 at 13:50
  • If you're having memory problems then **use a memory profiler to find the cause of the problem** rather than guessing. – Eric Lippert Oct 13 '17 at 13:52
  • Also, I don't understand the code; why is there a conditional operator? It always assigns the file length. – Eric Lippert Oct 13 '17 at 13:53
  • @Al Kepp and Stas Petrov. If you mean physical memory then it is not - my PC has 24 GB RAM with less than 40% usage when this and other programs are running. It is possibly a contiguous memory problem as explained in this article by Eric Lippert - https://blogs.msdn.microsoft.com/ericlippert/2009/06/08/out-of-memory-does-not-refer-to-physical-memory/ – mharran Oct 13 '17 at 14:40
  • @Eric Lippert: The reason for using an array is that this is a forensics application to find JPEGs in a disk image based on their structure rather than the SOI marker, which may be missing. That involves considerable hopping backwards and forwards in the file, and I thought an array would give me better performance than a stream; perhaps not (see the memory-mapped file sketch after these comments). The conditional operator code isn't complete; basically, if a file is larger than a specified size I want to examine it by loading sections in and out of memory. I hope that makes sense. – mharran Oct 13 '17 at 14:42
  • @PaulF That was my own first thought so I tried putting GC.Collect() before the creation of the new array but it doesn't solve the problem. – mharran Oct 13 '17 at 14:50
  • @Al Kepp "first time I run this" was a bad choice of words, I meant when I start the programme and open a file is fine, I get the problem when I try to open a different file without exiting the programme. When I open the second file, I am setting the DiskReader to a new instance which means the DataFile array is also a new instance. I hope that makes it clearer. – mharran Oct 13 '17 at 14:59
  • I think you are not understanding my point. If you write `if (a < b) { x = a < b ? c : d; }` then `x` will never be `d`, so why not simply write `if (a < b) { x = c; }`? When I see code where portions of it are impossible to execute, I think that someone has made a mistake somewhere, and maybe that's where the problem lies. – Eric Lippert Oct 20 '17 at 18:44
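Since the comments mention hopping backwards and forwards through large files, here is a minimal sketch of one alternative the thread does not spell out: a memory-mapped file (System.IO.MemoryMappedFiles, available since .NET 4.0), which lets the OS page sections in and out on demand instead of holding the whole file in one managed array. This is only an illustration, not code from the question; `path` and the offsets are placeholders.

    using System.IO.MemoryMappedFiles;

    // Sketch: random access over a large file without one big allocation.
    using (var mmf = MemoryMappedFile.CreateFromFile(path))
    using (var accessor = mmf.CreateViewAccessor())
    {
        // Jump anywhere in the file; the OS pages data in as needed.
        byte b = accessor.ReadByte(123456L);

        // Read a small window, e.g. to test for a JPEG SOI marker (0xFF 0xD8).
        var window = new byte[2];
        accessor.ReadArray(987654L, window, 0, 2);
    }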

3 Answers


If the exception is being thrown at that line only, then my guess is that you've got a problem somewhere else in your code, as the comments suggest. Reading the documentation of that exception here, I'd bet you call this function one too many times somewhere and simply go over the limit on object length in memory, since there don't seem to be any problem spots in the code that you posted.
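For illustration, a minimal sketch of the kind of leak this answer is guessing at; the `allReaders` list and the `ReadFile` method are hypothetical, not from the question:

    using System.Collections.Generic;

    // Hypothetical: if every DiskReader stays reachable (a list, a field,
    // an event handler), each 50-60 MB fileData array stays alive too, and
    // eventually no contiguous block is left for the next allocation.
    List<DiskReader> allReaders = new List<DiskReader>();

    void OpenFile(string path)
    {
        var reader = new DiskReader();
        reader.ReadFile(path);   // allocates fileData = new Byte[NumBytes]
        allReaders.Add(reader);  // old readers (and their arrays) never die
    }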

Jahziel Villasana

The fs.Length property requires the whole stream to be evaluated, which means reading the file anyway. Try something like:

    byte[] result;
    if (new FileInfo(path).Length < MAX_STREAM_SIZE)
    {
        result = File.ReadAllBytes(path);
    }

Also, depending on your needs, you might avoid the byte array altogether and read the data directly from the file stream. This should have a much lower memory footprint.
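For example, a minimal sketch of scanning the file through a small reusable window instead of one large array (the window size and `sectionOffset` are just illustrative):

    using System.IO;

    long sectionOffset = 0;                       // wherever the scan left off
    using (var fs = File.OpenRead(path))
    {
        var window = new byte[64 * 1024];         // small reusable buffer
        fs.Seek(sectionOffset, SeekOrigin.Begin); // jump to the region of interest
        int bytesRead = fs.Read(window, 0, window.Length);
        // examine window[0..bytesRead) here, then advance sectionOffset and repeat
    }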

Jacek Glen
  • Jacek, see my comment to Eric about ultimately needing to load sections of very large files into and out of memory. – mharran Oct 13 '17 at 14:52

If I understand correctly what you want to do, I have this proposal: the best option is to allocate one static array of the defined MAX size at the beginning, then keep that array and only fill it with new data from each subsequent file. This way your memory should be absolutely fine. You just need to store the file size in a separate variable, because the array will always have the same MAX size.

This is a common approach in systems with automatic memory management: allocating a constant amount of memory at the start and never allocating anything during the computation makes the program faster, because the garbage collector does not have to run as many times.
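A minimal sketch of that idea, reusing names from the question where possible (the `ReadFile` method is illustrative):

    using System.IO;

    class DiskReader
    {
        const int MAX_STREAM_SIZE = 300 * 1024 * 1024;              // 300 MB
        static readonly byte[] buffer = new byte[MAX_STREAM_SIZE];  // allocated once, reused
        long fileSize;                                              // valid bytes in buffer

        public void ReadFile(string path)
        {
            using (var fs = File.OpenRead(path))
            {
                fileSize = fs.Length;            // real length; array stays MAX size
                int offset = 0;
                while (offset < fileSize)        // Read may return fewer bytes than asked
                {
                    int n = fs.Read(buffer, offset, (int)(fileSize - offset));
                    if (n == 0) break;
                    offset += n;
                }
            }
        }
    }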

Al Kepp