I have a block of code which reads a file and appends the required data, as shown below:

var srBuilder = new StringBuilder();
using (var file = new StreamReader(document.FullSourcePath, Encoding.ASCII))
{
    while (!file.EndOfStream)
    {
        var bytess = new char[numBytes];
        // ReadBlock returns the number of chars actually read; the last chunk may be short.
        int charsRead = file.ReadBlock(bytess, 0, bytess.Length);
        srBuilder.Append(bytess, 0, charsRead);
    }

    document.Document += srBuilder.ToString();  // Exception occurs here
}

But when the file is larger than 200 MB, it throws an OutOfMemoryException.

My idea is to reset the StringBuilder's length to zero on each iteration, as below:

while (!file.EndOfStream)
{
    srBuilder.Length = 0;      // Here
    var bytess = new char[numBytes];
    int charsRead = file.ReadBlock(bytess, 0, bytess.Length);
    srBuilder.Append(bytess, 0, charsRead);
}

Is this the best solution, or is something else required?

Neel
  • Your code does not make sense to me. Can you explain in plain English what you want to happen? – nvoigt Mar 10 '16 at 10:49
  • It fetches the file from the path, reads until the end of the stream, and then appends everything at the end. – Neel Mar 10 '16 at 10:51
  • Why not use File.ReadAllLines...? – Thomas Ayoub Mar 10 '16 at 10:52
  • At 32 bits it is very difficult to find CONTIGUOUS space for a `string` of 400 MB (your 200 MB ASCII file becomes a 400 MB Unicode `string` in .NET, at 2 bytes per character). Try 64 bits (it should load without problems), or find a different algorithm that doesn't need to load a 200 MB file into memory. – xanatos Mar 10 '16 at 10:54
  • Is your question _"How can I have a ~200~ 400 MB string in memory"_? Then the counter-question is _"Why do you think you need that?"_. – CodeCaster Mar 10 '16 at 10:55
  • The file is loaded with important information, and this is not the case every time, @CodeCaster. – Neel Mar 10 '16 at 10:56
  • It's not the case every time, @xanatos; it happens once in a while. – Neel Mar 10 '16 at 10:57
  • @Neel Program for the worst case, hope for the best case. – xanatos Mar 10 '16 at 10:57
  • It's all about StringBuilder. Have a look here: http://stackoverflow.com/questions/1769447/interesting-outofmemoryexception-with-stringbuilder – Neel Mar 10 '16 at 11:02
  • **In general** you don't load a full file of unknown size into memory. You load it one row at a time: you parse and use that row, discard it, and go to the next row. – xanatos Mar 10 '16 at 11:02
  • Then nvoigt's answer makes sense, @xanatos? – Neel Mar 10 '16 at 11:03
  • Why do you think `srBuilder.Length = 0` will solve anything? This will throw away any data in the stringbuilder. Please explain why you think you need the entire file in memory at once. Hint: you probably don't. – CodeCaster Mar 10 '16 at 11:08
  • But it's clearly mentioned here http://stackoverflow.com/questions/1769447/interesting-outofmemoryexception-with-stringbuilder and here http://stackoverflow.com/questions/5192512/how-can-i-clear-or-empty-a-stringbuilder, @CodeCaster. And let us say storing it is one of the requirements; what then? – Neel Mar 10 '16 at 11:10
  • Setting the length to 0 is effectively clearing the stringbuilder's contents. You cannot do that, as you're then simply throwing away data. That's the problem with copy-pasting code you don't understand. If it is a "requirement" that the entire text file needs to be loaded in-memory at once (which is not a real requirement, but might be your interpretation of it), then you have a problem: you cannot reliably do this. Explain what you do with `document.Document` afterwards that makes you think you need this, then a real solution can be suggested. – CodeCaster Mar 10 '16 at 11:12
  • The thing is that "numBytes" in the code comes from a config file and is normally 10 MB at a time, @CodeCaster. – Neel Mar 10 '16 at 11:15
  • I think you don't understand what I'm trying to tell you. **You simply cannot read a file that large into memory at once** (at least not at 32 bit), **and you shouldn't want to**. It is entirely irrelevant to that problem that the read buffer size comes from a configuration file. See [Read Big TXT File, Out of Memory Exception](http://stackoverflow.com/questions/13415916/), [Out of memory exception reading and writing text file](http://stackoverflow.com/questions/23381474/) and so on. – CodeCaster Mar 10 '16 at 11:16
  • I am not reading the entire file, but chunks at a time, then appending and creating the string at the end, @CodeCaster. If you have any useful links, please share them. – Neel Mar 10 '16 at 11:20
  • That **is** reading the entire file into memory at once. By appending, you're letting a variable grow to the size of the file (even more so, since UTF-16 is used internally, and causing a lot of garbage in the meantime because strings are immutable). That you append this data in chunks is irrelevant, and does not solve the problem. – CodeCaster Mar 10 '16 at 11:21

2 Answers

I don't know why you made it that complicated. All you need is a single line:

document.Document += File.ReadAllText(document.FullSourcePath, Encoding.ASCII);

If this throws an exception, then yes, maybe you don't have enough memory.

nvoigt
  • I ran 50 rounds of testing and it threw the exception once. Is that a problem? I had some debuggers running in that case. – Neel Mar 15 '16 at 10:43

Your ultimate goal is to assign a 200 MB ASCII file to the document.Document member variable, which I assume is of type string.

Long story short: you cannot do this, and need to reconsider your approach.

One way or another, this requirement needs about 400 MB of contiguous memory, because .NET strings store UTF-16 code units at 2 bytes per character. That is nigh impossible to obtain in a 32-bit process, and it is exactly what the OutOfMemoryException thrown by the runtime is trying to tell you.
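
The arithmetic can be checked with a minimal sketch. The class and method names below are made up for illustration, and the result is only a lower bound: it ignores the extra copies that StringBuilder growth and `+=` concatenation create along the way.

```csharp
using System;
using System.IO;

static class StringSizeEstimate
{
    // Lower bound on the memory a .NET string needs to hold an ASCII file:
    // one 2-byte UTF-16 code unit per input byte.
    public static long EstimateStringBytes(long asciiFileBytes)
    {
        return asciiFileBytes * sizeof(char);
    }

    static void Main(string[] args)
    {
        // Pass a real path as the first argument; otherwise fall back to
        // the 200 MB figure from the question.
        long fileBytes = args.Length > 0
            ? new FileInfo(args[0]).Length
            : 200L * 1024 * 1024;

        long stringBytes = EstimateStringBytes(fileBytes);
        Console.WriteLine("File: {0} MB, string: at least {1} MB",
            fileBytes / (1024 * 1024), stringBytes / (1024 * 1024));
    }
}
```

Run without arguments, this reports 200 MB for the file and at least 400 MB for the string.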

Any "trick" you're going to find is going to have its drawbacks.

  • Reading the file in chunks doesn't matter, as the end result is the same: the final string will still require 400 MB of contiguous memory.
  • Setting the StringBuilder's Length to 0, found through Googling "C# StringBuilder OutOfMemoryException", will make the exception go away, but only cause the last chunk of the file to be assigned to document.Document. It's safe to say you don't want that either.
  • Changing the project to run only on 64-bit machines might be an option, but most likely isn't (hardware and other dependencies). It is also a dirty workaround that hides the actual problem, like applying a band-aid to a blown-off leg: you're ignoring the fatal design mistake. One day someone is going to feed your app a 2 GB file, and you're back to square one.
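
To see why resetting Length loses data, here is a small sketch that uses a StringReader over a short literal as a stand-in for the file, and a `chunkSize` of 4 as a stand-in for `numBytes`. Both are assumptions made purely for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class LengthResetDemo
{
    static void Main()
    {
        const int chunkSize = 4;               // stand-in for numBytes
        var builder = new StringBuilder();

        // A ten-character "file": two full chunks and one short chunk.
        using (var reader = new StringReader("AAAABBBBCC"))
        {
            var buffer = new char[chunkSize];
            int charsRead;
            while ((charsRead = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
            {
                builder.Length = 0;            // discards everything appended so far
                builder.Append(buffer, 0, charsRead);
            }
        }

        Console.WriteLine(builder.ToString()); // prints "CC"
    }
}
```

Only the final chunk survives, so the exception disappears simply because most of the file is never kept.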

You need to stream the file, and process it line by line.

foreach (string line in File.ReadLines(filename))
{
    // process line
}
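
If the data has no convenient line breaks, the same streaming idea works with fixed-size chunks, as long as each chunk is handed on instead of accumulated. The following is a sketch under assumptions, not a drop-in fix: `CopyInChunks` is a hypothetical helper, and writing to a second file stands in for whatever the real code does with `document.Document`.

```csharp
using System;
using System.IO;
using System.Text;

static class ChunkedCopy
{
    // Streams text from reader to writer through one reusable buffer,
    // so memory use stays at chunkSize chars regardless of file size.
    public static long CopyInChunks(TextReader reader, TextWriter writer, int chunkSize)
    {
        var buffer = new char[chunkSize];
        long total = 0;
        int charsRead;
        while ((charsRead = reader.ReadBlock(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the chars actually read; the last chunk may be short.
            writer.Write(buffer, 0, charsRead);
            total += charsRead;
        }
        return total;
    }

    static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.Error.WriteLine("usage: ChunkedCopy <source> <destination>");
            return;
        }

        using (var reader = new StreamReader(args[0], Encoding.ASCII))
        using (var writer = new StreamWriter(args[1], false, Encoding.ASCII))
        {
            long copied = CopyInChunks(reader, writer, 1024 * 1024);
            Console.WriteLine("Copied {0} characters", copied);
        }
    }
}
```

The buffer is allocated once and reused, which also avoids creating a new large array on every iteration.
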
CodeCaster
  • By the way, what if I still need everything in document.Document? – Neel Mar 10 '16 at 12:01
  • The thing is, the whole code is already written, and I am just there to handle the 200 MB file case :/ – Neel Mar 10 '16 at 12:03
  • Somebody gives me a 5 liter bucket and tells me to go get 200 liters of water with it. What do I tell them? – CodeCaster Mar 10 '16 at 12:04
  • My point is: whatever it is you're doing with `document.Document` must be changed to support streaming. You simply cannot assign 400 MB of data to it. – CodeCaster Mar 10 '16 at 12:09