
I am having a problem with a console job that runs and creates a daily log file that I archive at midnight.

This creates a blank log file for the next day, plus an archived file with yesterday's date in the name and the contents of the old file, for debugging issues I may have had and not known about until the day after.

However, since I cranked up the bot's job I have been hitting System.OutOfMemoryException errors when I try to archive the file.

At first I was not able to get an archived file at all; then I worked out a way to get at least the last 100,000 lines, which is not nearly enough.

I wrap everything in three try/catch blocks:

  1. IOException
  2. OutOfMemoryException
  3. a standard Exception

However, it is always the OutOfMemoryException that I get, e.g.

System.OutOfMemoryException Error: Exception of type 'System.OutOfMemoryException' was thrown.;

To give you an idea of size, 100,000 lines of log is about an 11 MB file.

A standard full log file can be anything from half a GB to 2 GB.
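(At that rate, roughly 110 bytes per line, a 2 GB file works out to somewhere around 19 million lines, so 100,000 lines is well under 1% of a full day's log.)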

What I need to know is this:

a) At what size will a standard text file throw an out-of-memory error when read with File.ReadAllText or with a custom StreamReader function I call ReadFileString, e.g.

public static string ReadFileString(string path)
{
    // Use StreamReader to consume the entire text file.
    using (StreamReader reader = new StreamReader(path))
    {
        return reader.ReadToEnd();
    }
}

b) Is it my computer's memory (I have 16 GB of RAM, with about 8 GB in use at the time of copying) or the objects I am using in C# that are failing during the opening and copying of files?

When archiving, I first try my custom ReadFileString function (see above); if that returns 0 bytes of content I try File.ReadAllText, and then if that fails I try a custom function to get the last 100,000 lines, which is really not enough for debugging errors earlier in the day.
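In outline, the archive step looks something like this (simplified; ArchiveFile is an illustrative name, and the real code wraps each stage in the try/catch blocks listed above):

// Simplified outline of the three-stage fallback described above.
private void ArchiveFile(string logpath, string archivepath)
{
    // Stage 1: custom StreamReader wrapper.
    string contents = ReadFileString(logpath);

    // Stage 2: fall back to File.ReadAllText if stage 1 produced nothing.
    if (string.IsNullOrEmpty(contents))
    {
        try { contents = File.ReadAllText(logpath); }
        catch (OutOfMemoryException) { contents = string.Empty; }
    }

    if (!string.IsNullOrEmpty(contents))
    {
        File.WriteAllText(archivepath, contents, Encoding.UTF8);
        return;
    }

    // Stage 3: last resort - keep only the newest 100,000 lines.
    this.GetHundredThousandLines(logpath, archivepath);
}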

The log file starts at midnight, when a new one is created, and logs all day. I never used to have out-of-memory errors, but since I turned up the frequency of method calls the logging has expanded, which means the file sizes have as well.

This is my custom function for getting the last 100,000 lines. I am wondering how many lines I could get without it throwing an out-of-memory error and leaving me with no contents of the last day's log file at all.

What do people suggest as the maximum file size for the various methods, and the memory needed to hold X lines? And what is the best method for obtaining as much of the log file as possible?

E.g. some way of looping line by line until an exception is hit and then saving what I have, something like the sketch below.
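Something like this minimal sketch is what I have in mind (the method name is illustrative); because each line is written as it is read, anything archived before an exception is kept:

// Sketch: stream line by line so only one line is in memory at a time;
// whatever was written before a failure is flushed when the writer is disposed.
private long CopyLinesUntilFailure(string logpath, string archivepath)
{
    long linesCopied = 0;
    try
    {
        using (StreamReader reader = new StreamReader(logpath))
        using (StreamWriter writer = new StreamWriter(archivepath, false, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                linesCopied++;
            }
        }
    }
    catch (OutOfMemoryException exception)
    {
        this.LogDebug("CopyLinesUntilFailure - OOM after " + linesCopied + " lines: " + exception.Message);
    }
    return linesCopied;
}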

This is my GetHundredThousandLines method; it logs to a very small debug file so I can see what errors happened during the archive process.

private bool GetHundredThousandLines(string logpath, string archivepath)
{
    bool success = false;
    int numberOfLines = 100000;

    if (!File.Exists(logpath))
    {
        this.LogDebug("GetHundredThousandLines - Cannot find path " + logpath + " to archive " + numberOfLines.ToString() + " lines");
        return false;
    }

    var queue = new Queue<string>(numberOfLines);

    using (FileStream fs = File.Open(logpath, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (BufferedStream bs = new BufferedStream(fs))  // May not make much difference.
    using (StreamReader sr = new StreamReader(bs))
    {
        // Keep a rolling window of the last numberOfLines lines.
        while (!sr.EndOfStream)
        {
            if (queue.Count == numberOfLines)
            {
                queue.Dequeue();
            }

            queue.Enqueue(sr.ReadLine() + "\r\n");
        }
    }

    // The queue now has our set of lines, so write them to the archive file.
    try
    {
        // while rather than do/while so an empty queue is safe.
        while (queue.Count > 0)
        {
            File.AppendAllText(archivepath, queue.Dequeue(), Encoding.UTF8);
        }
    }
    catch (IOException exception)
    {
        this.LogDebug("GetHundredThousandLines - I/O Error accessing daily log file: " + exception.Message);
    }
    catch (System.OutOfMemoryException exception)
    {
        this.LogDebug("GetHundredThousandLines - Out of Memory Error accessing daily log file: " + exception.Message);
    }
    catch (Exception exception)
    {
        this.LogDebug("GetHundredThousandLines - Exception accessing daily log file: " + exception.Message);
    }

    if (File.Exists(archivepath))
    {
        this.LogDebug("GetHundredThousandLines - Log file exists at " + archivepath);
        success = true;
    }
    else
    {
        this.LogDebug("GetHundredThousandLines - Log file DOES NOT exist at " + archivepath);
    }

    return success;
}
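Or would a plain stream-to-stream copy side-step the string allocations entirely? Something like this sketch is what I mean (the method name is just illustrative):

// Sketch: copy raw bytes so the file contents are never held in memory as
// a string; only CopyTo's internal buffer (about 80 KB) is allocated.
private void ArchiveByStreamCopy(string logpath, string archivepath)
{
    using (FileStream source = File.Open(logpath, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (FileStream destination = File.Create(archivepath))
    {
        source.CopyTo(destination);
    }
}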

Any help would be much appreciated.

Thanks

MonkeyMagix
  • Why are you reading the log at all to archive it? Just rename it, copy it to the archive folder, create a new blank file for the next day, and voila! No out of memory, no unnecessary reading of an entire file when you don't need to, and no more headache. – Ken White Apr 08 '16 at 16:58
  • Why not just rename your old file and create a new file for the following day? This seems like a lot of effort that could be spared - sorry, just saw Ken White beat me to this one. I concur with him. – Ian Murray Apr 08 '16 at 16:59
  • Why not name your log file with the date and time? No renaming ever. – Claudius Apr 08 '16 at 17:00
  • If you are archiving the log files, you could zip them at the same time to save a lot of disk space. SharpZipLib allows you to [use a buffer](https://github.com/icsharpcode/SharpZipLib/wiki/Zip-Samples#anchorCreate) for creating the zip files, so you can limit the memory used to, say, 32KB. I expect other file compression utilities have the same facility. – Andrew Morton Apr 08 '16 at 17:20
  • Ok, yes, but from my experience with ASP classic, a Move under the covers does the same as a File.Copy then File.Delete anyway, so the contents are still copied out somehow. If there is not enough memory to do the copy of the file then it won't work. I am sure years ago I did attempt to use this simpler method but it kept failing due to memory issues, which is why I went on to this 2 (now 3) stage process. However I could return and see if File.Move would work again (see the rename sketch after these comments), but I am sure it didn't, which is why I had to progress to this in the first place. – MonkeyMagix Apr 09 '16 at 08:24
  • Claudius, I like that idea. I have just always had a logcurrent.log file for the current day and then archived files with the date in the name, which I then loop through, deleting the X oldest and keeping the newest 2 archived files. However, having all of them in the same date-in-name format might be simpler - "Techies Law" - think it through with someone and the most complex (in your mind) problems can have the simplest of solutions. – MonkeyMagix Apr 09 '16 at 08:31
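For reference, the rename-based rotation suggested in these comments can be as simple as the sketch below (the folder and file names are illustrative). On the same volume, File.Move is a metadata rename rather than a copy-and-delete, so the contents are never read into memory:

// Sketch: rotate by renaming. On the same volume File.Move only updates
// file-system metadata, so nothing is read into memory regardless of size.
string archivepath = Path.Combine(archiveFolder,
    "log-" + DateTime.Now.AddDays(-1).ToString("yyyy-MM-dd") + ".log");

File.Move(logpath, archivepath);     // yesterday's log becomes the archive
using (File.Create(logpath)) { }     // start a fresh, empty log for the new day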

1 Answer


Try keeping the queue and the stream position in class scope; call GC.Collect() when you get an out-of-memory exception, then call the function again, seek the stream to the last position, and continue. Or: use a database like SQLite and keep the newest 100,000 records in each table.
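A minimal sketch of the resume-from-position idea, assuming the same logpath/archivepath parameters as in the question (the method name and retry limit are illustrative, and tracking a position through a buffering StreamReader is only approximate):

// Sketch: keep the position in class scope so a retry after GC.Collect()
// can seek back and resume instead of starting over. Position tracking is
// approximate because StreamReader reads ahead into its own buffer.
private long lastPosition = 0;

private void ArchiveWithResume(string logpath, string archivepath, int retriesLeft)
{
    try
    {
        using (FileStream fs = File.Open(logpath, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (StreamWriter writer = new StreamWriter(archivepath, true, Encoding.UTF8))
        {
            fs.Seek(lastPosition, SeekOrigin.Begin);   // resume a previous attempt
            StreamReader reader = new StreamReader(fs);
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                lastPosition = fs.Position;            // remember how far we got
            }
        }
    }
    catch (OutOfMemoryException)
    {
        if (retriesLeft > 0)
        {
            GC.Collect();                              // reclaim what we can, then retry
            ArchiveWithResume(logpath, archivepath, retriesLeft - 1);
        }
    }
}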

mehrdad safa
  • Could you explain what GC.Collect does (I take it garbage collection) - how does that help if I have no memory to open the file? Also I want a method to obtain the whole file, NOT just 100,000 lines. So what would be the BEST way to obtain the whole file even if it was throwing out of memory errors? Looping through the file, storing 100,000 lines in a DB, cleaning memory (how? GC.Collect()?) and then continuing, then combining them all in chunks at the end? – MonkeyMagix Apr 09 '16 at 08:14
  • By the way, last night it worked for some reason with my first method. The size of the file was 25828836 bytes. I am wondering if there is some KNOWN size where files are just TOO BIG to be opened and read by any method. If so, what is this limit, as I have enough RAM on my PC to open the file in an editor? So what are the limits of C# with each method of file reading? – MonkeyMagix Apr 09 '16 at 08:15
  • And also, out of the 3 approaches I have listed here for reading a file - File.ReadAllText, the Stream function, and the Queue method - which is the best for performance/memory/IO when extracting file contents? – MonkeyMagix Apr 09 '16 at 08:26
  • @MonkeyMagix Reading the whole file at once means you need to have enough *contiguous* virtual memory - that's not something you can affect or predict, really. Streaming is usually a better idea. Also note that .NET strings are unicode, so if you're reading an ASCII/ANSI file, double the file size for memory requirements. – Luaan Apr 11 '16 at 11:28
  • Just so you know I rewrote it to use the CopyTo method and it seems to work fine at the moment. So I have removed the GC.Collect() – MonkeyMagix Apr 15 '16 at 09:46