I wrote software to process images and, to reduce the processing time, tried to use multithreading. Below is the relevant snippet.

bool Multithread = CheckMultithread();
UpdateParameters();

if (Multithread)
{
    Parallel.For(0, FileNames.Length, i => Solve(FileNames[i]));
}
else
{
    foreach (string s in FileNames)
    {
        Solve(s);
    }
}

This is the first time I have written multithreaded code in C#, but I believe there are no threading issues, since processing one image does not interfere with processing another.

The problem is: if Multithread is true, I get an OutOfMemoryException at around the 200th image. I imagine this kind of parallel implementation consumes N times more memory than the sequential equivalent, with N being the number of threads.

I'm using unmanaged code in a single class, but every time that class is used it is inside a using block. For reference, this class is a wrapper for System.Drawing.Bitmap.

Each thread consumes roughly 400 MB of RAM, and when the OutOfMemoryException is thrown the program is using around 1300 MB of RAM, even though I have over 9 GB of free memory.

To work around this exception, I added the following code right at the beginning of Solve():

if (GC.GetTotalMemory(false) > 1000*1000*1000)
{
    lock (Manager.dasLock)
    {
        Manager.sw.Start();
        GC.Collect();
        Manager.sw.Stop();
    }
}

With the workaround, the software was able to process all 2000+ images without running out of memory, but my peers are complaining that I shouldn't touch the GC. So, how can I fix this issue without invoking the GC manually?

Trauer
  • Have you tried saving the processed images and disposing of them manually after saving? Also, the 1.3 GB RAM cap is because of 32-bit ;) – Sebastian L Sep 03 '14 at 12:10
  • Your assumption that Parallel.For creates a thread for each iteration is not true. It partitions the collection into several groups and each group is run in a thread. – Dirk Sep 03 '14 at 12:10
  • Just to be sure, because I find your last paragraphs confusing: if you call `GC.Collect` it causes an `OutOfMemoryException` but without it everything runs fine even with the `Parallel.For` loop? – Dirk Sep 03 '14 at 12:14
  • Is this a 32 bit app? If so you only have 2 GB address space regardless of how much RAM you have. Make it a 64 bit process and your address space grows A LOT. – Brian Rasmussen Sep 03 '14 at 15:25
  • Possibly relevant related question: [Parallel.ForEach can cause a “Out Of Memory” exception if working with a enumerable with a large object](http://stackoverflow.com/questions/6977218/parallel-foreach-can-cause-a-out-of-memory-exception-if-working-with-a-enumera) (but not necessarily a duplicate). Does the problem go away if you set the max degree of parallelism or create a custom partitioner that takes smaller chunks? – Scott Chamberlain Sep 03 '14 at 15:32
  • Show, don't tell. Can you please show the code for `Solve` and show how you are calling dispose? Also *"When the OutOfMemoryException is thrown, the program is using around 1300 mb of ram. Even tho I have over 9 GBs of free memory"*. If your app is 32 bit about 1300 mb of ram is the max per process you are going to get without some trickery if you are dealing with many large objects in memory. – Scott Chamberlain Sep 03 '14 at 17:28
  • Use Performance Monitor to monitor the .NET CLR Memory counters for your process. That might provide some insight into what is going on. – Martin Liversage Sep 03 '14 at 17:29
  • A 32 bit Windows process can address 4 GB of virtual memory, but normally 2 GB is reserved by Windows, so your own code and data are limited to the first 2 GB of virtual memory. And this address space has to accommodate not only your data but also code and a stack for each thread. – Martin Liversage Sep 03 '14 at 18:32
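
Several of the comments above hinge on whether the process runs as 32-bit. A minimal sketch for checking this from inside the program follows; the class name `BitnessCheck` is hypothetical, and the properties used are available from .NET Framework 4.0 onward.

```csharp
using System;

class BitnessCheck
{
    static void Main()
    {
        // True when this process itself runs as 64-bit, regardless of the OS.
        Console.WriteLine("64-bit process: " + Environment.Is64BitProcess);

        // True when the operating system is 64-bit.
        Console.WriteLine("64-bit OS:      " + Environment.Is64BitOperatingSystem);

        // Pointer size in bytes: 4 in a 32-bit process, 8 in a 64-bit process.
        Console.WriteLine("Pointer size:   " + IntPtr.Size);
    }
}
```

If this reports a 32-bit process on a 64-bit OS, the project's platform target (for example the "Prefer 32-bit" setting in Visual Studio) is a likely place to look.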

2 Answers

Limiting the thread count solved the issue:

if (Multithread)
{
    ParallelOptions pOptions = new ParallelOptions();
    pOptions.MaxDegreeOfParallelism = Environment.ProcessorCount;
    Parallel.For(0, FileNames.Length, pOptions, i => Solve(FileNames[i]));
}
else
{
    foreach (string s in FileNames)
    {
        Solve(s);
    }
}
Trauer
  • `Environment.ProcessorCount` is the default value of `MaxDegreeOfParallelism` option anyway, so I doubt that this is a solution. – Theodor Zoulias Dec 12 '19 at 09:49
  • @TheodorZoulias You sure about that? https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.paralleloptions.-ctor?view=netframework-4.8#System_Threading_Tasks_ParallelOptions__ctor – Trauer Dec 12 '19 at 12:09
  • Not any more. It seems that I confused `MaxDegreeOfParallelism` with PLINQ's `WithDegreeOfParallelism`. *PLINQ uses a fixed number of threads to execute a query; by default, it uses the number of logical cores in the machine. Conversely, Parallel.ForEach may use a variable number of threads, based on the ThreadPool’s support for injecting and retiring threads over time to best accommodate current workloads.* [Source](https://download.microsoft.com/download/3/4/D/34D13993-2132-4E04-AE48-53D3150057BD/Patterns_of_Parallel_Programming_CSharp.pdf) page 88. – Theodor Zoulias Dec 12 '19 at 12:30
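
A comment on the question also suggests a custom partitioner that takes smaller chunks. A rough sketch of combining that idea with the thread limit is below. It is an illustration under assumptions, not the code used in this answer: `Solve` and `fileNames` are stand-ins for the question's method and array, and `EnumerablePartitionerOptions.NoBuffering` requires .NET Framework 4.5 or later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class BoundedParallelSketch
{
    // Hypothetical stand-in for the question's Solve method.
    static void Solve(string fileName)
    {
        Console.WriteLine("Processing " + fileName);
    }

    static void Main()
    {
        // Hypothetical stand-in for the question's FileNames array.
        string[] fileNames = { "a.png", "b.png", "c.png", "d.png" };

        // Cap the number of concurrent workers, as in the answer above.
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        // NoBuffering hands items to workers one at a time instead of in chunks.
        var partitioner = Partitioner.Create(
            fileNames, EnumerablePartitionerOptions.NoBuffering);

        Parallel.ForEach(partitioner, options, fileName => Solve(fileName));
    }
}
```

Whether the partitioner makes a measurable difference here is untested; the thread limit alone is what was reported to fix the problem.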

Although not stated, if you are doing image processing there is a good chance some of the objects you are dealing with implement the IDisposable interface - such as File or Image objects.

For all those objects you should try to use "using" blocks (See here).

More than likely, calling GC.Collect is cleaning up objects that were never disposed, which, as I understand it, is why you do not get exceptions when you call it.
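
As a minimal sketch of the "using" pattern described above: the body of `Solve` below is hypothetical, since the question does not show the real one, and it assumes a reference to System.Drawing (on newer .NET versions, the Windows-only System.Drawing.Common package).

```csharp
using System;
using System.Drawing;

static class SolveSketch
{
    // Hypothetical shape of Solve; the real processing is not shown in the question.
    public static void Solve(string fileName)
    {
        // The using block guarantees bitmap.Dispose() runs when the block exits,
        // even if the processing inside throws an exception.
        using (var bitmap = new Bitmap(fileName))
        {
            // Placeholder for the actual image processing.
            int pixels = bitmap.Width * bitmap.Height;
            Console.WriteLine(fileName + ": " + pixels + " pixels");
        }
    }

    static void Main(string[] args)
    {
        // Each command-line argument is treated as an image path.
        foreach (string path in args)
        {
            Solve(path);
        }
    }
}
```

Disposing deterministically releases the unmanaged GDI+ memory right away instead of waiting for a finalizer to run.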

Zeus82
  • I know. But as I said, I'm calling Dispose for each and every object like this. – Trauer Sep 03 '14 at 17:26
  • You say you are calling it at the end of Solve. Have you tried putting a breakpoint there to confirm you reach the last line? – Zeus82 Sep 03 '14 at 18:40