
I would like to use `Parallel.ForEach` to ensure full utilisation of the CPU for a CPU-intensive task. I am querying a significant number of objects from a database one at a time (only one object per iteration, each reasonably small), then performing a significant amount of CPU-bound work on that object, after which I save it back to the database.

I am using Entity Framework on the Data Model side, and given the number of objects that I query I create a new Context for every iteration (this is to limit memory consumption):

    foreach (var id in idlist)
    {
        using (var ctx = new Context())
        {
            var model = ctx.Models.First(x => x.Id == id);
            await model.GlobalRefresh(true); //CPU heavy operation.
            await model.SaveAsync(); //Additional CPU heavy operation.
            ctx.SaveChanges(); //Save the changes
        } //Dispose of the model and the context, to limit memory consumption
    }

This works well in the synchronous implementation, as after each iteration both the model queried from the database and the Entity Framework context are disposed. My memory consumption during this process is therefore almost constant, which is great. If I don't create the context in this fashion, I quickly run out of memory (500+ objects).

When I set the above up in parallel as follows, my memory consumption goes sky high, as it seems that the context for each iteration is not disposed before the next iteration continues (and I do see significantly better CPU utilisation as expected):

    Parallel.ForEach(idlist, async (id) =>
    {
        using (var ctx = new Context())
        {
            var model = ctx.Models.First(x => x.Id == id);
            await model.GlobalRefresh(true);
            await model.SaveAsync();
            ctx.SaveChanges();
        }
    });

This is not necessarily a problem from a memory viewpoint, as long as all model objects aren't loaded into memory at once (which is also effectively the whole point of the parallel loop, to process more than one at a time). However, is there some way that I can manage this process better, e.g. by not creating additional tasks when memory consumption reaches, say, 75%, to avoid an OutOfMemoryException?

User_FSharp1123
    Look at this: https://msdn.microsoft.com/en-us/library/system.threading.tasks.paralleloptions.maxdegreeofparallelism%28v=vs.110%29.aspx you can tweak number of tasks that can run at once – Allan S. Hansen Mar 22 '16 at 07:44
  • What is the nature of the CPU intensive work? Have you ruled out performing all of this activity within the database itself? – Damien_The_Unbeliever Mar 22 '16 at 08:49
  • So I've tried tweaking the MaxDegreeOfParallelism option, setting this to 1 for testing purposes. With the synchronous loop, my memory consumption remains roughly constant at 600MB. However, with the exact same loop inside the Parallel construct, the memory still goes up to >1GB. It therefore seems like the GC isn't properly collecting objects, or the Parallel construct isn't disposing properly? Nature of the intensive work is modelling calculations with reasonably advanced math algorithms, won't really be SQL friendly. – User_FSharp1123 Mar 22 '16 at 12:35
  • Might be important to also point out that my memory consumption starts at 600MB before the parallel loop, goes up to >1GB during the loop execution, and then stays at the >1GB level after the parallel loop has completed (breakpoint right after loop), whilst the memory consumption during the synchronous loop starts at 600MB and stays the same during and after the loop (with the expected small fluctuations as each iteration is completed and GC collected). – User_FSharp1123 Mar 22 '16 at 12:46
  • The `Parallel.ForEach` doesn't understand async delegates. All the IDs in the `idlist` are processed concurrently, with no concurrency limit other than the limitation imposed by `ThreadPool` availability. That's why you have memory-consumption-related problems with the above code. – Theodor Zoulias Mar 28 '22 at 19:09
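The pattern the last comment alludes to — throttling the async work explicitly rather than passing async lambdas to `Parallel.ForEach` — can be sketched as follows. This is an editorial sketch, not part of the original question; the concurrency limit of 4 is illustrative, and `Context`, `GlobalRefresh` and `SaveAsync` are the question's own types and methods:

    // Limit how many iterations run at once, instead of Parallel.ForEach.
    var throttler = new SemaphoreSlim(initialCount: 4); // illustrative limit

    var tasks = idlist.Select(async id =>
    {
        await throttler.WaitAsync();
        try
        {
            using (var ctx = new Context())
            {
                var model = ctx.Models.First(x => x.Id == id);
                await model.GlobalRefresh(true);
                await model.SaveAsync();
                ctx.SaveChanges();
            } // Context (and model) disposed before the slot is released.
        }
        finally
        {
            throttler.Release();
        }
    });

    await Task.WhenAll(tasks);

Because each task holds a semaphore slot for its whole lifetime, at most four contexts exist at any moment, which bounds memory the way the synchronous loop did while still overlapping work.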

1 Answer


TL;DR: You can use MemoryFailPoint to check for sufficient memory before your next operation. Microsoft Docs has a nice example here.

I had a similar issue recently. While inspecting the logs of an older app, I noticed some out-of-memory exceptions. It turned out a developer had introduced some parallel programming to make the app faster. The app may be faster, but it now consumes memory faster too, hits the roughly 2 GB limit of a 32-bit process, and then throws these out-of-memory exceptions.
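A minimal sketch of the `MemoryFailPoint` check (from `System.Runtime`; the 128 MB estimate and the `ProcessNext` method are illustrative placeholders, not from the original answer):

    try
    {
        // Ask the CLR whether roughly this much memory is likely to be
        // available before starting the next chunk of work.
        using (var gate = new MemoryFailPoint(sizeInMegabytes: 128))
        {
            ProcessNext(); // hypothetical placeholder for the CPU-heavy work
        }
    }
    catch (InsufficientMemoryException)
    {
        // Not enough memory right now: back off (wait, or shrink the
        // degree of parallelism) instead of risking an OutOfMemoryException.
    }

`MemoryFailPoint` only checks expected availability; it does not reserve memory, so the estimate should err on the generous side.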

Hasan