
I'm trying to find out why my application gets slower over time. The application bundles its work into batches. The actions within a batch can be executed concurrently, but each batch has to wait for the previous one to finish. The batches are executed every ~30ms inside the update function. In code, it would look something like this:

List<List<Action>> batches = new List<List<Action>>(); // class member

foreach (var batch in batches) // inside of the update function
{
    Parallel.ForEach(batch, action => action());
}

After a few days, I found that executing all batches takes longer and longer. At first, the execution takes about 15ms, and then after some days it takes more than 100ms. My goal is to find out why execution time keeps rising.
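
To put numbers on that trend, each update could be timed and the recent durations kept in a rolling window. The following is only a sketch under assumptions: `UpdateTimer`, `RunUpdate`, `MaxMs`, `FractionSlowerThan`, and the window size are hypothetical names chosen for illustration, not part of the application.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper: times one full update (all batches) and keeps the
// most recent durations so the trend can be logged over several days.
public sealed class UpdateTimer
{
    private readonly Queue<double> _recentMs = new Queue<double>();
    private readonly int _window;

    public UpdateTimer(int window = 1000)
    {
        _window = window;
    }

    // Number of durations currently held in the window.
    public int Count => _recentMs.Count;

    // Runs the batches the same way as the loop above and returns the
    // elapsed wall-clock time in milliseconds.
    public double RunUpdate(List<List<Action>> batches)
    {
        var sw = Stopwatch.StartNew();
        foreach (var batch in batches)
        {
            Parallel.ForEach(batch, action => action());
        }
        sw.Stop();

        double ms = sw.Elapsed.TotalMilliseconds;
        _recentMs.Enqueue(ms);
        if (_recentMs.Count > _window)
        {
            _recentMs.Dequeue(); // drop the oldest sample
        }
        return ms;
    }

    // Largest duration in the current window (0 if the window is empty).
    public double MaxMs()
    {
        double max = 0;
        foreach (var ms in _recentMs)
        {
            if (ms > max) max = ms;
        }
        return max;
    }

    // Fraction of updates in the window that took longer than thresholdMs.
    public double FractionSlowerThan(double thresholdMs)
    {
        if (_recentMs.Count == 0) return 0;
        int slow = 0;
        foreach (var ms in _recentMs)
        {
            if (ms > thresholdMs) slow++;
        }
        return (double)slow / _recentMs.Count;
    }
}
```

Logging `FractionSlowerThan(30)` once a minute, for example, would show whether slow updates go from rare to nearly constant, without keeping a profiler attached for days.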

While debugging, I found that the number of batches and the number of actions in each batch stay the same.

Profiling with dotTrace shows that sometimes one action in a batch takes longer, and Parallel.ForEach will (and should) wait until every action in the batch has finished. When profiling right after starting the application, a delay is only visible every few hundred updates. After some days, a delay happens in almost every update.
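
Because the whole batch waits for its slowest action, it can help to record per-action durations and pick out the straggler. This is a minimal sketch, assuming a caller-supplied buffer so the probe itself allocates as little as possible. `BatchProbe`, `RunTimed`, and `SlowestIndex` are hypothetical names. It uses the `Parallel.ForEach` overload that also passes the element index.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical probe: runs one batch in parallel and records how long
// each individual action took.
public static class BatchProbe
{
    // elapsedMs must have at least batch.Count entries; entry i receives
    // the duration of batch[i] in milliseconds.
    public static void RunTimed(List<Action> batch, double[] elapsedMs)
    {
        if (elapsedMs.Length < batch.Count)
        {
            throw new ArgumentException("Buffer is smaller than the batch.", nameof(elapsedMs));
        }

        Parallel.ForEach(batch, (action, state, index) =>
        {
            long start = Stopwatch.GetTimestamp();
            action();
            long end = Stopwatch.GetTimestamp();
            // Each index is written by exactly one iteration, so no locking is needed.
            elapsedMs[index] = (end - start) * 1000.0 / Stopwatch.Frequency;
        });
    }

    // Index of the slowest action among the first count entries, or -1 if count is 0.
    public static int SlowestIndex(double[] elapsedMs, int count)
    {
        int slowest = -1;
        for (int i = 0; i < count; i++)
        {
            if (slowest < 0 || elapsedMs[i] > elapsedMs[slowest])
            {
                slowest = i;
            }
        }
        return slowest;
    }
}
```

If the slowest index differs from update to update, that would be consistent with the observation that the freeze is not tied to a specific function.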

This is how it looks in the profiler; I've marked the individual update cycles in different shades of green and numbered them. As you can see, there is an obvious delay (yellow) in the second update.

[Image: thread overview]

During the yellow period, no other thread is doing work (at least according to the profiler). When selecting the threads one by one, only one thread is still working on an action. As you can see in the next picture, that thread doesn't show any CPU load and all of its time is spent in ntdll.dll.

[Image: profiler view of the one thread still working on an action]

In this case, the call to GetEnumerator takes about 24ms, but I have seen freezes in functions like ToString() or ToArray() too.

Related stuff I have observed:

  • The freeze is not tied to a specific function and can happen in any part of the application
  • The time spent is always displayed in ntdll.dll
  • Usually happens right after a GC
  • All GCs running in the profiled time are gen0
  • The called function allocates memory

Comments:

  • I think it might be useful to do some memory profiling to see whether memory usage increases over time. It could be that the GC fails to release memory and so needs to ask the OS for more. I would have expected a full collection in that case, but it might be worth checking. – JonasH Jun 25 '20 at 15:20
  • Memory is not increasing over time. It makes sense that it somehow stops sporadically to request some memory, but after ~5 days the pattern of waiting in ntdll.dll shows up on almost every call to Parallel.ForEach. – impoetk Jun 25 '20 at 16:59
  • I have the same problem. Did you find a solution to this problem? – saeid mohammad hashem Sep 02 '21 at 13:22
  • Same problem here, have you found anything? – CribAd Nov 17 '21 at 08:22
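
The GC-related observations and the memory-profiling suggestion could be checked from inside the process by taking a cheap snapshot before and after each update and logging it whenever an update is slow. This is a sketch under assumptions: `GcSnapshot` and `GcProbe` are hypothetical names, and only standard `System.GC` and `System.Runtime.GCSettings` members are used.

```csharp
using System;
using System.Runtime;

// Hypothetical snapshot of GC counters. A struct is used so that taking
// a snapshot does not itself allocate on the managed heap.
public struct GcSnapshot
{
    public int Gen0;
    public int Gen1;
    public int Gen2;
    public long ManagedBytes;

    public static GcSnapshot Take()
    {
        return new GcSnapshot
        {
            // A higher-generation collection also collects the lower
            // generations, so the gen0 count includes every GC that ran.
            Gen0 = GC.CollectionCount(0),
            Gen1 = GC.CollectionCount(1),
            Gen2 = GC.CollectionCount(2),
            // false: do not force a collection just to get the number.
            ManagedBytes = GC.GetTotalMemory(false)
        };
    }

    // True if at least one collection of any generation ran since 'earlier'.
    public bool CollectedSince(GcSnapshot earlier)
    {
        return Gen0 != earlier.Gen0;
    }

    // One-line summary of what changed between two snapshots.
    public string DeltaSince(GcSnapshot earlier)
    {
        return $"gen0+{Gen0 - earlier.Gen0} gen1+{Gen1 - earlier.Gen1} " +
               $"gen2+{Gen2 - earlier.Gen2} managedBytes={ManagedBytes}";
    }
}

public static class GcProbe
{
    // GC configuration worth logging once at startup.
    public static string DescribeSettings()
    {
        return $"ServerGC={GCSettings.IsServerGC}, " +
               $"LatencyMode={GCSettings.LatencyMode}, " +
               $"ProcessorCount={Environment.ProcessorCount}";
    }
}
```

Calling `GcSnapshot.Take()` before and after the update loop and logging `DeltaSince` only for slow updates would show whether every delay coincides with a collection, and whether the managed heap size at those moments drifts over the days the application runs.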
