I have implemented a small piece of asynchronous code and I face a weird behavior.
Basically what I want to do is run an initialization process for a set of multiple "clients" and I don't want to process them in a queue (some may take time to process, otherts may not). I just want them to be all finished in order to go to the next step. In order to avoid having too many processes running concurrently, I'm using a sempaphore (set temporarily to 2).
The problem I have is the execution seems to be done synchronously. Here a piece of the code (minified: no logging, try/catch, etc):
public void IntializeReportStructure(DateTimeOffset inReferenceDate)
{
List<Task> theTaskCollection = new List<Task>();
foreach (long theClientId in this.GetClientIdCollection())
{
Task theTask = this.InitializeClientAsync(theClientId, inReferenceDate.Year, inReferenceDate.Month);
theTaskCollection.Add(theTask);
}
Task.WaitAll(theTaskCollection.ToArray());
}
private async Task InitializeClientAsync(long inClientId, int inReferenceYear, int inReferenceMonth)
{
await this.Semaphore.WaitAsync();
await this.InitializeClientReportAsync(inClientId);
await this.InitializeClientSummaryAsync(inClientId, inReferenceYear, inReferenceMonth);
this.Semaphore.Release();
}
Here's the logs content:
Waiting for semaphore for client 41. Current count is 2. Entered semaphore for client 41. Current count is 1. Semaphore has been released for client 41. Current count is 2. Waiting for semaphore for client 12. Current count is 2. Entered semaphore for client 12. Current count is 1. Semaphore has been released for client 12. Current count is 2. Waiting for semaphore for client 2. Current count is 2. Entered semaphore for client 2. Current count is 1. Semaphore has been released for client 2. Current count is 2. Waiting for semaphore for client 261. Current count is 2. Entered semaphore for client 261. Current count is 1. Semaphore has been released for client 261. Current count is 2. Waiting for semaphore for client 1. Current count is 2. Entered semaphore for client 1. Current count is 1. Semaphore has been released for client 1. Current count is 2. Waiting for semaphore for client 6. Current count is 2. Entered semaphore for client 6. Current count is 1. Semaphore has been released for client 6. Current count is 2.
As you can see, each task is going to the semaphore step only when the previous one is finished. I said earlier the process was running asynchronously : within the loop (when adding the Task to the list), the task status is "RanToCompletion" and the value of IsCompleted is "true". I did not include the timestamp in the logs but we have some "client" that takes 15-20s to run and the process is "waiting" in the meantime.
Last detail, the two methods "InitializeClientReportAsync" and "InitializeClientSummaryAsync" are just doing async get data and async save data. Nothing weird over there.
The fun part is I am able to have an asynchronous result by adding "await Task.Delay(1);" right after the semaphore WaitAsync.
private async Task InitializeClientAsync(long inClientId, int inReferenceYear, int inReferenceMonth)
{
await this.Semaphore.WaitAsync();
await Task.Delay(1);
await this.InitializeClientReportAsync(inClientId);
await this.InitializeClientSummaryAsync(inClientId, inReferenceYear, inReferenceMonth);
this.Semaphore.Release();
}
And here's the log content :
Waiting for semaphore for client 41. Current count is 2. Waiting for semaphore for client 12. Current count is 1. Waiting for semaphore for client 2. Current count is 0. Waiting for semaphore for client 261. Current count is 0. Waiting for semaphore for client 1. Current count is 0. Waiting for semaphore for client 6. Current count is 0. Entered semaphore for client 12. Current count is 0. Entered semaphore for client 41. Current count is 0. Semaphore has been released for client 12. Current count is 0. Entered semaphore for client 2. Current count is 0. Semaphore has been released for client 41. Current count is 0. Entered semaphore for client 261. Current count is 0. Semaphore has been released for client 261. Current count is 0. Entered semaphore for client 1. Current count is 0. Semaphore has been released for client 1. Current count is 0. Entered semaphore for client 6. Current count is 0. Semaphore has been released for client 2. Current count is 1. Semaphore has been released for client 6. Current count is 2.
As you can see, processes are all going to the semaphore in the first place, before getting in. Aslo, we realize client "2" is the longest one to initialize. This is what I expect from the code in the first place.
I have no idea why a simple Task.Delay(1)
changes the behavior of my process. It doesn't make much sense. Maybe I am missing something obvious because I'm too focused.
Edit
As mentionned in the comment, the problem comes from "async" Entity Framework code. I considered it as granted.
I simplified the content of underlying methods for the sake of the example. "IDbContext" is just an interface we use on the DbContext class of Entity Framework. In our case, async methods are directly the ones from the initial class.
private async Task InitializeClientAsync(long inClientId, int inReferenceYear, int inReferenceMonth)
{
await this.Semaphore.WaitAsync();
await this.SomeTest();
this.Semaphore.Release();
}
private async Task SomeTest()
{
using (IDbContext theContext = this.DataAccessProvider.CreateDbContext())
{
List<X> theQueryResult = await theContext.Set<X>().ToListAsync<X>();
foreach (X theEntity in theQueryResult)
{
theEntity.SomeProperty = "someValue";
}
await theContext.SaveChangesAsync();
}
}
So here, even if I retrieve hundreds of MB of entities and update them (all using await and async methods), everything stays completely synchronous. The next task will enter the semaphore only when this one will complete.
I added a Task.Delay(1) before SaveChangesAsync(), and the next Task enters the semaphore before the first task finishes. It confirms that everything is synchronous around here (except Task.Delay). But I can't say why though...