I have a console app written in C# on top of the .NET Core 2.2 framework.
My application allows me to trigger long-running admin jobs using the Windows Task Scheduler.
One of the admin jobs makes a web-API call that downloads lots of files and then uploads them to Azure Blob Storage. Here are the logical steps that my code needs to perform to get the job done:
- Call the remote API, which responds with MIME messages, where each message represents a file.
- Parse the MIME messages and convert each message into a MemoryStream, creating a collection of MemoryStream objects.
Once I have a collection of 1000+ MemoryStream objects, I want to write each Stream to Azure Blob Storage. Since the write to the remote storage is slow, I am hoping that I can execute each write iteration in its own process or thread. This would allow me to have potentially 1000+ threads running in parallel instead of having to wait for the result of each write operation. Each thread will be responsible for logging any errors that occur during the write/upload process. Any logged errors will be dealt with by a different job, so I don't have to worry about retrying.
My understanding is that calling the code that writes/uploads the stream asynchronously will do exactly that. In other words, I would say "there is a Stream, execute it and run for as long as it takes. I don't really care about the result as long as the task gets completed."
While testing, I found out that my understanding of async is somewhat invalid. I was under the impression that a method defined with async gets executed on a background thread/worker until it completes. But my understanding failed when I tested the code. My code showed me that without the await keyword, the async code is never really executed. At the same time, when the await keyword is added, the code waits until the process finishes executing before it continues. In other words, for my needs, adding await defeats the purpose of calling the method asynchronously.
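To make the observation above concrete, here is a minimal sketch of what happens when an async method is called without await. SlowUploadAsync is a hypothetical stand-in that uses Task.Delay to simulate the slow remote write; it is not the real storage call. The sketch assumes C# 7.1 or later for the async Main method.

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    // Hypothetical stand-in for a slow upload; Task.Delay simulates the remote write.
    public static async Task<string> SlowUploadAsync(string name)
    {
        await Task.Delay(200);
        return name + " uploaded";
    }

    public static async Task Main()
    {
        // Calling without await starts the operation and hands back a Task right away.
        Task<string> pending = SlowUploadAsync("file-1");
        Console.WriteLine("Completed right after the call: " + pending.IsCompleted);

        // If Main returned at this point, the process could exit before the upload
        // finished. Awaiting the stored Task keeps the process alive until it is done.
        string result = await pending;
        Console.WriteLine(result);
    }
}
```

In this sketch the call without await does start the work; the result only appears to be lost when nothing ever waits for the returned Task before the program ends. Whether that is what happens in the real job is an assumption, since the actual storage code is not shown.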
Here is a stripped-down version of my code for the sake of explaining what I am trying to accomplish:
    public async Task Run()
    {
        // This gets populated after calling the web-API and parsing out the result
        List<Stream> files = new List<Stream>{.....};
        foreach (Stream file in files)
        {
            // This code should get executed in the background without having to await the result
            await Upload(file);
        }
    }
    // This method is responsible for uploading a stream to storage and logging any errors
    private async Task Upload(Stream stream)
    {
        try
        {
            await Storage.Create(stream, GetUniqueName());
        }
        catch (Exception e)
        {
            // Log any errors
        }
    }
In the above code, calling await Upload(file); works and uploads the file as expected. However, since I am using await when calling the Upload() method, my loop will NOT move on to the next iteration until the upload finishes. At the same time, when I remove the await keyword, the loop does not wait for the upload, but the Stream is never actually written to the storage, as if I had never called the code.
How can I execute multiple Upload calls in parallel so that I have one thread running per upload in the background?
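For illustration, one common pattern for this kind of I/O-bound fan-out is to start every upload without awaiting it individually, keep the returned tasks, and await them together with Task.WhenAll. The sketch below is an assumption about what might fit, not a verified solution: FakeStorageCreateAsync is a hypothetical stand-in for Storage.Create, the names come from Guid.NewGuid instead of GetUniqueName, and the SemaphoreSlim cap of 5 is an arbitrary value. It does not create one thread per upload; the uploads overlap while each one waits on I/O.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelUploadSketch
{
    // Hypothetical stand-in for Storage.Create from the question; it only simulates latency.
    public static async Task FakeStorageCreateAsync(Stream stream, string name)
    {
        await Task.Delay(50);
    }

    // Starts every upload without awaiting each one, then awaits them all together.
    // maxConcurrency caps how many uploads are in flight at the same time.
    // Returns the number of uploads that finished without throwing.
    public static async Task<int> UploadAllAsync(IEnumerable<Stream> files, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList forces the lazy Select to run, so all tasks are started here.
            List<Task<bool>> uploads = files.Select(async file =>
            {
                await gate.WaitAsync();
                try
                {
                    await FakeStorageCreateAsync(file, Guid.NewGuid().ToString("N"));
                    return true;
                }
                catch (Exception e)
                {
                    // Log and swallow so that one failure does not fault Task.WhenAll.
                    Console.Error.WriteLine(e.Message);
                    return false;
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            bool[] results = await Task.WhenAll(uploads);
            return results.Count(ok => ok);
        }
    }

    public static async Task Main()
    {
        List<Stream> files = Enumerable.Range(0, 20)
            .Select(i => (Stream)new MemoryStream(new byte[] { (byte)i }))
            .ToList();

        int count = await UploadAllAsync(files, maxConcurrency: 5);
        Console.WriteLine(count + " uploads completed");
    }
}
```

The single await on Task.WhenAll is what keeps the job alive until every upload has finished or logged its error, while the loop itself never blocks on an individual upload.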