I'll start by saying I'm a junior .NET developer who recently got into Task-based programming after realizing it could improve the performance of an app I'm developing. Our environment is mostly legacy code, and I need to reuse a number of its functions.
The app serializes a list of complex objects into one large block of text per object, sends that string via HTTP, performs calculations, writes the response to a T-SQL database and outputs a bunch of large data reports. Each object in the list produces exactly one of these big strings, and the objects do not depend on each other. Based on this behavior, I started to think about how I could improve the code using asynchronous programming.
I've been reading a lot over the past week, starting with the official Microsoft documentation, the breakfast example, async/await, the TAP documentation, Stephen Cleary's guidelines and other common articles by MVPs. After this research most of it became pretty clear to me: the keywords, threads and context, blocking, the state machine, async != parallel, task modelling and so on. Still, I have one big doubt: what is the correct way to deal with synchronous code (both I/O-bound and CPU-bound) inside async methods in C#?
Below is an example of the problem described above:
This is a button click handler in a UI context that must not block:
private async void btnConsultar_Click(object sender, EventArgs e) {
    var list = new List<ClassObject>(data);
    await Sincronizar(list, progress);
}
After the call, the UI context should not be blocked, and the tasks should start running concurrently:
public async Task Sincronizar(List<ClassObject> list, IProgress<int> progress) {
    var tasks = new List<Task>();
    foreach (ClassObject input in list) {
        // Here is the first sync function I implement,
        // which calculates the percentage of the progress.
        // It is a simple math function, not covering completion,
        // with low resource needs.
        // progCalc(progress, list.Count);
        tasks.Add(ProcessObject(input));
    }
    await Task.WhenAll(tasks);
}
So progCalc is a basic calculation on primitive types, which in my understanding is CPU-bound. How should I declare and call it?
Declare it void and call it as is:
private void progCalc(..) --> progCalc(..)
Declare it void and wrap the call in Task.Run:
private void progCalc(..) --> await Task.Run(() => progCalc(..))
Declare it to return a Task and await it:
private Task progCalc(..) --> await progCalc(..)
Or maybe some other approach entirely? As far as I know, I should never declare a synchronous method as async and simply wait for the result at the end.
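To make the three options concrete, here is a minimal sketch of how I picture them side by side. ProgCalc, its signature and its integer return value are placeholders I made up for illustration (the real progCalc takes an IProgress<int>), so treat this as an assumption about the shape of the code rather than the actual implementation.

```csharp
using System;
using System.Threading.Tasks;

public static class ProgCalcOptions
{
    // Hypothetical stand-in for progCalc: plain integer math, no I/O.
    public static int ProgCalc(int done, int total) =>
        total == 0 ? 0 : done * 100 / total;

    // Option 1: stay synchronous and call it directly on the current thread.
    public static int CallDirectly(int done, int total) => ProgCalc(done, total);

    // Option 2: keep it synchronous, but offload the call to the thread pool.
    public static Task<int> CallViaTaskRun(int done, int total) =>
        Task.Run(() => ProgCalc(done, total));

    // Option 3: expose a Task-returning signature even though the work is
    // synchronous; Task.FromResult wraps an already computed value.
    public static Task<int> ProgCalcAsTask(int done, int total) =>
        Task.FromResult(ProgCalc(done, total));

    public static async Task Main()
    {
        int a = CallDirectly(3, 12);
        int b = await CallViaTaskRun(3, 12);
        int c = await ProgCalcAsTask(3, 12);
        Console.WriteLine($"{a} {b} {c}"); // prints "25 25 25"
    }
}
```

All three produce the same value; the difference is only in which thread does the work and whether the caller gets a Task back. In option 3 the calculation has already finished by the time the Task is returned, so awaiting it does not make the work asynchronous.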
The same question applies to the next method, which runs concurrently for each object and combines legacy code with database CRUD, I/O-bound methods and large-scale PDF file manipulation.
public async Task ProcessObject(ClassObject obj) {
    // Serialize the object into a really big string; the formatter is synchronous
    string chunk = FormatStringLegacy(obj);
    // Perform an asynchronous HTTP request.
    // Gets a simple response with an integer status code and an even bigger string
    HttpResponseClass response = await ExternalHttpRequestAsync(chunk);
    // Create a different object by iterating over the text response; needs to be sync
    OtherComplexObject foo = CreateBar(response);
    // I/O-bound operations that do not depend on each other, but one is legacy
    Task dbTask = InsertIntoDatabaseAsync(foo);
    var file = CreateFileLegacy(foo, Extensions.PDF);
    Process.Start(file);
    await dbTask;
}
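One alternative I have been considering for the last step is to push the legacy file creation onto the thread pool with Task.Run, so that it overlaps with the database insert. The sketch below is only a rough illustration of that idea: InsertAsync and CreateFileLegacy here are hypothetical stand-ins that simulate the real methods with delays, and the string payload is made up.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class OverlapDemo
{
    // Hypothetical stand-in for InsertIntoDatabaseAsync: truly asynchronous I/O.
    static async Task<string> InsertAsync(string payload)
    {
        await Task.Delay(100);
        return "inserted:" + payload;
    }

    // Hypothetical stand-in for CreateFileLegacy: synchronous and blocking.
    static string CreateFileLegacy(string payload)
    {
        Thread.Sleep(100);
        return payload + ".pdf";
    }

    public static async Task<(string Db, string File)> ProcessAsync(string payload)
    {
        // The async insert starts right away and returns an in-flight task.
        Task<string> dbTask = InsertAsync(payload);
        // The blocking legacy call is offloaded to a thread-pool thread.
        Task<string> fileTask = Task.Run(() => CreateFileLegacy(payload));
        // Wait for both without blocking the calling thread.
        await Task.WhenAll(dbTask, fileTask);
        return (dbTask.Result, fileTask.Result);
    }

    public static async Task Main()
    {
        var result = await ProcessAsync("report");
        Console.WriteLine($"{result.Db} {result.File}");
    }
}
```

In ProcessObject as written above, CreateFileLegacy runs on whatever thread resumed the method after the HTTP await, which in a UI app would normally be the UI thread. Whether wrapping it in Task.Run is appropriate depends on whether the legacy code is safe to call from a thread-pool thread, which I have not verified.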
Part of me thinks the legacy functions should simply fire and forget, that is, execution would jump to the next line without waiting for the result, since they are declared as synchronous but run inside an async method. Could someone explain what actually happens with the synchronous methods under the hood? Thanks in advance.
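One way to check the fire-and-forget assumption is a small console experiment like the sketch below. LegacyWork is a hypothetical stand-in for any of the legacy functions, and the sketch assumes a plain console app with no UI synchronization context. It records the order in which things happen when a synchronous call sits at the top of an async method.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SyncInsideAsyncDemo
{
    public static readonly List<string> Log = new List<string>();

    static void Record(string entry)
    {
        lock (Log) Log.Add(entry);
    }

    // Hypothetical stand-in for a legacy synchronous function.
    static void LegacyWork()
    {
        Thread.Sleep(200); // simulates blocking CPU or I/O work
        Record("legacy done");
    }

    public static async Task ProcessAsync()
    {
        LegacyWork();           // runs on the caller's thread and blocks it
        Record("before await");
        await Task.Delay(50);   // first incomplete await: control returns to the caller here
        Record("after await");
    }

    public static async Task Main()
    {
        Task t = ProcessAsync(); // returns only once the first incomplete await is reached
        Record("caller resumed");
        await t;
        Console.WriteLine(string.Join(" -> ", Log));
        // prints "legacy done -> before await -> caller resumed -> after await"
    }
}
```

The caller only gets control back after LegacyWork has finished, so in this experiment the synchronous call is not fire-and-forget: an async method runs synchronously on the calling thread until it reaches its first await on an incomplete task.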