I'll start by saying I'm a junior .NET developer who recently got into Task-based programming after realizing it could improve the performance of an app I'm developing. Our environment is mostly legacy code, and I need to reuse a series of its functions.

The app serializes each item in a list of complex objects into a single large block of text, sends that string via HTTP, performs calculations, writes the response to a T-SQL database, and outputs a bunch of large data reports. Every object in the list produces exactly one of these big strings, and they do not depend on each other. Based on this behavior, I started thinking about how I could improve the code using asynchronous programming.

I've been reading a lot over the past week, starting with the official Microsoft documentation, the breakfast problem, async/await, the TAP documentation, Stephen Cleary's guidelines, and other common articles from MVPs. After the research, most of it became pretty clear to me: the keywords, threads and context, blocking, the state machine, async != parallel, task modelling, and other things. Still, I have one big doubt: what is the correct way to deal with synchronous code (both I/O-bound and CPU-bound) inside async methods in C#?

Below is an example of the problem described above.

This is a button click handler in a non-blocking UI context:

    private async void btnConsultar_Click(object sender, EventArgs e) {
        var list = new List<ClassObject>(data);
        await Sincronizar(list, progress);
    }

After the call, the UI context should not be blocked, and the tasks should start executing in parallel:

    public async Task Sincronizar(List<ClassObject> list, IProgress<int> progress) {
        var tasks = new List<Task>();
        foreach (ClassObject input in list) {
            // Here is the first sync function I implement,
            // which calculates the percentage of the progress.
            // It is a simple math function, not covering completion,
            // with low resource needs.
            // progCalc(progress, list.Count);
            tasks.Add(ProcessObject(input));
        }
        await Task.WhenAll(tasks);
    }

So, progCalc is a basic calculation method on primitive types, which in my understanding is CPU-bound. How should I declare it?

Declare void and leave as it is:

private void progCalc(..) --> progCalc(..)

Declare void and wrap in Task.Run:

private void progCalc(..) --> await Task.Run(() => progCalc(..))

Declare directly as a Task and await:

private Task progCalc(..) --> await progCalc(..)

Or maybe some other approach? As far as I know, I should never declare a synchronous method as async and just wait for the result at the end.

The same question applies to the next method, which runs in parallel and combines legacy code with database CRUD, I/O-bound methods, and large-scale PDF file manipulation.

    public async Task ProcessObject(ClassObject obj) {

        // Serialize the object into a really big string; the formatter is synchronous
        string chunk = FormatStringLegacy(obj);

        // Perform asynchronous HTTP request
        // Gets a simple response with integer status code and even bigger string
        HttpResponseClass response = await ExternalHttpRequestAsync(chunk);

        // Create a different object by iterating the text response; needs to be sync
        OtherComplexObject foo = CreateBar(response);

        // I/O-bound operations that do not depend on each other, but one is legacy
        Task dbTask = InsertIntoDatabaseAsync(foo);
        var file = CreateFileLegacy(foo, Extensions.PDF);

        Process.Start(file);
        await dbTask;

    }

There is a thought in my mind telling me that the legacy functions should just fire and forget, not waiting for the result and just jumping to the next line, since they're declared as synchronous but run inside an async block. I would like someone to explain to me what is happening with the synchronous methods under the hood. Thanks in advance.

All `async` methods contain synchronous code; for instance, `a++;` is a synchronous statement. Nothing wrong with that. If the task is minor, then just create a synchronous method. – Poul Bak Jun 14 '23 at 15:16

1 Answer

For "simple" synchronous methods, which can complete in a reasonable amount of time, you're typically better off just calling them inline. There's no benefit to making them return a Task. But I'd question whether they should truly be void. Typically if you're making a calculation the function should be functional: taking its inputs as arguments and returning its outputs as return values, rather than having side-effects.
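As a minimal sketch of that idea, the progress calculation could be a pure function that the caller reports through `IProgress<int>`. The names `ProgressMath` and `PercentComplete` are hypothetical, not part of the question's code, and the sketch assumes the caller tracks how many items have completed.

```csharp
using System;

public static class ProgressMath
{
    // Hypothetical replacement for progCalc: takes its inputs as arguments
    // and returns the percentage, with no side effects.
    public static int PercentComplete(int completed, int total)
    {
        if (total <= 0) return 0; // avoid division by zero for an empty list
        return (int)(100L * completed / total);
    }
}
```

The caller would then do the side effect itself, for example `progress.Report(ProgressMath.PercentComplete(done, list.Count));`, called inline with no `Task` involved.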

For more time-consuming synchronous methods, I'd still leave the method signature itself synchronous, but if you have a need to respond to the user quickly (e.g. you're on a UI thread) you can use Task.Run() to run those on a separate thread.
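A rough sketch of that pattern: the legacy method keeps its synchronous signature, and the calling code decides to offload it. `FormatSlowly` here is a made-up stand-in for something like `FormatStringLegacy`, with `Thread.Sleep` simulating the slow work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LegacyWrapper
{
    // Hypothetical stand-in for a slow synchronous legacy call.
    public static string FormatSlowly(int id)
    {
        Thread.Sleep(50); // simulates blocking or CPU-heavy work
        return $"chunk-{id}";
    }

    // The caller wraps the synchronous call in Task.Run so it executes on a
    // thread-pool thread; the calling (e.g. UI) thread is free while it awaits.
    public static async Task<string> FormatOffCallerThreadAsync(int id)
    {
        return await Task.Run(() => FormatSlowly(id));
    }
}
```

In the question's `ProcessObject`, the equivalent would be something like `string chunk = await Task.Run(() => FormatStringLegacy(obj));`. Without `Task.Run`, the synchronous call simply runs on whatever thread is executing the async method at that point and blocks it until it returns.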

I'd also recommend double-checking whether the time-consuming synchronous methods are really synchronous. For example, there are ways to asynchronously await a Started Process. And most I/O-bound operations now have an Async version, even if your legacy code may be calling the synchronous version of those methods.

However, as a general rule, anything asynchronous should be awaited at some level: true fire-and-forget is dangerous, especially when errors might occur. If you don't need the UI or Requester to wait for that task to complete, you can move it to some kind of background queue that just makes sure all the tasks get awaited and any errors thrown are logged.
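One possible shape for such a background queue, as a sketch only: `BackgroundTaskTracker` and its members are hypothetical names, and the logging callback is an assumption standing in for whatever logger the app uses.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class BackgroundTaskTracker
{
    private readonly ConcurrentDictionary<Task, byte> _pending =
        new ConcurrentDictionary<Task, byte>();
    private readonly Action<Exception> _logError;

    public BackgroundTaskTracker(Action<Exception> logError) => _logError = logError;

    // Starts the work without making the caller wait, but keeps a reference
    // to the task so it is still observed and its errors are logged.
    public void Track(Func<Task> work)
    {
        Task task = RunAndLogAsync(work);
        _pending.TryAdd(task, 0);
        task.ContinueWith(t => { _pending.TryRemove(t, out _); }, TaskScheduler.Default);
    }

    private async Task RunAndLogAsync(Func<Task> work)
    {
        try { await work(); }
        catch (Exception ex) { _logError(ex); }
    }

    // Await this at shutdown (or another safe point) so no work is abandoned.
    public Task WhenAllCompleteAsync() => Task.WhenAll(_pending.Keys);
}
```

The difference from plain fire-and-forget is that every task is awaited somewhere (inside `RunAndLogAsync`), so exceptions end up in the log instead of going unobserved.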

StriplingWarrior

Thank you for the answer. Just to reply: I think the only reason our database I/O operations aren't using the async versions yet is that they are implemented inside a legacy type library whose interfaces are still used in old frameworks and have lots of internal validation. I'm pretty sure there are colleagues working on detaching some parts of it and using the newer technologies. – joaocarlosib Jun 14 '23 at 16:44