I have a Windows C# BackgroundService process that runs and executes some method. I'd like to run this method utilizing multi-threading on the system to process as much as possible. The DoWork method contains a DB transaction and inside of that it performs some file I/O based on that transaction. I don't need the result of the tasks and only need to run them as quickly as possible.

Currently, I have it running with something like the example below. This works, and I do see a performance improvement; however, this doesn't seem like the right way of doing it. I've also seen the ability to use Task.Run and then Task.WaitAll, but I saw similar performance to Parallel.For. What I'd like to do is constantly utilize the available threads to perform DoWork. Am I doing this correctly, or is there a better way to do this?

Edit: Right now, everything is running synchronously except the initial call to DoWork.

protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    _logger.LogInformation(
        "Consume Scoped Service Hosted Service running.");

    Parallel.For(0, 8, async i => {
        await DoWork(stoppingToken);
    });

    await Task.Delay(Timeout.Infinite, stoppingToken);
}

private async Task DoWork(CancellationToken stoppingToken)
{
    _logger.LogInformation(
        "Consume Scoped Service Hosted Service is working.");

    using (var transactionScope = new TransactionScope())
    {
        try
        {
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                connection.Open();
                // get the next available row with the data needed
                // to perform the file I/O and build the request
                var query = "SELECT TOP(1) * FROM MyTable" +
                    " WITH (readpast, rowlock, updlock)";
                var row = connection.QueryFirstOrDefault<MyType>(query);

                // based on the row, perform the file I/O
                var file = File.ReadAllText(row.FileName);
                var data = JsonSerializer.Deserialize<MyData>(file);

                var item = new MyObject() { Id = data.Id, Name = data.Name };

                // call a WCF web service using the item object; the web service
                // has no asynchronous methods
                service.SendRequest(item);

                // mark the row as processed, then commit the transaction
                var updateQuery = "UPDATE MyTable SET Status = @Status WHERE Id = @Id";
                connection.Execute(updateQuery, new { Status = "Processed", Id = row.Id });
                transactionScope.Complete();
            }
        }
        catch (Exception ex)
        {
            // Log error; disposing the scope without calling
            // Complete() rolls the transaction back
        }
    }
}
  • https://stackoverflow.com/questions/11564506/nesting-await-in-parallel-foreach (the only thing that can be suggested, as you decided not to show the code that actually matters - async DB and file I/O - make sure to re-read the [MRE] guidance if you decide to include that info in the post). – Alexei Levenkov Mar 08 '23 at 01:03
    `Parallel.For` is not Task-aware, JIC. – Guru Stron Mar 08 '23 at 01:07
  • @AlexeiLevenkov I've updated the question with some more information giving some context on how it works, it currently runs all db and file i/o synchronously – Tenza Mar 08 '23 at 01:35
  • If you have a db table containing work to do, you'll need to coordinate the work so the different threads don't all try to grab the same item. There's also no looping going on, so each thread will process one item and then exit. – Stephen Cleary Mar 08 '23 at 02:30
  • @StephenCleary It already doesn't grab the same item, see the readpast, rowlock, and updlock hints. The functionality works as I want it to, I want to confirm whether or not there is a better way of running multiple of the task in parallel. Basically, I'd like multiple threads or tasks running to perform the work so I can process more files simultaneously. In terms of the looping, should one thread be processing multiple items? – Tenza Mar 08 '23 at 02:49
  • @Tenza sorry, wrong suggestions by me - since it is clear you have no asynchronous code it is perfectly fine to use your `DoWork` inside `Parallel.ForEach`. – Alexei Levenkov Mar 08 '23 at 03:23
  • @AlexeiLevenkov I'm currently utilizing `Parallel.For`, since I don't have a collection to iterate over using `Parallel.ForEach`. I'm guessing this is probably as best as I can get in terms of parallel processing performance? I noticed that both methods can have N number of iterations it executes in parallel, but depending on how many I run, the performance can be degraded. – Tenza Mar 08 '23 at 03:37
  • @AlexeiLevenkov It seems I have to add the `ParallelOptions.MaxDegreeOfParallelism` property which can be used to set the maximum number of tasks, specifically set to the processor count so that it doesn't inject too many threads. – Tenza Mar 08 '23 at 03:45
  • What .NET platform are you targeting? .NET 7? – Theodor Zoulias Mar 08 '23 at 10:48
  • @TheodorZoulias Yes .NET 7 – Tenza Mar 08 '23 at 14:11

1 Answer

Since you are targeting .NET 7, you can use the Parallel.ForEachAsync API, which is available since .NET 6:

ParallelOptions options = new()
{
    MaxDegreeOfParallelism = 2,
    CancellationToken = stoppingToken,
};

await Parallel.ForEachAsync(Enumerable.Range(0, 8), options, async (_, ct) =>
{
    await DoWork(ct);
});

You can experiment with the MaxDegreeOfParallelism option until you find the optimal value for the task at hand. Start with a small value, and increase it gradually until the performance stops improving.
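Building on Stephen Cleary's comment above that each worker currently processes one item and then exits, here is a minimal sketch of workers that keep looping until the work runs out or the token is cancelled. The ConcurrentQueue is a stand-in for claiming rows from MyTable, and all names and counts are illustrative, not part of the question's code:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class WorkerLoopSketch
{
    // Stand-in for one unit of DoWork (claim a row, do I/O, update).
    static async Task<bool> TryProcessOneAsync(
        ConcurrentQueue<int> pending, CancellationToken ct)
    {
        if (!pending.TryDequeue(out _))
            return false; // nothing left to claim
        await Task.Delay(1, ct); // simulated work
        return true;
    }

    // Each worker loops until cancellation or until no work remains,
    // so the service keeps draining the table instead of doing one pass.
    public static async Task<int> RunAsync(
        int itemCount, int workers, CancellationToken token)
    {
        var pending = new ConcurrentQueue<int>(Enumerable.Range(1, itemCount));
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = workers,
            CancellationToken = token,
        };

        int processed = 0;
        await Parallel.ForEachAsync(Enumerable.Range(0, workers), options,
            async (_, ct) =>
        {
            while (!ct.IsCancellationRequested
                && await TryProcessOneAsync(pending, ct))
            {
                Interlocked.Increment(ref processed);
            }
        });
        return processed;
    }

    static async Task Main()
    {
        var n = await RunAsync(20, 4, CancellationToken.None);
        Console.WriteLine(n); // 20: all simulated items processed
    }
}
```

In a real service you would likely add a short delay and retry when the table is empty, rather than letting the workers exit.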

Theodor Zoulias
  • Is there any way to improve performance in this kind of scenario beyond using `Parallel.For` for parallel processing? Since it's all synchronous and I do have a network request, it seems there's only so much benefit from processing in parallel. – Tenza Mar 09 '23 at 05:36
  • @Tenza it might be possible to optimize the database-related work. Either by doing more work with less queries (batching, bulk operations), or by improving the query execution plan (adding indexes etc). But this is beyond the scope of this question. – Theodor Zoulias Mar 09 '23 at 05:42
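The batching idea from the last comment could look something like the sketch below. The SQL strings are assumptions modeled on the question's table, the Dapper and transaction plumbing is omitted, and the helper only quantifies the saving: the fixed per-transaction cost is paid once per round trip instead of once per row:

```csharp
using System;

static class BatchingSketch
{
    // Claim several rows per transaction instead of one. SQL Server
    // requires parentheses around a parameterized TOP. Table and
    // column names mirror the question and are assumptions.
    public const string ClaimQuery =
        "SELECT TOP (@BatchSize) * FROM MyTable WITH (readpast, rowlock, updlock)";

    // One UPDATE for the whole batch; Dapper expands @Ids from an
    // IEnumerable<int> parameter into an IN list.
    public const string UpdateQuery =
        "UPDATE MyTable SET Status = @Status WHERE Id IN @Ids";

    // Round trips needed to drain N rows with a given batch size:
    // the fixed per-transaction overhead is paid this many times.
    public static int RoundTrips(int items, int batchSize) =>
        (items + batchSize - 1) / batchSize;

    static void Main()
    {
        // 100 rows one at a time vs. in batches of 25
        Console.WriteLine(RoundTrips(100, 1));  // 100
        Console.WriteLine(RoundTrips(100, 25)); // 4
    }
}
```

Whether batching helps depends on how much of the per-item time is fixed overhead (transaction setup, connection, WCF call) versus actual row work, so it is worth measuring before committing to it.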