1

I've been struggling for 2 days now to find an appropriate solution for my simple scenario.

Scenario I want to accomplish
Doing multiple database access in a .Net Core Parallel loop that will insert multiple items in the same database.

var actions = new List<Action>();

actions.Add(() => { new DbContext.Set<TEntity>().Add(entity); });
actions.Add(() => { new DbContext.Set<TEntity>().Add(entity); });
actions.Add(() => { new DbContext.Set<TEntity>().Add(entity); });
actions.Add(() => { new DbContext.Set<TEntity>().Add(entity); });

Parallel.ForEach(actions, new ParallelOptions { MaxDegreeOfParallelism = 2 },
action =>
{
     action();
});

Known limitations

  1. EF Core's DBContext are not thread safe (we need to recreate them in each trhead)
  2. MySQL server does not accept 2 connections made withing the same transaction
  3. It seems we cannot pass an already opened connection to the DBContext
  4. EF Core opens and closes a connection to the DB internally each time we interact with it

Exception

Multiple simultaneous connections or connections with different connection strings inside the same transaction are not currently supported.

at MySql.Data.MySqlClient.MySqlConnection.Open() at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenDbConnection(Boolean errorsExpected) at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.Open(Boolean errorsExpected) at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable1.Enumerator.BufferlessMoveNext(DbContext _, Boolean buffer) at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable1.Enumerator.MoveNext() at Microsoft.EntityFrameworkCore.Query.Internal.LinqOperatorProvider._TrackEntities[TOut,TIn](IEnumerable1 results, QueryContext queryContext, IList1 entityTrackingInfos, IList1 entityAccessors)+MoveNext() at Microsoft.EntityFrameworkCore.Query.Internal.LinqOperatorProvider.ExceptionInterceptor1.EnumeratorExceptionInterceptor.MoveNext() at System.Collections.Generic.List1.AddEnumerable(IEnumerable1 enumerable) at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source) at CloseTheMonth.Backend.Data.Repositories.AccountUserRightRepository.ListAccounts(Guid userId) in C:\Work\GitHub\CloseTheMonth\Backend\CloseTheMonth.Data\Repositories\AccountUserRightRepository.cs:line 44 at CloseTheMonth.Backend.Services.AccountUserRightService.ListAccounts(Guid userId) in C:\Work\GitHub\CloseTheMonth\Backend\CloseTheMonth.Services\AccountUserRightService.cs:line 53 at CloseTheMonth.Backend.Controllers.AppController.Init(String authorization, AppInitRequest request) in C:\Work\GitHub\CloseTheMonth\Backend\CloseTheMonth.Backend\Controllers\AppController.cs:line 101

Reflections...
If I could open a connection globally, like this, and pass it to my DBContexts, that would do the trick. But I checked the EF Core and Pomelo source code and I didn't find a way to achieve something like this.

Maybe some other EF Core MySQL drivers beside Pomelo can do this ?

var actions = new List<Action>();

using (var conn = new MySqlConnection())
{
   actions.Add(() => { new DbContext(conn).Set<TEntity>().Add(entity); });
   actions.Add(() => { new DbContext(conn).Set<TEntity>().Add(entity); });
   actions.Add(() => { new DbContext(conn).Set<TEntity>().Add(entity); });
   actions.Add(() => { new DbContext(conn).Set<TEntity>().Add(entity); });

   Parallel.ForEach(actions, new ParallelOptions { MaxDegreeOfParallelism = 2 },
   action =>
   {
        action();
   });
}

I'm using MySQL server (8.0.22) with Pomelo.EntityFrameworkCore.MySql (2.1.4)

atiyar
  • 7,762
  • 6
  • 34
  • 75
  • Can't you add the itens to a ConcurrentBag and insert them all at once after the Parallel loop? – Magnetron Mar 01 '21 at 17:24
  • 2
    You don't need multithreading in data access to begin with. You can't perform multiple actions on the same connection concurrently so it's pointless to use multiple DbContexts. You need multiple *connections*. Which *won't* make slow or bad queries run faster anyway. – Panagiotis Kanavos Mar 01 '21 at 17:27
  • What are you trying to do anyway? What you wrote makes little sense. `Add `doesn't make any changes in the database. All changes are cached until `SaveChanges` is called, which saves all changes in a single transaction. Insertions aren't slow, so if you have issues with just 4 items, something is wrong with the database - perhaps a missing index? Or triggers/check constraints that have to scan the entire table each time? – Panagiotis Kanavos Mar 01 '21 at 17:29
  • Does this answer your question? [C# Data Connections Best Practice?](https://stackoverflow.com/questions/17552829/c-sharp-data-connections-best-practice) – Charlieface Mar 01 '21 at 17:29
  • What are you trying to do anyway? What you wrote makes little sense. `Add `doesn't make any changes in the database. All changes are cached until `SaveChanges` is called, which saves all changes in a single transaction. Insertions aren't slow, so if you have issues with just 4 items, something is wrong with the database - perhaps a missing index? Or triggers/check constraints that have to scan the entire table each time? – Panagiotis Kanavos Mar 01 '21 at 17:29
  • `If I could open a connection globally,` that would be a major bug that would drastically reduce performance, as all locks taken by any query would persist until the application terminated. – Panagiotis Kanavos Mar 01 '21 at 17:31
  • If you have a lot of data to insert, don't use ORMs like EF Core at all. Use MySqlBulkCopy to load the data as fast as possible into a staging table, then upsert the target table – Panagiotis Kanavos Mar 01 '21 at 17:32
  • @PanagiotisKanavos, Locks are held until a transaction commits, not until the connection terminates. – Bill Karwin Mar 01 '21 at 17:32
  • @BillKarwin unless there's no explicit transaction, in which case they're held until the connection closes. That's why connections have to be short-lived. What the OP is trying to do can only harm scalability – Panagiotis Kanavos Mar 01 '21 at 17:38
  • Ok, so I missed that very important point basically which is that I cannot make multiple concurrent actions within the same transaction. Got it. For the example, it was just an example, I don't have performance issue with 4 inserts, but I have inserting thousands of rows. In this particular scenario I can rewrite it to avoid EF Core to speed up the process (it's only in a couple of places in our app where we generate large schedules). – Samuel Rousseau-Boissonneault Mar 01 '21 at 17:54
  • BTW, if you would really want to use a global connection for multiple `DbContext` objects, you could. The `UseMySql()` extension method has an overload that accepts a `DbConnection` object. If you open the connection by yourself, then it will not be closed by EF Core when it disposes the context. – lauxjpn Mar 02 '21 at 00:33

2 Answers2

2

You must use a separate MySQL connection in each thread, if the code is not thread-safe.

The MySQL protocol is stateful, so if part of a query-response cycle is interleaved with a different query-response cycle for a different query, the responses get all confused. You won't like the result.

The only sensible way to design multi-threaded code that uses a database is to have each thread open its own connection.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
0

Everything is clear now, let me resume that simple :

  1. Multithreading write operations with EF Core and MySQL are not possible
  2. Multithreading read operations with EF Core and MySQL are possible

First, write operations

  • DBContext are not thread safe, you need to have one context per thread
  • Each time you Context.SaveChanges, it opens and closes a connection
  • MySQL refuses to have multiple opened connections inside a transaction
  • You cannot execute multithreaded actions on the same DB connection

Since you can't multithread inserts/updates/deletes, you can certainly optimize it a bit by avoiding the call to SaveChanges every time you touch an entity and wait until just before you commit the transaction.

First, it will allow EF to stay in memory as long as you don't SaveChanges.

Second, if any optimization has to be done by EF Core to like, Bulk Insert or any thing that could possibly be done, it will be able to do it since you keep all the DB job for the end (so EF knows the quantity of job to be done).

Second, read operations

  • DBContext are not thread safe, you need to have one context per thread
  • You need to create a scope under each thread
  • Dependency Injection will create a new UnitOfWork for each threaded scope (since you defined it as Scoped in your Startup class)
  • You execute all your threads simultaneously and EF Core will handle the multiple connections

For the read operations, since I want the code to look clean, here's what I've done :

public class Multithreader : IDisposable
{
    private List<Action> _actions = new List<Action>();

    public Multithreader(int maxThreads)
    {
        this._maxThreads = maxThreads;
    }

    public void Enqueue(Action action)
    {
        this._actions.Add(action);
    }

    public void Dispose()
    {
        Parallel.ForEach(this._actions, new ParallelOptions { MaxDegreeOfParallelism = 8 },
        action =>
        {
            action();
        });
    }
}

I also created a helper function in my BaseController to get a Scoped Services class (which has the references to my Services) :

public class BaseController : ControllerBase
{
    private readonly IServiceProvider _serviceProvider;

    public BaseController(IServiceProvider serviceProvider)
    {
        this._serviceProvider = serviceProvider;
    }

    protected IServices GetScopedServices()
    {
        var scope = _serviceProvider.CreateScope();

        return scope.ServiceProvider.GetService<IServices>();
    }
}

Then I just Enqueue everything very neetly to get what I want :

using (var threader = new Multithreader())
{
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value1 = services.Accounts.GetValue1(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value2 = services.Accounts.GetValue2(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value3 = services.Accounts.GetValue3(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value4 = services.Accounts.GetValue4(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value5 = services.Accounts.GetValue5(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value6 = services.Accounts.GetValue6(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value7 = services.Accounts.GetValue7(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value8 = services.Accounts.GetValue8(); } });
    threader.Enqueue(() => { using (var services = this.GetScopedServices()) { entity.Value9 = services.Accounts.GetValue9(); } });
}
  • Couple of notes here: Officially, multithreaded access to the `DbContext` is not supported at all. That being said, so far, multiple read operations usually work anyway. Calling `DbContext.SaveChanges` does only open a connection, if it hasn't been opened yet. It does never close a connection. Connections are only closed when the context gets disposed (and only if EF Core was the one that opened it). Having a dedicated `DbContext` per thread is definitely the simplest way to achieve multiple concurrent write operations (if you really need them to be concurrent, which most people don't). – lauxjpn Mar 02 '21 at 00:39