0

I have a .NET Core C# console application that performs a large number of calculations and then writes the results to a SQL Server 2016 Developer edition database using Dapper (and Dapper.Contrib). The issue I'm having is that when I run a lot of items in parallel (greater than 1000, for example), I start getting intermittent connection failures on the .Open() call, saying

A network-related or instance-specific error occurred...

This often happens after several thousand rows have already been inserted successfully.

A simplified version of the code would look like the following:

        Parallel.ForEach(collection, (item) =>
        {
                var results = item.Calculate(parameters);

                dal.Results.Insert(results);

                allResults.AddRange(results);
        });

And inside the Insert method, it looks like this:

    public override void Insert(IEnumerable<Result> entities)
    {
        using (var connection = GetConnection())
        {
            connection.Open();

            using (var transaction = connection.BeginTransaction(IsolationLevel.ReadCommitted))
            {
                connection.Insert(entities, transaction);

                transaction.Commit();
            }
        }
    }

Some other things about the code that I don't think are affecting this but might be relevant:

  • dal.Results is simply a repository that contains that Insert() method and is preinitialized with a connection string that is used to instantiate a new SqlConnection(connectionString) every time GetConnection() is called.

  • allResults is a ConcurrentBag<Result> that I'm using to store all the results for later use outside the Parallel.ForEach

  • I'm using a transaction because it seems to perform better this way, but I'm open to suggestions if that could be causing problems.

Thanks in advance for any guidance on this issue!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Slope
  • 3
  • 3
  • Shouldn't you be creating one connection and reusing it ? saying that a transaction performs better doesn't sound right to me. I would suggest doing the calculation alone (In Parallel maybe) and then inserting all the records using Bulk-Insert. – Zein Makki Jan 21 '18 at 15:29
  • Spinning up threads for something that is essentially I/O-bound is a waste of a thread –  Jan 21 '18 at 15:29
  • @user3185569 Everything I've read says to create a new SqlConnection on every call to the DB and that .NET will handle connection pooling, etc. Is this not correct? – Slope Jan 21 '18 at 15:30
  • @Slope This is intended to not keep a single connection object throughout the life of your application or to hold to it when you are not using it. It doesn't mean that if you have 2 queries one after the other, then you create two connections. – Zein Makki Jan 21 '18 at 15:31
  • @MickyD There is a lot of other processing going on before this happens. The insert into the DB is the last thing it does, it just happens to be where I'm having issues, so I left the details of `.Calculate()` out since that is not relevant. Thanks for the comment, though. – Slope Jan 21 '18 at 15:32
  • 1
    @Slope Try moving the Db insertion of all the records outside the Parallel. I'm almost sure it is better and would solve your problem. – Zein Makki Jan 21 '18 at 15:33
  • @user3185569 Interesting, so if I create the connection and share it on all the threads, would that cause any issues? I can certainly try that out. I really appreciate you taking the time to comment. – Slope Jan 21 '18 at 15:34
  • @user3185569 I originally had it that way, but the insert of tens of thousands of rows was fairly slow and delaying the rest of the application from continuing. Would there be a way to kick that off asynchronously or something so the application can continue? Getting a bit out of scope of the original question but sounds like this might be a better solution. – Slope Jan 21 '18 at 15:36
  • Maybe the many parallel connections crash the SQL-Server, so moving the insert outside the Parallel could indeed solve your issue. – Nikolaus Jan 21 '18 at 15:36
  • 1
    @Slope So your problem is *How to bulk insert huge amount of records into SQL server table using C#* .. I don't think the solution for that is using Parallel. – Zein Makki Jan 21 '18 at 15:37
  • @user3185569 That is a fair point. ;) – Slope Jan 21 '18 at 15:38
  • @Slope A starting point: https://stackoverflow.com/questions/13722014/insert-2-million-rows-into-sql-server-quickly One last point, it is not C# that is slow, it is the database (Not your database, but databases in general). That's why using more threads doesn't solve your problem :) – Zein Makki Jan 21 '18 at 15:40
  • @user3185569 Thank you very much for your help. I was so stuck trying to solve THIS particular problem, I didn't stop to rethink my original assumption that this was even the right approach in the first place. – Slope Jan 21 '18 at 15:42
  • Do you try to use one connection lifetime for that many requests? I don't think re-create many connection to database is a best practice. Also the transaction is way to expensive to call for me for a micro transaction. – temmyraharjo Jan 21 '18 at 15:43
  • You might want to check out TPL DataFlow. There you can leave DB inserts to the last block or if you want to stick with Parallel.ForEach, then move DB inserts to be outside the block. Otherwise the benefits of parallelism are muted –  Jan 21 '18 at 16:54
  • Also, ditch the transaction. –  Jan 21 '18 at 16:55
  • Too many connection opened at the same time.... Try to send 100 request each time to see, I mean make 2 loops , the first level of loops is synchronous and will fire 100 requests in parallel, add thread.sleep(100) in each synchronous loop. – Anthony Giretti Jan 22 '18 at 02:14

1 Answers1

1

There is no advantage to execute heavily IO-bound db-operations in parallel.

You should create fever but bigger bunches of data to be inserted with minimun amount of database transactions. That can be achieve with several ways:

So try following: Execute CPU-intensive calculations in parallel loop and save allResults into database after loop.

Risto M
  • 2,919
  • 1
  • 14
  • 27