16

Edit: This boils down to why does changing just SqlConnection.Open() to await SqlConnection.OpenAsync() within asynchronous code result in strongly different behavior.

What's the difference between a SqlConnection.Open call in a synchronous code and an await SqlConnection.OpenAsync call in an asynchronous code aside from the obvious asynchronous behavior? Is the underlying connection made asynchronous with the database?

The documentation on OpenAsync is lite, https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.openasync%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396.

An asynchronous version of Open, which opens a database connection with the settings specified by the ConnectionString. This method invokes the virtual method OpenAsync with CancellationToken.None.(Inherited from DbConnection.)

I find it interesting that previously the connection string required async=true within it, while in .net 4.5+ it's no longer required. Do the connections behave differently?

https://msdn.microsoft.com/en-us/library/hh211418(v=vs.110).aspx

Beginning in the .NET Framework 4.5, these methods no longer require Asynchronous Processing=true in the connection string.

When I happen to use the synchronous SqlConnection.Open within an asynchronous application and load it heavily I find that it performs very poorly, running the connection pool dry early. I expected opening the connection to be blocking, however, executing asynchronous commands (through dapper) on those connections behaves differently. So, what is OpenAsync doing differently?

EDIT:

As requested code to reproduce the issue (or perhaps demonstrate a difference). Running this case with Open() connection timeouts are encountered at around 180 concurrent async commands executing, with OpenAsync() no exceptions are encountered even at over 300 concurrent commands. You can push the concurrency to eventually get it to timeout, but it's definitely doing it much deeper into the concurrent commands.

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Dapper;
using Nito.AsyncEx;

namespace AsyncSqlConnectionTest
{
    class Program
    {
        public static int concurrent_counter = 0;
        public static int total_counter = 0;

        static void Main(string[] args)
        {


            var listToConsume = Enumerable.Range(1, 10000).ToList();
            Parallel.ForEach(listToConsume,
                new ParallelOptions { },
                value =>
                {
                    try
                    {

                        Task.Run(() => AsyncContext.Run(async () =>
                        {
                            using (var conn = new SqlConnection("Data Source=.; Database=master; Trusted_Connection=True;"))
                            {
                                Interlocked.Increment(ref concurrent_counter);
                                Interlocked.Increment(ref total_counter);
                                await conn.OpenAsync();
                                var result = await conn.QueryAsync("select * from master..spt_values; waitfor delay '00:00:05'");
                                Console.WriteLine($"#{total_counter}, concurrent: {concurrent_counter}");
                                Interlocked.Decrement(ref concurrent_counter);
                            }
                        })).GetAwaiter().GetResult();
                    }
                    catch (Exception e)
                    {
                        Console.Write(e.ToString());
                    }
                });
            Console.ReadLine();
        }
    }
}

EDIT 2:

Here's a test which finds the same differences using nothing but ADO.NET. It's worth noting that Dapper executes much faster, but that's not the point here. Again OpenAsync will eventually get a timeout, but much 'later' and never if the max degree of parallelism is 100 (below the connection pool size).

using System;
using System.Data.SqlClient;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

namespace AsyncSqlConnectionTest
{
    class Program
    {
        public static int concurrent_counter = 0;
        public static int total_counter = 0;

        static void Main(string[] args)
        {
            var listToConsume = Enumerable.Range(1, 10000).ToList();
            Parallel.ForEach(listToConsume,
                new ParallelOptions { },
                value =>
                {
                    try
                    {

                        Task.Run(async () =>
                        {
                            using (var conn = new SqlConnection("Data Source=.; Database=master; Trusted_Connection=True;"))
                            {
                                Interlocked.Increment(ref concurrent_counter);
                                Interlocked.Increment(ref total_counter);

                                // this (no errors)
                                await conn.OpenAsync();

                                // vs. this (timeouts)
                                //conn.Open();

                                var cmd = new SqlCommand("select * from master..spt_values; waitfor delay '00:00:05'", conn);
                                using (var reader = await cmd.ExecuteReaderAsync())
                                {
                                    while (await reader.ReadAsync()) { }
                                }
                                Console.WriteLine($"#{total_counter}, concurrent: {concurrent_counter}");
                                Interlocked.Decrement(ref concurrent_counter);
                            }
                        }).GetAwaiter().GetResult();
                    }
                    catch (Exception e)
                    {
                        Console.Write(e.ToString());
                    }
                });
            Console.ReadLine();
        }
    }
}
ccook
  • 5,869
  • 6
  • 56
  • 81
  • 6
    Starting with .NET 4.5 internally it's always async IO. The sync version just blocks... – usr Nov 17 '16 at 18:52
  • So, is the unmanaged sql connection fundamentally different? edit - i realize you are saying it's the same. – ccook Nov 17 '16 at 18:53
  • 1
    `running the connection pool dry early` it should not behave differently. Whether the thread blocks on IO or on an event makes little difference. `executing asynchronous commands ... on those connections behaves differently` This might be the case, I don't know. `Asynchronous Processing=true` must have had some effect in previous .NET versions. I think the wrong question is to look at Open(Async) and the right one is to look at what `Asynchronous Processing=true` did. But according to this it should at most have had a small CPU cost impact: http://stackoverflow.com/a/7852617/122718 – usr Nov 17 '16 at 18:57
  • I agree, I didnt expect any difference. Asking it a different way, would sql server and the thread pool handling the two connections differently? – ccook Nov 17 '16 at 18:58
  • 1
    I believe not but you should test it in a micro benchmark. Spin up some workload, pause the debugger a few times and look at the Parallel Stacks window. It is very enlightening to watch internal call stacks of libraries to understand what code has impact on throughput. Starting with .NET 4.5 you should see all threads blocked on a task/event for sync APIs and almost no threads active in the async case. – usr Nov 17 '16 at 19:00
  • 2
    Is there a concrete issue you are having in your application? If not that's fine but if yes that might be easier to answer. – usr Nov 17 '16 at 19:01
  • Exactly what I did actually (the micro benchmark). I found that I ran out of (timeout on opening) available connections in the application pool (default size of 100) when I had about 35 concurrent async commands executing on connections executing on connections opened with Open. I would have expected it to be at 100. Changing it to use OpenAsync, nothing else, and it would maintain about 300 concurrent commands before failing to open connections. It seems like the connections themselves are async with OpenAsync, not just the opening of them. – ccook Nov 17 '16 at 19:02
  • Concrete issue, no, the issue is that we are having failures opening connections in production when under load which lead to the local benchmark and load testing. Note when I'm comparing the two mentioned above it's concurrent and asynchronous. – ccook Nov 17 '16 at 19:05
  • 1
    What you describe is *very* unexpected. The conn pool has little magic. Very straight forward. Likely, the benchmark was flawed. Care to post code? I have seen a lot of different issues over the years. Could be many things. – usr Nov 17 '16 at 19:06
  • I agree, and that would be good. To share I'll need to set it up with just the nuget packages and against something like northwind - which should be just find as the test was already very minimal but dubiously proprietary. Might need a couple of hours though to sneak that in. – ccook Nov 17 '16 at 19:09
  • 1
    You could just execute `waitfor delay 1 second` or `select * from master..spt_values`. A repro of this should fit into 10-30 lines if freshly written. My guess: You won't be able to repro and discover your bug in the process. – usr Nov 17 '16 at 19:11
  • Sure, give me a few minutes then :) – ccook Nov 17 '16 at 19:12
  • @usr, reproduced and included in an edit – ccook Nov 17 '16 at 19:37
  • 2
    The case interests me now. I do not see a good reason there should be any difference. An idea: If Open blocks internally and the thread pool is totally overloaded (here it clearly is) then blocking might take a long time because completing the task being blocked on might require TP resources for IO completion processing. Set the TP to 5000 min/max threads and set MaxDOP for Parallel to 500. The difference should disappear. – usr Nov 17 '16 at 19:51
  • @usr, i will give that a shot. It's worth noting though, when using OpenAsync instead of Open in our applications ORM, our load tests go from meager to being bottlenecked by the SQL server's available resources. So, whatever it is, it affects ASP.NET MVC as well, not just console applications. – ccook Nov 17 '16 at 20:02
  • 1
    Is it possible there is something in Dapper that would cause this that it does with the SqlConnection instance? I changed it a little and used ado.net directly (no dapper) and got up to 250 concurrent connections with Open spinning the default 10,000 number of times. Is there anything else special you are doing for the `Open` version or just replaced the 2 Async calls with the synchronous calls? – Igor Nov 17 '16 at 20:14
  • @Igor, definitely possible, trying an ADO.NET test without Dapper. – ccook Nov 17 '16 at 20:25
  • 1
    Actually not necessary. I can fit it in the comment: `conn.Open(); using (var command = new SqlCommand("waitfor delay '00:00:05'", conn)) command.ExecuteNonQuery();` And removed the entire `Task.Run(...` wrapper – Igor Nov 17 '16 at 20:31
  • @Igor, just added a second edit which reproduces the same difference with no external packages. – ccook Nov 17 '16 at 20:32
  • 1
    In your sync version are you also changing `ExecuteReaderAsync()` to `ExecuteReader()` and `ReadAsync` to `Read()` ? – Igor Nov 17 '16 at 20:37
  • @Igor, I am only changing from Open() to await OpenAsync(), nothing else. Note the code is always sync, just comparing that one change Open() vs OpenAsync() – ccook Nov 17 '16 at 20:38
  • 1
    Very odd, I can also duplicate it. The code for OpenAsync calls through to Open so *should* not be a difference. Have you tried what `usr` suggested? – Igor Nov 17 '16 at 21:25
  • @Igor, ill try his suggestion now. But that's exactly why I was confused, the documentation describes it as a simple wrapper, so I expect the same behavior. If it should be different in any way, I wouldnt have worried about it. – ccook Nov 17 '16 at 21:33
  • @usr it seemed at first that setting the max/min value on the thread pool does make quite a difference, but I think it's just in the ramp up behavior of Parallel.ForEach. Oddly, OpenAsync is still able to handle more concurrent threads before timing out. With a connection pool of 100, I can only finish with a max degree of parallelism of 100 safely with Open(), with OpenAsync() I can push it into 200 without issue. – ccook Nov 17 '16 at 22:18
  • 1
    Parallel is dumb, it does not ramp up. The TP ramps up. You *must* limit Parallel to far below the TP limits. Otherwise, internal ADO.NET TP usage might be crowded out. We want to eliminate TP overloading effects. Also, you probably should set the TP min value so that there is no ramp up. It just creates threads immediately up the the min setting. Let's take the ramp out of this investigation. Also note, that the `Task.Run` wrapping you are doing is doubling the number of threads required. That's why I suggested 500/5000 limit values. – usr Nov 17 '16 at 22:38
  • @usr thanks again. So I set the TP min and max to 1000 and tested to see where the timeouts occurred. With Open() it's at the max parallelism of 100 where, with OpenAsync() its over 200. I suspect that's due to nuances of tasks being handled? As far as performance, i think we are just seeing the improvement in using async on the connection so that it's no longer blocking. – ccook Nov 17 '16 at 23:06
  • Bad info here. Even in .NET 4.7.2, the underlying OpenAsync implementation is synchronous and thread blocking; I've looked at the decompiled code. This can be demonstrated by attempting to open 100+ simultaneous connections and watching them unable to get a pool thread to complete on, because it's blocking the ones it has. Setting MinThreadCount to something like 200 solves the problem and allows 100+ connections to be established instantly, whereas without setting MInThreadCount, they all start to fail with timeouts. For example: https://stackoverflow.com/a/57337499/88409 – Triynko Feb 22 '23 at 17:48

1 Answers1

4

Open() is a synchronous process which freezes the UI whereas OpenAsync() is an asynchronous process which opens the connection without freezing the UI

    public static async Task<SqlConnection> GetConnectionAsync()   
{  
      var con = new SqlConnection(ConnectionString);   
      if (con.State != ConnectionState.Open)   
          await con.OpenAsync();  
      return con;  
}  
  
public async Task<int> ExecuteQueryAsync(SqlConnection con, string query)  
{  
      if (con == null) con = await GetConnectionAsync();   
      var cmd = new SqlCommand(query, con);   
      return await cmd.ExecuteNonQueryAsync();   
}  
  • 2
    The condition `if (con.State != ConnectionState.Open)` should be `if (con.State == ConnectionState.Closed)` because it is not a good idea to try open next connection when current one is for example in state Connecting. Source: [ConnectionState Enum](https://learn.microsoft.com/en-us/dotnet/api/system.data.connectionstate?view=net-5.0) – Dawid Wekwejt Jan 30 '22 at 22:40