
I have a multi-threaded application that talks to SQL Server via LINQ to SQL. The app runs fine on a quad-core (Intel i7) machine when the number of threads is artificially kept at 8:

            Parallel.ForEach(allIds, 
                    new ParallelOptions { MaxDegreeOfParallelism = 8 },
                    x => DoTheWork(x));

When the number of threads is left to the system to decide:

                Parallel.ForEach(allIds, x => DoTheWork(x));

After running for a little while, I get the following exception:

Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

There are only two patterns in my app for calling SQL:

first:

    using (var dc = new MyDataContext())
    {
        //do stuff
        dc.SafeSubmitChanges();
    }

second:

        using (var dc = new MyDataContext())
        {
            //do some other stuff
            DoStuff(dc);
        }

.....
    private void DoStuff(DataContext dc)
    {
       //do stuff
       dc.SafeSubmitChanges();
    }

I decided to throttle the calls by this form of logic:

public static class DataContextExtension
{
    public const int SQL_WAIT_PERIOD = 5000;
    public static void SafeSubmitChanges(this DataContext dc)
    {
        try
        {
            dc.SubmitChanges();
        }
        catch (Exception e)
        {
            if (e.Message ==
                "Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.")
            {
                System.Data.SqlClient.SqlConnection.ClearAllPools();
                System.Threading.Thread.Sleep(SQL_WAIT_PERIOD);
                dc.SafeSubmitChanges();
            }
            else
            {
                throw;
            }
        }
    }
}

This made absolutely no difference. Once the app throws the first exception of this kind, all sorts of random places in the app (even lines of code that have nothing to do with SQL server) start throwing this exception.

Q1: Isn't religiously employing the using statement supposed to guard against exactly this scenario?

Q2: What is wrong and how do I fix this?

Note: There are approx 250,000 ids. I also tested at MaxDegreeOfParallelism = 16 and I get the same exception.

Barka
  • 8,764
  • 15
  • 64
  • 91
  • 3
  • I don't know the answer, but why the hell would you want to compare `e.Message` to a constant string instead of checking e's type? That's unacceptable, especially since exception messages can be localized. – Tarec May 14 '14 at 15:18
  • 1
    How many iterations are going in that parallel `foreach`? It is likely that for some reason you are disposing of the `DataContext` but the connection is not closing. See if you can put a `dc.Connection.Close()` in the `using` block. – Evan L May 14 '14 at 15:26
  • I think the max pool size for connections is 100. Try setting your `MaxDegreeOfParallelism` to 100, see if that works, and then try 101 and see if that fails. Even if your `DataContext`s are being properly disposed (and I think they are), once you've maxed out the connection pool, additional contexts have to wait for a connection to free up. Since there are relatively large wait times on database operations, it's quite possible that the `Parallel.ForEach` is spinning up more than 100 concurrent operations. – hatchet - done with SOverflow May 19 '14 at 18:42
  • @hatchet, it fails at MaxDegreeOfParallelism = 16. We are not getting even close to 100 before it fails – Barka May 19 '14 at 19:01
  • 2
    Exceptions at "all sorts of random places" in a multi-threaded application are almost always due to failure to properly synchronize access to resources shared between multiple threads. – Joe May 19 '14 at 19:19
  • Can you include your connection string (with dummy user and password if present)? Try adding `Min Pool Size = 16` and do your test with 16 degrees of parallelism. Also, check the possibility that Sql Server is limiting connections http://stackoverflow.com/questions/1499718/any-limit-of-sql-server-connection-count – hatchet - done with SOverflow May 19 '14 at 19:22
  • Can you provide the full stacktrace? – Maarten May 19 '14 at 19:26
  • I think this is about the connection pool for the database. Can you provide the settings for the database from your web/app config? – Cosmin Vană May 20 '14 at 20:39
  • Try enabling ado.net tracing and post the results! http://www.developer.com/net/csharp/article.php/3723011/ADONET-Trace-Logging.htm – Jonas Jämtberg May 22 '14 at 14:08

4 Answers


I suppose it depends on how many items there are in allIds. If Parallel.ForEach creates too many concurrent tasks, each one may try to open its own connection to the database in parallel, exhausting the connection pool and leaving it unable to serve all the tasks that are requesting new connections.

If satisfying a pool request takes longer than the timeout, that error message would make sense. When you set MaxDegreeOfParallelism = 8, you have no more than 8 concurrent tasks, and thus no more than 8 connections "checked out" from the pool at any time. Before each task completes (and Parallel.ForEach has a free slot to run a new one), its connection is returned to the pool, so the pool can satisfy the next request for a connection, and you don't experience the issue when you artificially limit concurrency.
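The idea of keeping database work bounded can also be decoupled from Parallel.ForEach entirely with a semaphore, so only a fixed number of tasks ever hold a connection no matter how many workers the scheduler spins up. A minimal sketch, reusing the question's `MyDataContext`; the cap of 8 is illustrative:

```csharp
// Hypothetical sketch: cap concurrent database work at 8, regardless of how
// many worker threads Parallel.ForEach decides to create.
private static readonly SemaphoreSlim DbGate = new SemaphoreSlim(8, 8);

private void DoTheWorkThrottled(int id)
{
    DbGate.Wait();            // block until one of the 8 "DB slots" frees up
    try
    {
        using (var dc = new MyDataContext())
        {
            //do stuff with id
            dc.SubmitChanges();
        }                     // connection goes back to the pool here
    }
    finally
    {
        DbGate.Release();     // free the slot for the next task
    }
}
```

This way the CPU-bound part of each task can still run at full parallelism; only the connection-holding section is serialized down to the cap.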

EDIT 1

@hatchet's suggestion above is on the right track - increase the pool size. However, there is a caveat. Your bottleneck likely isn't computing power but database activity. What I suspect (speculation, admittedly) is happening is that while talking to the database, a thread can't do much and blocks (or switches to another task). The thread pool sees that there are more tasks pending but the CPU is under-utilized (because of the outstanding IO operations), and so it decides to take on more tasks for the available CPU slack. This of course just saturates the bottleneck even more, and you're back to square one. So even if you increase the connection pool size, you're likely to keep hitting the wall until your pool size is as big as your task list. You may actually want bounded parallelism that never exhausts the pool (and fine-tune the bound up or down depending on DB load, etc.).

One way to find out whether the above is true is to look at why connections take so long to be returned to the pool, i.e. analyze whether there is DB contention slowing all connections down. If so, more parallelization won't do you any good (in fact, it would make things worse).
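If you do experiment with pool sizes as suggested in the comments, they are set on the connection string. A sketch, with placeholder server/database names; `MyDataContext` is assumed to expose the standard DataContext constructor that takes a connection string:

```csharp
// Hypothetical connection string: Max Pool Size raised above the default of
// 100, and Min Pool Size pre-opening connections so the first parallel burst
// doesn't pay all the connection-open costs at once.
var connectionString =
    "Data Source=myServer;Initial Catalog=myDb;Integrated Security=True;" +
    "Min Pool Size=16;Max Pool Size=200;Connect Timeout=30";

using (var dc = new MyDataContext(connectionString))
{
    //do stuff
    dc.SubmitChanges();
}
```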

LB2
  • Yes but at MaxDegreeOfParallelism = 8, I am not utilizing the full compute capacity of the hardware. I need to throttle in a different way than that to not make my throttle be the bottleneck. That is why I added the thottle code above, but that did no good. The question remains. – Barka May 19 '14 at 19:00
  • @user277498 See `EDIT 1` in the above post. – LB2 May 19 '14 at 19:34

The following might help. In my experience with Oracle, the DB connection pool has caused me issues before, so there may be a similar issue with the SQL Server connection pool. Knowing the default connection settings and seeing connection activity on the DB is good information.

If you are using SQL Server 2008, the default connection pool size is 100 and the default timeout is 15 seconds. I would have the SQL admin track how many connections you're making while running the app and see whether you're putting load on the DB server; maybe add some performance counters as well. Since this looks like a SQL Server exception, I would get some metrics to see what is happening. You could also use IntelliTrace to help see DB activity.
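The pooling counters can also be read from the client side. A rough sketch, assuming the ADO.NET provider counters documented for SqlClient are available on the machine (category and counter names taken from the performance counters docs; the instance name depends on your process):

```csharp
// Hypothetical sketch: dump the SqlClient connection-pool counters for every
// process instance currently publishing them.
using System;
using System.Diagnostics;

class PoolCounters
{
    static void Main()
    {
        var category = new PerformanceCounterCategory(".NET Data Provider for SqlClient");
        foreach (var instance in category.GetInstanceNames())
        {
            using (var pooled = new PerformanceCounter(
                ".NET Data Provider for SqlClient",
                "NumberOfPooledConnections", instance))
            {
                Console.WriteLine("{0}: {1} pooled connections",
                    instance, pooled.NextValue());
            }
        }
    }
}
```

Watching that number while the Parallel.ForEach runs should show whether you really are climbing toward the 100-connection ceiling.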

Intellitrace Link: http://www.dotnetcurry.com/showarticle.aspx?ID=943

Sql Server 2008 Connection Pool Link: http://msdn.microsoft.com/en-us/library/8xx3tyca(v=vs.110).aspx

Performance Counters Link: http://msdn.microsoft.com/en-us/library/ms254503(v=vs.110).aspx

Barry West

I could be way off target here, but I wonder if the problem is not being caused as a side effect of this fact about connection pooling (Taken from here, emphasis mine):

When connection pooling is enabled, and if a timeout error or other login error occurs, an exception will be thrown and subsequent connection attempts will fail for the next five seconds, the "blocking period". If the application attempts to connect within the blocking period, the first exception will be thrown again. Subsequent failures after a blocking period ends will result in a new blocking period that is twice as long as the previous blocking period, up to a maximum of one minute.

  • So in other words, it's not that you are running out of connections per se, it's that something is failing on one or more of the parallel operations, perhaps because the poor table is caving under the pressure of parallel writes - have you profiled what's happening database-side to see if there are any problems with contention on the table during the operation?

  • This could then cause other requests for connections to start to back up due to the "penalty" described above; hence the exceptions and once you start to get one, your SafeSubmit method can only ever make things worse because it keeps retrying an already banjaxed operation.

  • This explanation would also heavily support the idea that the real bottleneck here is the database and that maybe it's not a good idea to try to hammer a table with unbounded parallel IO; its better to measure and come up with a maximum DOP based on the characteristics of what the database can bear (which could well be different for different hardware)
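If a retry is kept at all, the blocking-period behaviour quoted above suggests waiting out at least the maximum penalty window and retrying on a fresh DataContext, rather than re-submitting the one that already failed. A rough sketch of that idea (not the OP's method; `doStuff` is a hypothetical stand-in for whatever populates the context, and the assumption that the pool-timeout error surfaces as InvalidOperationException should be verified against your stack trace):

```csharp
// Hypothetical sketch: retry on a fresh context, sleeping past the maximum
// blocking period (one minute) so the retry doesn't re-trigger the penalty.
private void SubmitWithBackoff(Action<MyDataContext> doStuff, int maxAttempts = 3)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            using (var dc = new MyDataContext())
            {
                doStuff(dc);        // rebuild the pending changes each attempt
                dc.SubmitChanges();
            }
            return;
        }
        catch (InvalidOperationException)
        {
            if (attempt >= maxAttempts)
                throw;
            // Wait longer than the 60-second maximum blocking period before
            // asking the pool for another connection.
            System.Threading.Thread.Sleep(TimeSpan.FromSeconds(70));
        }
    }
}
```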

Also, as regards your first question: using only guarantees that your DataContext object will be auto-magically Dispose()d when it goes out of scope, so it's not at all designed to protect in this scenario - it is just syntactic sugar for

var dc = new DataContext();
try
{
    //do stuff with dc
}
finally
{
    if (dc != null)
        dc.Dispose();
}

and in this case that's not a guard against there being (too) many DataContexts currently trying to connect to the database at the same time.

Stephen Byrne

Are you sure you are not facing connection leaks? Please check out the accepted answer at this link

Moreover, have you already set MultipleActiveResultSets = true?

From MSDN:

When true, an application can maintain multiple active result sets (MARS). When false, an application must process or cancel all result sets from one batch before it can execute any other batch on that connection. Recognized values are true and false.

For more information, see Multiple Active Result Sets (MARS).
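MARS is enabled on the connection string; a minimal example, with placeholder server/database names:

```csharp
// Hypothetical example: enabling Multiple Active Result Sets (MARS)
// via the connection string keyword.
var connectionString =
    "Data Source=myServer;Initial Catalog=myDb;Integrated Security=True;" +
    "MultipleActiveResultSets=True";
```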

Mauro Sampietro
  • thanks. Since I am exclusively using 'using statements', I should not have a connection leak. Right? I will turn on MARS and see if it makes a difference and will let you know. – Barka May 24 '14 at 16:41