
I have a C# program which acts as a multi-threaded web server. It does lots of processing of Xml structures held in a Sql Server database.

As the size of these Xml structures increases, I am finding the app running out of memory.

I have used the ANTS memory profiler to see what is happening, and have managed to reduce the number of large strings held in memory during processing, which improved things a bit.

However, I am now left with a fragmented large object heap, caused by large byte arrays held in the connection pool. The large byte arrays are:

    TdsParserStateObject._bTmp
    in TdsParser._physicalStateObj
    in SqlInternalConnectionTds._parser
    in DbConnectionInternal[0]
    in DbConnectionPool._objectList

I am 99.9% sure I am only using connections within using statements, although I do keep one connection per thread open while the thread is running (which is intended as an optimisation, though I suspect it may actually be making things worse).
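
For context, the per-access pattern looks roughly like this (a minimal sketch; the connection string, table, and column names are placeholders, not my actual schema):

    using System.Data.SqlClient;

    static string LoadXml(string connectionString, int documentId)
    {
        // Per-access pattern: Dispose returns the connection to the
        // pool when the using block ends; the pool decides when the
        // physical connection is actually closed.
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT XmlData FROM Documents WHERE Id = @id", connection))
        {
            command.Parameters.AddWithValue("@id", documentId);
            connection.Open();
            return (string)command.ExecuteScalar();
        }
    }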

Is there anything I can do to a connection to reduce the amount of memory it holds (other than Closing or Disposing of it)?

Or should I just always Close or Dispose every connection immediately on every use?

[Later, as per my comment] I refactored the code to use a new connection for every database access and dispose of it afterwards (except for transactions, of course, where I use the same connection from the start to the end of the transaction and dispose of it with the transaction).

Even when the program is idle (i.e. has no connections in use), there are still connections in the connection pool taking up large amounts of memory and causing fragmentation.

Why would a disposed connection hold 58 MB of memory in the connection pool?

[Even later] I have a solution which prevents the Sql Server connection pool from fragmenting the large object heap: detect which connections will probably have a huge buffer, and mark them for removal from the pool on disposal, using

    SqlConnection.ClearPool(connection)

I currently do this in a rather hacky way, by subclassing DataReader and detecting whether any of the fields returned are over 10 MB in size.
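
For illustration, the hack looks roughly like this (a sketch only: the class name and threshold are placeholders, and real code would need to delegate the rest of the reader's members too):

    using System;
    using System.Data.SqlClient;

    // Hypothetical sketch: wrap a SqlDataReader, watch for oversized
    // string fields, and evict the connection's pool on dispose.
    public sealed class LargeBufferDetectingReader : IDisposable
    {
        private const long Threshold = 10L * 1024 * 1024; // 10 MB

        private readonly SqlDataReader _reader;
        private readonly SqlConnection _connection;
        private bool _sawLargeField;

        public LargeBufferDetectingReader(SqlDataReader reader, SqlConnection connection)
        {
            _reader = reader;
            _connection = connection;
        }

        public bool Read() { return _reader.Read(); }

        public string GetString(int ordinal)
        {
            string value = _reader.GetString(ordinal);
            if ((long)value.Length * sizeof(char) > Threshold)
                _sawLargeField = true; // the connection's buffer is probably huge now
            return value;
        }

        // ... delegate the remaining reader members in the same way ...

        public void Dispose()
        {
            _reader.Dispose();
            if (_sawLargeField)
            {
                // Discard the pooled connections for this connection string,
                // so the oversized internal buffer is not kept alive by the pool.
                SqlConnection.ClearPool(_connection);
            }
        }
    }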

Suggestions as to a better way of detecting which Connections have large buffers are welcome.

Note that I have reverted the changes to open and dispose of a connection on every database access, as keeping track of which connection to use while inside a transaction (it has to be the transaction's connection, obviously) was doing my head in.

The connection opened at the start of the thread and closed at the end is fine, as nearly all the threads are short lived (being a response to a single web request). The exceptions are batch processes that probably operate in the context of a single transaction anyway.

Nikki Locke
  • ADO.NET does connection pooling; you should not have to keep a connection open. See http://msdn.microsoft.com/en-us/library/8xx3tyca%28v=vs.110%29.aspx – Matthew Nov 08 '13 at 20:55
  • Try not keeping the one connection per thread open while the thread is running and note the difference in memory consumption; then you will have your answer as to whether or not that is the main contributing factor. The connection pool should remove the need to keep a connection per thread as well, unless you are using different credentials for certain threads. – Karl Anderson Nov 08 '13 at 20:55
  • *which is intended as an optimisation, but I'm suspicious whether it is making things worse* I think there's a quote about this. – ta.speot.is Nov 10 '13 at 00:41
  • Well, I refactored the code to use a new connection for every database access, and dispose of it afterwards (except for transactions, of course, where I use the same connection from start to end of the transaction, and dispose of it with the transaction). Even when the program is idle (i.e. has no connections in use), there are still connections in the connection pool taking up large amounts of memory. – Nikki Locke Nov 11 '13 at 16:53
  • The only way I have found to stop Sql Server connections fragmenting the large object heap is to remove connections that may have a large buffer from the connection pool. See question edit. – Nikki Locke Nov 14 '13 at 19:37

2 Answers


There is no need to keep a connection open when you're using ADO.NET. Connections are pooled for you by default (unless you have turned pooling off yourself) and ADO.NET does all the management of the pool. You don't even have to worry about different user credentials, etc., as connections are pooled only if the connection parameters are identical. From a memory-usage perspective, disposing of connections can only help.
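
For example (a minimal sketch; the connection-string values are placeholders):

    using System.Data.SqlClient;

    // Pooling is on by default: Dispose returns the connection to the
    // pool rather than closing the physical connection.
    using (var connection = new SqlConnection(
        "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI"))
    {
        connection.Open();
        // ... execute commands ...
    } // returned to the pool here

    // Pooling can be switched off per connection string if you really
    // need to, at the cost of a physical connect on every Open():
    // "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI;Pooling=false"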

Szymon

Szymon's answer is good, up to a point, so I have voted it up.

However, there are a number of issues his answer does not solve. Whenever very large data (in my case, a single text field containing about 10 MB of Xml) is retrieved through a SqlConnection, the connection's internal buffers seem to grow very large.

If you have a program that uses many connections at the same time with large data, these large buffers remain referenced by the connection pool long after you have disposed of the SqlConnection involved, which leads to fragmentation of the large object heap.

The only way I have found to prevent this is to detect SqlConnections which have been used for large data, and remove them from the connection pool before disposing of them.
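
In code, the eviction looks roughly like this (a sketch; readLargeData stands for whatever detection mechanism you use, such as the one described below):

    // readLargeData is set by your detection mechanism (hypothetical flag).
    if (readLargeData)
    {
        // Mark the pooled connections for this connection string to be
        // discarded when released, rather than kept alive (along with
        // their large internal buffers) in the pool.
        SqlConnection.ClearPool(connection);
    }
    connection.Dispose();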

As can be seen from the later edits to my question, I have not found an elegant way to detect whether the SqlConnection internal buffer is large or not, just a very inelegant one: subclassing DataReader and detecting whether any of the fields returned are over 10 MB in size!

Nikki Locke