11

I need confirmation/explanation from you pros/gurus on the following, because my team is telling me "it doesn't matter" and it's frustrating me :)

Background: We have a SQL Server 2008 instance that is being used by our main MVC3 / .Net4 web app. We have about 200+ concurrent users at any given point. The server is being hit EXTREMELY hard (locks, timeouts, overall slowness) and I'm trying to apply things I learned throughout my career and at my last MS certification class. They are things we've all been drilled on ("close SQL connections STAT") and I'm trying to explain to my team that these "little things", though no one of them alone makes a difference, add up in the end.

I need to know if the following actually have a performance impact, or if they're just 'best practice':

1. Using "USING" keyword. Most of their code is like this:

public string SomeMethod(string x, string y) {
    SomethingDataContext dc = new SomethingDataContext(); // never disposed
    var result = dc.StoredProcedure(x, y);
    return result;
}

While I'm trying to tell them that a using block closes/frees up the resources sooner:

using (SomethingDataContext dc = new SomethingDataContext()) {
    var result = dc.StoredProcedure(x, y);
    return result;
}

Their argument is that the GC does a good enough job cleaning up after the code has finished executing, so using doesn't have a huge impact. True or false, and why?

2. Connection Pools

I've always heard that setting up connection pools can significantly speed up any website (at least .Net with MSSQL). I recommended we add the following to our connection strings in the web.config:

..."Pooling=True;Min Pool Size=3;Max Pool Size=100;Connection Timeout=10;"...

Their argument is that .Net/MSSQL already sets up connection pools behind the scenes, so it's not necessary to put this in our web.config. True or false? Why does every other site say pooling should be added for optimal performance if it's already set up?
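Here's a rough sketch of how I plan to check whether pooling is already kicking in (the connection string and console harness are made up for illustration): time the first Open() against subsequent ones - if the later opens are near-instant, the pool is already working.

using System;
using System.Data.SqlClient;
using System.Diagnostics;

class PoolCheck
{
    static void Main()
    {
        // Hypothetical connection string - substitute your own.
        const string cs = "Server=.;Database=master;Integrated Security=True;";

        for (int i = 1; i <= 3; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            using (SqlConnection conn = new SqlConnection(cs))
            {
                conn.Open(); // the first Open pays the full connect cost;
                             // later Opens should reuse a pooled connection
            }
            sw.Stop();
            Console.WriteLine("Open #{0}: {1} ms", i, sw.ElapsedMilliseconds);
        }
    }
}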

3. Minimize # of calls to DB

The Role/Membership provider that comes with the default .Net MVC project is nice - it's handy and does most of the legwork for you. But these guys make serious use of UsersInRoles() and call it freely, like a global variable (it hits the DB every time the method is called). I created a "user object" that loads all the roles up front on every page load (along with some other user data, such as GUIDs, etc.) and then query this object to check whether the user has a role.
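Roughly, the "user object" idea looks like this (a sketch - CurrentUser and its members are illustrative, not the actual code):

using System;
using System.Collections.Generic;
using System.Web.Security;

// One DB round trip per page load instead of one per role check.
public class CurrentUser
{
    private readonly HashSet<string> _roles;

    public CurrentUser(string userName)
    {
        // Single database hit up front...
        _roles = new HashSet<string>(Roles.GetRolesForUser(userName),
                                     StringComparer.OrdinalIgnoreCase);
    }

    // ...then every subsequent check is an in-memory lookup.
    public bool IsInRole(string roleName)
    {
        return _roles.Contains(roleName);
    }
}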

Other parts of the website have FOR statements that loop over 200 times and do 20-30 SQL queries on every pass - over 4,000 database calls. It somehow does this in a matter of seconds, but what I want to do is consolidate the 20-30 DB calls into one, so that it makes ONE call 200 times (once per loop iteration). But because SQL Profiler says each query took "0 seconds", their argument is that the queries are so fast and small that the server can handle this high number of DB queries.
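To make the consolidation concrete, this is the kind of batching I have in mind (table, column, and parameter names are invented for the example, and it assumes System.Data.SqlClient): send the statements in one command and walk the result sets with NextResult(), so each loop iteration costs one round trip instead of 20-30.

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = conn.CreateCommand())
{
    // Several statements, one network round trip.
    cmd.CommandText =
        "SELECT Id, Name FROM dbo.Orders WHERE CustomerId = @id; " +
        "SELECT Id, Sku FROM dbo.OrderLines WHERE CustomerId = @id;";
    cmd.Parameters.AddWithValue("@id", customerId);

    conn.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read()) { /* consume the first result set */ }
        reader.NextResult();
        while (reader.Read()) { /* consume the second result set */ }
    }
}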

My thinking is "yeah, these queries are running fast, but they're killing the overall SQL server's performance." Could this be a contributing factor? Am I worrying about nothing, or is this a (significant) contributing factor to the server's overall performance issues?

4. Other code optimizations

The first one that comes to mind is using StringBuilder instead of a simple string variable. I understand why I should use StringBuilder (especially in loops), but they say it doesn't matter - even if they need to write 10k+ lines, their argument is that the performance gain doesn't matter.
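For reference, the difference in question (loop counts and strings are arbitrary; assumes using System; and using System.Text;):

// string += allocates a brand-new string on every pass (quadratic copying)...
string s = "";
for (int i = 0; i < 10000; i++)
{
    s += "line " + i + Environment.NewLine;
}

// ...while StringBuilder appends into a growable buffer.
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 10000; i++)
{
    sb.Append("line ").Append(i).AppendLine();
}
string s2 = sb.ToString();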

So, all in all, are the things we learn and have drilled into us ("minimize scope!") just 'best practice' with no real performance gain, or do they all contribute to a REAL/measurable performance loss?

EDIT*** Thanks guys for all your answers! I have a new (5th) question based on your answers: they in fact do not use "using", so what does that mean is happening? If connection pooling is happening automatically, are they tying up connections from the pool until the GC comes around? Is it possible that each open connection to the SQL server adds a little more burden and slows it down?

Based on your suggestions, I plan on doing some serious benchmarking/logging of connection times, because I suspect that a) the server is slow, b) they aren't closing connections, and c) since Profiler says the queries ran in 0 seconds, the slowness might be coming from the connections.

I really appreciate your help guys. Thanks again!

Losbear
  • Not to say research is not important but gurus on SO do provide a lot of insight...studying what they say is very much research – Miserable Variable Feb 13 '13 at 22:37
  • @JakeWilson801 Documentation from MS is not always the best source. (Edit: or even a good source) – Mike Cole Feb 13 '13 at 22:38
  • Re using StringBuilder instead of concatenating strings - if you are concatenating strings once (i.e. `var s = "a" + "b";`), then a plain string will be more efficient. What you need to remember, though, is that a string instance is immutable, so in a loop, e.g. `string s = "a"; for (int i = 1; i < 1000; i++) { s += "a"; }`, you create a new string instance on every iteration; this will impact memory allocation and affect overall performance (whether significantly or not will depend on the number of concatenations). – GarethD Feb 13 '13 at 22:53
  • 3
    Have you used a profiler (C# or SQL) to find out where the performance bottlenecks are? If you showed your peers the stats of methods which use `UsersInRoles()` and concatenate strings with `+`, and proved that they were the cause of the bottlenecks, they would be more likely to trust you. For all you know, maybe an inappropriate usage of `StringBuilder` is causing the slowness. **Measure, don't guess.** – Jesse Webb Feb 13 '13 at 23:05
  • 1
    Thanks guys - yes I have done research and I've been a developer for over 12 years - BUT - I'm self-taught and lack some of the "language" that my team throws at me to counter their arguments. I sometimes rely on SO gurus to give me some direction, and sometimes to reinforce what I already thought, which is what I'm doing here :) For those that did offer constructive comments/help/guidance (*cough* not jakewilson *cough*), thanks, and I'll leave this question open a little longer to see if I get any more answers. Thanks again! – Losbear Feb 13 '13 at 23:11

9 Answers

5

Branch the code, make your changes & benchmark+profile it against the current codebase. Then you'll have some proof to back up your claims.

As for your questions, here goes:

  1. You should always manually dispose of classes which implement IDisposable; the GC won't call Dispose for you. If the class also implements a finalizer, the GC will eventually run the finalizer, but in most implementations finalizers only clean up unmanaged resources.

  2. It's true that the .NET Framework already does connection pooling. I'm not sure what the defaults are, but the connection string values are just there to let you alter those defaults.

  3. The execution time of the SQL statement is only part of the story. In SQL Profiler, all you see is how long the database engine took to execute the query; what you're missing is the time it takes the web server to connect to the database server and receive the results. So while each query may be quick, you can save a lot of IO and network latency by batching queries.

  4. This one is a good candidate for profiling, to demonstrate the extra cost of concatenation compared to StringBuilder (see the sketch below).
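As a sketch of the kind of comparison meant in point 4 (this times the copying cost rather than counting the allocations themselves, since wall-clock time is easier to capture reliably than memory churn; the loop count is arbitrary):

using System;
using System.Diagnostics;
using System.Text;

class ConcatVsBuilder
{
    static void Main()
    {
        const int N = 50000;

        Stopwatch sw = Stopwatch.StartNew();
        string s = "";
        for (int i = 0; i < N; i++) { s += "x"; } // new string every pass
        sw.Stop();
        Console.WriteLine("Concatenation: {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < N; i++) { sb.Append("x"); } // in-place append
        string s2 = sb.ToString();
        sw.Stop();
        Console.WriteLine("StringBuilder: {0} ms", sw.ElapsedMilliseconds);
    }
}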

Trevor Pilley
  • 1
    According to [this site](http://www.connectionstrings.com/articles/show/all-sql-server-connection-string-keywords), the defaults for connection pooling are 0 for min, 100 for max. – Jesse Webb Feb 13 '13 at 23:09
  • +1 This answer does the best job at addressing each question thoroughly. I also like the suggestion to branch and benchmark. – Jesse Webb Feb 13 '13 at 23:11
  • For #1 - many of you have said the same thing: the GC won't clean it up right away. So does that tie up one of the pooled connections? Could several hundred of these open connections slow the server down? For #3, you just gave me a HUGE arguing point. I'll attempt to isolate the connection time and add it up for every one of the 2,000+ DB calls per click and see what that comes out to - good idea, thanks! – Losbear Feb 13 '13 at 23:17
  • It's not only connection time that should be considered (even if that operation is very costly) but also network round trips. How long does it take to communicate between your web server and your DB server? Multiply that by the number of calls and you know how much you can benefit from reducing DB calls... – HansLindgren Nov 12 '14 at 12:20
4

Oye. For sure, you can't let the GC close your database connections for you. GC might not happen for a LONG time... sometimes hours later. It doesn't happen right away as soon as a variable goes out of scope. Most people use the IDisposable using() { } syntax, which is great, but at the very least something, somewhere, needs to be calling connection.Close().
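If a using block genuinely can't be used somewhere, the bare minimum looks something like this (a sketch - connectionString is assumed to be defined elsewhere):

SqlConnection conn = null;
try
{
    conn = new SqlConnection(connectionString);
    conn.Open();
    // ... execute commands ...
}
finally
{
    if (conn != null)
    {
        conn.Close(); // returns the connection to the pool immediately
    }
}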

Al W
3
  1. Objects that implement IDisposable and hold unmanaged resources also implement a finalizer that ensures the cleanup eventually runs during GC. The problem is when: the GC can take a long time to get around to it, and you might need those resources back before then. A using block calls Dispose as soon as you are done with the object.

  2. You can modify the pooling parameters in the web.config, but pooling is on by default now, so if you leave the default parameters you aren't gaining anything.

  3. You not only have to think about how long the query takes to execute, but also about the connection time between the application server and the database server; even if they're on the same machine, it adds overhead.

  4. StringBuilder won't affect performance in most web applications; it only matters if you are concatenating to the same string very many times. But I think it's a good idea to use it anyway, since it's easier to read.

Luis Tellez
  • 1
    In addition: objects that have a finalizer (and for which SuppressFinalize wasn't called) require two garbage collections before they are completely removed. – Andre Loker Feb 13 '13 at 22:38
  • In addition to my own comment: types that have a finalizer and properly implement the Disposable pattern will call SuppressFinalize in their Dispose function. By disposing such types the need for the second garbage collection is therefore removed. (This should relate my first comment better to the question) – Andre Loker Feb 13 '13 at 22:46
2

I think that you have two separate issues here.

  1. Performance of your code
  2. Performance of the SQL Server database

SQL Server

Do you have any monitoring in place for SQL Server? Do you know specifically what queries are being run that cause the deadlocks?

I would read this article on deadlocks and consider installing the brilliant Who is Active to find out what is really going on in your SQL Server. You might also consider installing sp_Blitz by Brent Ozar. This should give you an excellent idea of what is going on in your database, and give you the tools to fix that problem first.

Other code issues

I can't really comment on the other code issues off the top of my head. So I would look at SQL server first.

Remember

  1. Monitor
  2. Identify Problems
  3. Profile
  4. Fix
  5. Go to 1
Stuart Blackler
  • Yea, I've looked at SQL Profiler on the live server and tried to show them "look, this ONE click made 2,000+ calls!", but their argument is "but it did all those 2k calls in 2 seconds". I'm trying to go above that and counter it with "those 2 seconds do add up", but I'm not 100% positive, because those 2 seconds were mixed in with 200 other concurrent users. – Losbear Feb 13 '13 at 23:14
1

Well, I'm not a guru, but I do have a suggestion: if they say you're wrong, tell them, "Prove it! Write me a test! Show me that 4000 calls are just as fast as 200 calls and have the same impact on the server!"

Ditto the other things. If you're not in a position to make them prove you right, prove them wrong, with clear, well-documented tests that show that what you're saying is right.

If they're not open even to hard evidence, gathered from their own server, with code they can look at and inspect, then you may be wasting your time on that team.

Ann L.
  • I can't say "Prove it" (as much as I'd love to) because the application was written before I came on board. So now it's my burden to prove to them why they should go back and fix their code :) – Losbear Feb 13 '13 at 23:19
  • @Losbear That may be an uphill battle! But the direction you've said (in other comments) you plan to go in sounds like a good one. Test, and test again, and show them hard numbers. Best of luck to you! – Ann L. Feb 14 '13 at 18:04
1

At the risk of just repeating what others here have said, here's my 2c on the matter

Firstly, you should pick your battles carefully... I wouldn't go to war with your colleagues on all 4 points, because as soon as you fail to prove one of them, it's over, and from their perspective they're right and you're wrong. Also bear in mind that no one likes to be told their beautiful code is an ugly baby, so I assume you'll be diplomatic - don't say "this is slow", say "I found a way to make this even faster" (of course, your team could be perfectly reasonable, so I'm basing that on my own experience as well :) ). So you need to pick one of the 4 areas above to tackle first.

My money is on #3. 1, 2 and 4 can make a difference, but in my own experience, not that much - but what you described in #3 sounds like death by a thousand papercuts for the poor old server! The queries probably execute fast because they're parameterised and therefore cached, but you need to bear in mind that "0 seconds" in the profiler could be 900 milliseconds, if you see what I mean... add that up across thousands of calls and things start getting slow. This could also be a primary source of the locks, because if each of these nested queries is hitting the same table over and over, no matter how fast it runs, with the number of users you mentioned it's certain you will have contention. Grab the SQL and run it in SSMS, but include Client Statistics so you can see not only the execution time but also the amount of data being sent back to the client; that will give you a clearer picture of what sort of overhead is involved.

Really, the only way you can prove any of this is to set up a test and measure, as others have mentioned - but be certain to run some profiling on the server as well (locks, IO queues, etc.), so that you can show that not only is your way faster, but that it places less load on the server.

To touch on your 5th question - I'm not sure, but I would guess that any SqlConnection that's not disposed (via using) is counted as still "active" and is no longer available from the pool. That said, the connection overhead on the server is pretty low unless the connection is actually doing anything - but you can again prove this by using the SQL performance counters.

Best of luck with it, can't wait to find out how you get on.

Stephen Byrne
  • Thanks Stephen. Yes, I plan on being diplomatic about it, LOL :) I suspect #3 as well, but i guess I was wondering what sort of impact do alll those issues collectively would have - or is it the ONE issue that could be causing overall slowness. Thanks again for your 2c - i'll post my findings when I run the numbers tomorrow :) – Losbear Feb 14 '13 at 01:07
0

The using clause is just syntactic sugar; you are essentially writing:

try
{
    resource.DoStuff();
}
finally
{
    resource.Dispose();
}

The cleanup will probably happen eventually when the object is garbage collected, but only if the framework programmers did a good job of implementing the disposable pattern (i.e. provided a finalizer that performs the same cleanup). So the arguments against your colleagues here are:

i) if we get into the habit of using using, we make sure to free unmanaged resources, because not all framework programmers are careful enough to implement the disposable pattern.

ii) yes, the GC will eventually clean that object up, but it may take a while, depending on how old the object is. Gen 2 collections in particular happen comparatively rarely, so an object that has been promoted can linger for a long time.

In short:

  1. see above

  2. yes, pooling is on by default, with the max pool size set to 100

  3. you are correct, definitely the best area to push on for improvements.

  4. premature optimization is the root of all evil. Get #1 and #3 in first. Use SQL Profiler and DB-specific methods (add indexes, defragment them, monitor deadlocks, etc.).

  5. yes, it could be. The best way is to measure it - look at the perf counter SQLServer: General Statistics – User Connections (see the sketch below); here is an article describing how to do it.

Always measure your improvements, don't change the code without evidence!
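A minimal sketch of reading that counter from C# (the category/counter names are for a default SQL Server instance running on the local machine; a named instance uses a different category prefix, so treat this as an assumption to verify on your server):

using System;
using System.Diagnostics;

class UserConnectionsCheck
{
    static void Main()
    {
        // "SQLServer:General Statistics" is the category for a default
        // instance; named instances look like "MSSQL$Name:General Statistics".
        using (PerformanceCounter counter = new PerformanceCounter(
            "SQLServer:General Statistics", "User Connections"))
        {
            Console.WriteLine("Current user connections: {0}",
                              counter.NextValue());
        }
    }
}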

Bogdan Gavril MSFT
  • Agreed - the one thing my team and I agree on is that neither of us wants to put too much work into something that isn't going to help much, such as converting string variables to StringBuilder objects, LOL. I think in the end, only hard numbers from the profiler are going to convince these guys :) Thanks Bogdan – Losbear Feb 13 '13 at 23:22
0

I recently was dealing with a bug in the interaction between our web application and our email provider. When an email was sent, a protocol error occurred. But not right away.

I was able to determine that the error only occurred when the SmtpClient instance was closed, which was occurring when the SmtpClient was disposed, which was only happening during garbage collection.

And I noticed that this often took two minutes after the "Send" button was clicked...

Needless to say, the code now properly implements using blocks for both the SmtpClient and MailMessage instances.

Just a word to the wise...

John Saunders
  • 2 minutes is better than I thought, hehe - I thought the GC was looping every 5 or 10 min (which would be really bad). Thanks for giving me an idea (I know it can be +/- several minutes) of how often GC comes around. – Losbear Feb 13 '13 at 23:24
0

1 has been addressed well above (for what it's worth, I agree - disposing properly has always served me well as a practice).

2 is a bit of a hold-over from previous versions of ODBC, wherein SQL Server connections had to be configured for pooling independently. Pooling used to be off by default; now it's on by default.

As to 3 and 4: 4 isn't going to affect your SQL Server's performance - StringBuilder might speed things up within the UI layer, certainly, which may have the side effect of releasing your SQL resources faster, but it won't reduce the load on the SQL Server.

3 sounds like the most logical place to concentrate, to me. I try to close my database connections as quickly as possible, and to make the fewest calls possible. If you're using LINQ, materialize the query into a concrete collection (a list, an array, whatever) so that you can manipulate it and build whatever UI structures you need after releasing the connection - for example:
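Something along these lines (SomethingDataContext is borrowed from the question; the Customers table and IsActive column are invented for the example):

List<Customer> customers;
using (SomethingDataContext dc = new SomethingDataContext())
{
    // ToList() forces the query to execute here, inside the using block...
    customers = dc.Customers.Where(c => c.IsActive).ToList();
} // ...so the connection is released before any UI work begins.

// Build whatever UI structures you need from the in-memory list from here on.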

All of that said, it sounds like you need to spend some more quality time with the profiler. Rather than looking only at the amount of time each execution took, look at the processor and memory usage. Just because they're fast doesn't mean they're not "hungry" executions.

David T. Macknet