
It seems like optimization is a lost art these days. Wasn't there a time when all programmers squeezed every ounce of efficiency from their code? Often doing so while walking five miles in the snow?

In the spirit of bringing back a lost art, what are some tips that you know of for simple (or perhaps complex) changes to optimize C#/.NET code? Since it's such a broad thing that depends on what one is trying to accomplish it'd help to provide context with your tip. For instance:

  • When concatenating many strings together use StringBuilder instead. See link at the bottom for caveats on this.
  • Use string.Compare to compare two strings instead of doing something like string1.ToLower() == string2.ToLower(). (A short sketch of both tips follows below.)
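As a rough, illustrative sketch of both tips above (the loop count and sample strings are made up; this is not a benchmark):

using System;
using System.Text;

class StringTips
{
    static void Main()
    {
        // Building a big string piece by piece: StringBuilder appends into one
        // growing buffer instead of allocating a brand-new string on every +.
        var sb = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            sb.Append(i).Append(',');
        }
        string joined = sb.ToString();

        // Case-insensitive comparison without allocating lowered copies of both strings.
        string a = "Hello", b = "HELLO";
        bool same = string.Compare(a, b, StringComparison.OrdinalIgnoreCase) == 0;
        // (string.Equals(a, b, StringComparison.OrdinalIgnoreCase) reads even better.)

        Console.WriteLine(joined.Length + " " + same);
    }
}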

The general consensus so far seems to be measuring is key. This kind of misses the point: measuring doesn't tell you what's wrong, or what to do about it if you run into a bottleneck. I ran into the string concatenation bottleneck once and had no idea what to do about it, so these tips are useful.

My point for even posting this is to have a place for common bottlenecks and how they can be avoided before even running into them. It's not necessarily about plug-and-play code that anyone should blindly follow, but more about gaining an understanding that performance should be thought about, at least somewhat, and that there are some common pitfalls to look out for.

I can see, though, that it might also be useful to know why a tip is useful and where it should be applied. For the StringBuilder tip, the help I found long ago was on Jon Skeet's site.

Peter Mortensen
Bob
  • It's also important to walk the line between optimization and readability. – Ryan Elkins Mar 18 '10 at 22:09
  • The "handful of strings"; the *number* isn't the issue - it is whether they are in a single composite concatenation statement, or multiple statements. – Marc Gravell Mar 18 '10 at 22:22
  • StringBuilder is often slower than the + operator. The C# compiler automatically translates repeated + into the appropriate overload(s) of String.Concat. – Richard Berg Mar 18 '10 at 22:33
  • You're going to have a tough time fighting the CLR: it optimizes the IL at run time while you try to do the same at compile time - a tug of war. In the good old days you optimized the instructions for the machine and the machine dumbly ran them. – John K Mar 19 '10 at 01:31

11 Answers


It seems like optimization is a lost art these days.

There was once a day when the manufacture of, say, microscopes was practiced as an art. The optical principles were poorly understood. There was no standardization of parts. The tubes and gears and lenses had to be made by hand, by highly skilled workers.

These days microscopes are produced as an engineering discipline. The underlying principles of physics are extremely well understood, off-the-shelf parts are widely available, and microscope-building engineers can make informed choices as to how to best optimize their instrument to the tasks it is designed to perform.

That performance analysis is a "lost art" is a very, very good thing, precisely because that art was practiced as an art. Optimization should be approached for what it is: an engineering problem solvable through careful application of solid engineering principles.

I have been asked dozens of times over the years for my list of "tips and tricks" that people can use to optimize their VBScript / their JScript / their Active Server Pages / their VB / their C# code. I always resist this. Emphasizing "tips and tricks" is exactly the wrong way to approach performance. That way leads to code which is hard to understand, hard to reason about, and hard to maintain, and which is typically not noticeably faster than the corresponding straightforward code.

The right way to approach performance is to approach it as an engineering problem like any other problem:

  • Set meaningful, measurable, customer-focused goals.
  • Build test suites to test your performance against these goals under realistic but controlled and repeatable conditions.
  • If those suites show that you are not meeting your goals, use tools such as profilers to figure out why.
  • Optimize the heck out of what the profiler identifies as the worst-performing subsystem. Keep profiling on every change so that you clearly understand the performance impact of each.
  • Repeat until one of three things happens: (1) you meet your goals and ship the software, (2) you revise your goals downwards to something you can achieve, or (3) your project is cancelled because you could not meet your goals.

This is the same way you'd solve any other engineering problem, like adding a feature: set customer-focused goals for the feature, track progress on making a solid implementation, fix problems as you find them through careful debugging analysis, and keep iterating until you ship or fail. Performance is a feature.

Performance analysis on complex modern systems requires discipline and focus on solid engineering principles, not on a bag full of tricks that are narrowly applicable to trivial or unrealistic situations. I have never once solved a real-world performance problem through application of tips and tricks.

Eric Lippert
  • Was going to write a similar screed, but yours is better. Bravo. – Richard Berg Mar 18 '10 at 22:42
  • There's just some cases where there's a known better way to accomplish the same task while being less of a hog with resources. I don't buy that it's perfectly fine to program however you want as long as you meet some goal and it seems to work out ok. Or that it's for the best to program, *then* run a profiler, and *then* go back and change problem areas. What is wrong with one having a good idea of what it takes to optimize certain bits of code before they even start? – Bob Mar 18 '10 at 22:55
  • Very nicely worded, Eric. I like your bulleted approach to setting goals, measuring current perf. against the goals, and repeat. – Reed Copsey Mar 18 '10 at 22:56
  • @Bob: There's nothing wrong with being smart about using resources. Where things go wrong is when people (1) spend a lot of time (=money) on micro-optimizations that do not make a difference, (2) write programs which are *wrong*, and (3) write programs which are unclear. What you should be optimizing for is first, correctness. Second, good coding style. Third, performance. Once the code is correct and elegant, it will be much easier to make it performant. – Eric Lippert Mar 18 '10 at 23:00
  • That's fine, but you'll notice I'm not saying one should not code for correctness first, or style second, or what have you. But it's also true that sometimes (or maybe a lot of times these days), programmers give no consideration to performance or optimization at all. Is just having 1 & 2 enough to make up for a total uncaring of 3? I can't see how it's a bad idea to pay some respects to optimization and to learn a thing or two about what it takes. – Bob Mar 18 '10 at 23:06
  • @Fredrik: I think you might be overthinking this. :) – Eric Lippert Mar 18 '10 at 23:07
  • @Bob: I agree that some programmers do not care about performance. But I'm not following your point. A list of tips and tricks isn't going to suddenly turn them into people who care about performance. Supposing for the sake of argument that you *can* make people who are at present uninterested into people who are interested, a list of tips and tricks isn't going to help them attain good performance. You can apply tips and tricks to a body of code all day and never know if you're making any progress at all against your goals. You've got to have goals and measure your progress. – Eric Lippert Mar 18 '10 at 23:10
  • I agree that generic tips and tricks are often unhelpful, but I'll point out that tips and tricks that target specific situations can really save the day! Frequently, these are referred to as "hacks" or "kludges" instead of "tips". These often come into play when working around performance problems in "other people's code." – Jason Kresowaty Mar 18 '10 at 23:24
  • Take good coding practices for instance. There are a lot of differing ideas and rules for readability and maintainability out there. Those things won't create people who care about such things, but that doesn't make such lists useless. The same with optimization tips. Some people will start to pay at least some attention to things they're doing inefficiently, and others will just plug and play, the same way they just use OOP rampantly as they don't really care or understand the idea behind it. – Bob Mar 19 '10 at 00:31
  • The problem with general tips and tricks for optimization is that many of them simply aren't valid 100% of the time. Many aren't valid even 10% of the time. The only reasonable tips left to give are stated excellently by these answers: profile, find the hot spots, and understand how the language works. – Ron Warholic Mar 19 '10 at 02:34
  • It's interesting to follow this back-and-forth, when I think both sides are right. Tips & tricks are to a programmer like drugs to a doctor. Nobody likes a doctor who just prescribes but doesn't diagnose, but when the diagnosis indicates a particular drug is appropriate, the doctor is expected to know it and prescribe it. – Mike Dunlavey Mar 20 '10 at 16:02
  • *"Optimize the heck out of what the profiler identifies as the worst-performing subsystem"* - Isn't that exactly what he's asking how to do? – BlueRaja - Danny Pflughoeft Apr 30 '13 at 20:52
  • @BlueRaja-DannyPflughoeft: Then the OP should actually present *an identified slow subsystem* and ask for *specific advice* on how to fix it. As it stands this question is basically "how do I improve my program?", which is so vague as to be not answerable. – Eric Lippert Apr 30 '13 at 22:41
  • I don't think he will get a job coding large games, or in neural network programming or vision applications. There are areas of programming where time still counts and every hack counts. In other areas you can create web pages where your site can serve 30 users; that's fine for you, but not for many others on this site who have moved beyond the level of a student. – Peter Mar 31 '16 at 09:25

Get a good profiler.

Don't bother even trying to optimize C# (really, any code) without a good profiler. It actually helps dramatically to have both a sampling and a tracing profiler on hand.

Without a good profiler, you're likely to create false optimizations, and, most importantly, optimize routines that aren't a performance problem in the first place.

The first three steps to profiling should always be 1) Measure, 2) measure, and then 3) measure....
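When a full profiler isn't at hand, even a crude but repeatable measurement beats guessing. Here is a minimal sketch using System.Diagnostics.Stopwatch; the iteration count and DoWork are placeholders for whatever you suspect is slow:

using System;
using System.Diagnostics;

class MeasureSketch
{
    static void Main()
    {
        const int iterations = 1000;   // placeholder; pick something realistic for your workload
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            DoWork();                  // the routine under suspicion
        }
        sw.Stop();
        Console.WriteLine(iterations + " iterations took " + sw.ElapsedMilliseconds + " ms");
    }

    static void DoWork()
    {
        // placeholder for the code being measured
    }
}

Measure before and after every change, under the same conditions, and keep the numbers.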

Reed Copsey

Optimization guidelines:

  1. Don't do it unless you need to
  2. Don't do it if it's cheaper to throw new hardware at the problem instead of a developer
  3. Don't do it unless you can measure the changes in a production-equivalent environment
  4. Don't do it unless you know how to use a CPU and a memory profiler
  5. Don't do it if it's going to make your code unreadable or unmaintainable

As processors continue to get faster, the main bottleneck in most applications isn't CPU; it's bandwidth: bandwidth to off-chip memory, bandwidth to disk, and bandwidth to the network.

Start at the far end: use YSlow to see why your web site is slow for end-users, then move back and fix your database accesses to be not too wide (columns) and not too deep (rows).
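As a hedged illustration of "not too wide and not too deep", assuming an Entity Framework-style context named db (the entity and column names are invented):

// Wide and deep: pulls every column of every matching row into memory.
// var orders = db.Orders.Where(o => o.CustomerId == customerId).ToList();

// Narrower and shallower: project only the columns you need and cap the rows.
var recent = db.Orders
               .Where(o => o.CustomerId == customerId)
               .OrderByDescending(o => o.OrderDate)
               .Select(o => new { o.Id, o.Total })   // fewer columns over the wire
               .Take(50)                             // fewer rows over the wire
               .ToList();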

In the very rare cases where it's worth doing anything to optimize CPU usage be careful that you aren't negatively impacting memory usage: I've seen 'optimizations' where developers have tried to use memory to cache results to save CPU cycles. The net effect was to reduce the available memory to cache pages and database results which made the application run far slower! (See rule about measuring.)

I've also seen cases where a 'dumb' un-optimized algorithm has beaten a 'clever' optimized algorithm. Never underestimate how good compiler-writers and chip-designers have become at turning 'inefficient' looping code into super-efficient code that can run entirely in on-chip memory with pipelining. Your 'clever' tree-based algorithm with an unrolled inner loop counting backwards that you thought was 'efficient' can be beaten simply because it failed to stay in on-chip memory during execution. (See rule about measuring.)

Ian Mercer
  • Similarly, don't get obsessed with big-O analysis. The O(nm) naive string search algorithm is, for common business cases, thousands of times faster than the O(n+m) algorithms that preprocess the search strings looking for patterns. Naive string search matching the first character often compiles down to a single machine instruction which is blazingly fast on modern processors that make heavy use of optimistic memory caches. – Eric Lippert Mar 18 '10 at 23:04

When working with ORMs, be aware of N+1 selects.

List<Order> _orders = _repository.GetOrders(DateTime.Now);
foreach(var order in _orders)
{
    Print(order.Customer.Name);
}

If the customers are not eagerly loaded this could result in several round trips to the database.
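One common fix is to fetch the related rows in the same query. A hedged sketch assuming Entity Framework, whose Include extension eagerly loads a navigation property (context, OrderDate, and cutoff are invented names; other ORMs have equivalents, such as NHibernate's Fetch):

// Load each order's Customer in the same round trip instead of one SELECT per order.
var orders = context.Orders
                    .Include(o => o.Customer)
                    .Where(o => o.OrderDate >= cutoff)
                    .ToList();

foreach (var order in orders)
{
    Print(order.Customer.Name);   // no lazy-load query fires here
}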

Aaron

  • Don't use magic numbers, use enumerations
  • Don't hard-code values
  • Use generics where possible since it's typesafe & avoids boxing & unboxing
  • Use an error handler where it's absolutely needed
  • Dispose, dispose, dispose. The CLR won't close your database connections for you, so close them after use and dispose of unmanaged resources. (A short sketch follows this list.)
  • Use common-sense!
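A rough sketch of the generics and dispose points, assuming System.Data.SqlClient; the connection string is a placeholder and Open() will only succeed against a real server:

using System.Collections;
using System.Collections.Generic;
using System.Data.SqlClient;

class HousekeepingSketch
{
    static void Run()
    {
        // Generic collection: the ints stay unboxed and the element type is checked at compile time.
        var typed = new List<int> { 1, 2, 3 };

        // Non-generic collection: every int is boxed into an object.
        var untyped = new ArrayList { 1, 2, 3 };

        // 'using' guarantees Dispose (which closes the connection) even if an exception is thrown.
        using (var connection = new SqlConnection("Server=.;Database=Example;Integrated Security=true"))
        {
            connection.Open();
            // ... run commands ...
        } // the connection is closed and disposed here
    }
}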
Peter Mortensen
SoftwareGeek
  • As much as I agree that they're good things to do, the first two things here have no impact on performance - just maintainability... – Reed Copsey Mar 18 '10 at 22:14
  • true but it's still an optimized code. – SoftwareGeek Mar 18 '10 at 22:18
  • Additionally, the 3rd (boxing) is rarely a genuine pinch-point; it is exaggerated as an issue; as are exceptions - not *usually* a problem. – Marc Gravell Mar 18 '10 at 22:18
  • "but it's still an optimized code" - that is a big claim; the only thing there I would expect to be a significant issue is "dispose"; and that is more likely to surface as exceptions (out of handles, etc.), not performance degradation. – Marc Gravell Mar 18 '10 at 22:19
  • Actually, the finalizer pattern is quite bad if optimization is your goal. Objects with finalizers are automatically promoted to Gen-1 (or worse). Furthermore, forcing your finalizer code to run on the GC thread usually isn't optimal if there's anything remotely expensive on that to-do list. Bottom line: it's a feature aimed at convenience & correctness, not one intended for raw speed. Details: http://msdn.microsoft.com/en-us/magazine/bb985010.aspx – Richard Berg Mar 18 '10 at 22:30
  • I like the last one, `Use common-sense!`. – Anant Dabhi Jul 27 '15 at 05:34

OK, I have got to throw in my favorite: If the task is long enough for human interaction, use a manual break in the debugger.

Compared to a profiler, this gives you a call stack and variable values that you can use to really understand what's going on.

Do this 10-20 times and you get a good idea of what optimization might really make a difference.

Peter Mortensen
Conrad Albrecht
  • ++ Amen. I've been doing that since before profilers existed. & your program DrawMusic looks awesome! – Mike Dunlavey Mar 20 '10 at 16:14
  • This is essentially what profilers do, except they do it better than you in about a thousand different ways (faster, more often, more accurate, etc). They also do give call-stacks. This is the poor-man's (and the old-man-who-is-afraid-to-learn-new-things) solution. – BlueRaja - Danny Pflughoeft May 01 '13 at 09:07
  • @BlueRaja-DannyPflughoeft: They deceive you. They tell you with great precision that there's nothing much to be done. The difference between this method and profilers is that in this method you can see things to speed up that cannot be teased out from simple statistics. Instead they take 1000s of samples when the information that can lead you to the problem is evident in the first 10 if you can actually see the raw samples. I'm sure you've seen [*this post*](http://stackoverflow.com/a/1779343/23771). – Mike Dunlavey Jul 04 '13 at 14:04
  • @BlueRaja-DannyPflughoeft: Look at results. What's the biggest speedup ratio you ever got using a profiler? – Mike Dunlavey Jul 04 '13 at 19:08
  • @MikeDunlavey: About two orders of magnitude. If you think profilers tell you there's nothing to be done, you need to learn how to read their results correctly (more often than not, you don't want to *speed up* the hotspots, you want to stop them from being called so much). If someone told me in an interview that they think the pause method is better than a sampling profile *(which does the same thing, but better in literally every way)*, I would not hire them. – BlueRaja - Danny Pflughoeft Jul 04 '13 at 20:15
  • @BlueRaja-DannyPflughoeft: I'm sure you wouldn't, and when you get to my age you will run into people like yourself. But let's leave that aside. [*Here's some source code*](http://sourceforge.net/projects/randompausedemo/) If you can speed it up by 3 orders of magnitude, without looking at how I did it, using any other method, you will have bragging rights :) – Mike Dunlavey Jul 04 '13 at 23:02
  • @BlueRaja-DannyPflughoeft: BTW, I'm strongly agreeing with you when you say "more often than not, you don't want to speed up the hotspots, you want to stop them from being called so much". If a particular function call site appears on >1 stack samples, and you don't really need it, you've got a Bingo. [*Here's the math that says how much you can save.*](http://scicomp.stackexchange.com/a/2719/1262) – Mike Dunlavey Jul 05 '13 at 18:52
  • @BlueRaja: There are several different kinds of profilers, most of which suffer serious flaws. The worst one is GPROF (IMHO) and the best one [*Zoom*](http://www.rotateright.com/), because it samples the stack on wall-clock time and reports line-level inclusive percent. (Even so, it doesn't beat pausing.) Presentation is everything - it doesn't fall into the traps of hot-path, call-graph, flame-graphs or recursion. There may be other profilers similar to Zoom - I don't know. In [*this video*](https://www.youtube.com/watch?v=xPg3sRpdW1U) at 5:18, it tries to answer the objections you raised. – Mike Dunlavey Aug 02 '14 at 21:15

If you identify a method as a bottleneck, but you don't know what to do about it, you are essentially stuck.

So I'll list a few things. None of these are silver bullets, and you will still have to profile your code. I'm just suggesting things you could do that can sometimes help. The first three in particular are important.

  • Try solving the problem using just (or: mainly) low-level types or arrays of them.
  • Problems are often small - using a smart but complex algorithm does not always make you win, especially if the less-smart algorithm can be expressed in code that only uses (arrays of) low level types. Take for example InsertionSort vs MergeSort for n<=100 or Tarjan's Dominator finding algorithm vs using bitvectors to naively solve the data-flow form of the problem for n<=100. (the 100 is of course just to give you some idea - profile!)
  • Consider writing a special case that can be solved using just low-level types (often problem instances of size < 64), even if you have to keep the other code around for larger problem instances.
  • Learn bitwise arithmetic to help you with the two ideas above.
  • BitArray can be your friend, compared to Dictionary, or worse, List. But beware that the implementation is not optimal; you can write a faster version yourself. Instead of testing that your arguments are out of range etc., you can often structure your algorithm so that the index cannot go out of range anyway - but you cannot remove the check from the standard BitArray, and it is not free.
  • As an example of what you can do with just arrays of low-level types, a BitMatrix is a rather powerful structure that can be implemented as just an array of ulongs, and you can even traverse it using a ulong as the "front" because you can take the lowest-order bit in constant time (compared with the Queue in Breadth First Search - but obviously the order is different and depends on the index of the items rather than purely the order in which you find them). See the sketch after this list.
  • Division and modulo are really slow unless the right hand side is a constant.
  • Floating point math is not in general slower than integer math anymore (not "something you can do", but "something you can skip doing")
  • Branching is not free. If you can avoid it using simple arithmetic (anything but division or modulo) you can sometimes gain some performance. Moving a branch to outside a loop is almost always a good idea.
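A small, self-contained sketch of the ulong-based ideas above (the values are invented; this is a toy, not a tuned library):

using System;

class BitTricksSketch
{
    static void Main()
    {
        // One 64-bit word of a bit set / BitMatrix row, with bits 3, 17 and 40 set.
        ulong row = (1UL << 3) | (1UL << 17) | (1UL << 40);

        // Visit the set bits from lowest to highest without scanning all 64 positions.
        while (row != 0)
        {
            ulong lowest = row & (ulong)(-(long)row);   // isolate the lowest set bit
            Console.WriteLine("bit " + IndexOf(lowest) + " is set");
            row &= row - 1;                             // clear that lowest set bit
        }

        // Power-of-two modulo as a mask (the JIT already does this when the divisor is a constant).
        int slot = 77 & (64 - 1);                       // same result as 77 % 64
        Console.WriteLine(slot);
    }

    // Position of the single set bit; a simple loop is fine for a sketch.
    static int IndexOf(ulong singleBit)
    {
        int index = 0;
        while ((singleBit >>= 1) != 0) index++;
        return index;
    }
}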
Peter Mortensen
harold

People have funny ideas about what actually matters. Stack Overflow is full of questions about, for example, whether ++i is more "performant" than i++. Here's an example of real performance tuning, and it's basically the same procedure for any language. If code is simply written a certain way "because it's faster", that's guessing.

Sure, you don't purposely write stupid code, but if guessing worked, there would be no need for profilers and profiling techniques.

Mike Dunlavey

The truth is that there is no such thing as the perfect optimised code. You can, however, optimise for a specific portion of code, on a known system (or set of systems) on a known CPU type (and count), a known platform (Microsoft? Mono?), a known framework / BCL version, a known CLI version, a known compiler version (bugs, specification changes, tweaks), a known amount of total and available memory, a known assembly origin (GAC? disk? remote?), with known background system activity from other processes.

In the real world, use a profiler, and look at the important bits; usually the obvious things are anything involving I/O, anything involving threading (again, this changes hugely between versions), and anything involving loops and lookups, but you might be surprised at what "obviously bad" code isn't actually a problem, and what "obviously good" code is a huge culprit.

Peter Mortensen
Marc Gravell

Tell the compiler what to do, not how to do it. As an example, foreach (var item in list) is better than for (int i = 0; i < list.Count; i++), and m = list.Max(i => i.value); is better than sorting and taking the last element: list.Sort((a, b) => a.value.CompareTo(b.value)); m = list[list.Count - 1].value;

By telling the system what you want to do, it can figure out the best way to do it. LINQ is good because its results aren't computed until you need them. If you only ever use the first result, it doesn't have to compute the rest.

Ultimately (and this applies to all programming) minimize loops and minimize what you do in loops. Even more important is to minimize the number of loops inside your loops. What's the difference between an O(n) algorithm and an O(n^2) algorithm? The O(n^2) algorithm has a loop inside of a loop.
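A small self-contained sketch of the deferred-execution point (the sequence and the predicate are purely illustrative):

using System;
using System.Collections.Generic;
using System.Linq;

class DeferredSketch
{
    // An endless sequence that announces each element it produces.
    static IEnumerable<int> Numbers()
    {
        for (int i = 0; ; i++)
        {
            Console.WriteLine("producing " + i);
            yield return i;
        }
    }

    static void Main()
    {
        // Nothing runs yet: the query is only a description of the work.
        var query = Numbers().Where(n => n > 0 && n % 7 == 0);

        // First() pulls items only until one matches, then stops;
        // the rest of the (infinite) sequence is never produced.
        int firstMultipleOfSeven = query.First();
        Console.WriteLine("result: " + firstMultipleOfSeven);
    }
}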

Gabe
  • Ironically, LINQ adds extra sausage, and one should wonder if a solution without it exists. – Peter Mar 31 '16 at 09:27

I don't really try to optimize my code, but at times I will go through and use something like Reflector to decompile my programs back to source. It is interesting to then compare what I wrote with what Reflector outputs. Sometimes I find that what I did in a more complicated form could have been simplified. It may not optimize things, but it helps me see simpler solutions to problems.

Aaron Havens