
While the question "check if input is type of string" has been closed, two of the answers sparked a micro-optimization question in my mind: which of the two solutions below would perform better?

Reed Copsey provided a solution using Char.IsLetter:

string myString = "RandomStringOfLetters";
bool allLetters = myString.All( c => Char.IsLetter(c) );

Adapted solution using regex from Mark Byers:

string s = "RandomStringOfLetters";
bool allLetters = Regex.IsMatch(s, "^[a-z]+$", RegexOptions.IgnoreCase);

Not wanting to simply ask Reed or Mark, I thought I'd write a quick test to determine which performed better. The problem is that I haven't done much code optimization (I tend to put code readability above all else).

Other than taking a timestamp before and after each run, what are some other (better?) options for determining which solution runs faster?

Edit

I modified Martin's answer to work with Console.WriteLine(...) and ran it as a console application. I'm not sure exactly how LINQPad runs applications, but the results were about the same:

41
178
ahsteele
  • @David: Using a Stopwatch will provide access to the high performance timers in Windows, which will usually give you much more accurate results than a time stamp before/after. – Reed Copsey Jul 21 '10 at 16:05
  • Note that the two pieces of code are not equivalent: `Char.IsLetter` is Unicode-aware, while the regex only allows non-accented Latin letters. For that reason alone, I'd go with `Char.IsLetter` unless there was a really compelling reason (read: requirement) not to. – Michael Madsen Jul 21 '10 at 16:08
  • Just an idea, but you might also want to test `!Regex.IsMatch(s, "[^a-z]", RegexOptions.IgnoreCase)`. Note though that this produces a different result for empty strings (it becomes vacuously true that an empty string is all letters). – Michael Petito Jul 21 '10 at 16:12
  • @Michael Madsen: You could still use a Regex approach with the `\w` character class or your own combination of Unicode categories. – Michael Petito Jul 21 '10 at 16:15
  • @Reed posted the results which are the same as what Martin got: http://stackoverflow.com/questions/3301288/how-to-test-what-method-implementation-runs-faster/3301357#3301357 – ahsteele Jul 21 '10 at 16:45
  • @Reed sorry for misspelling your name. :) – ahsteele Jul 21 '10 at 16:46
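As an aside on the correctness point raised in the comments above, a quick check (my own example string, not from either answer) shows the two approaches disagree on accented input:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class LetterCheckDemo
{
    static void Main()
    {
        // "é" is a letter to Char.IsLetter, but falls outside [a-z]
        string accented = "Résumé";

        bool viaIsLetter = accented.All(c => Char.IsLetter(c));
        bool viaRegex = Regex.IsMatch(accented, "^[a-z]+$", RegexOptions.IgnoreCase);

        Console.WriteLine(viaIsLetter); // True
        Console.WriteLine(viaRegex);    // False
    }
}
```

So before benchmarking, it's worth deciding which behavior is actually the requirement; the two snippets are only interchangeable for plain ASCII input.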

6 Answers


You'll want to measure the runtimes using a Stopwatch. Also, here are a few very important things to keep in mind when profiling:

  1. Always run your test more than once. The first time you run it, there will be overhead from the JIT, and the timings may be misleading. Running many times and taking the average is a good approach (I'll often run a test like this 100,000 times, for example).
  2. Always run your test with a full Release build, outside of the Visual Studio hosting process. (By default, you can use Ctrl+F5 for this.) The Visual Studio host dramatically impacts timings.
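Putting both points together, a minimal harness might look like the sketch below (the iteration count and the `TimeAction` helper are my own, not from the question):

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

class Benchmark
{
    // Runs the action once to warm up the JIT, then times many iterations
    // and reports both the total and the average per call.
    static void TimeAction(string label, int iterations, Action action)
    {
        action(); // warm-up: JIT-compile before timing

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        double perCall = (double)stopwatch.ElapsedMilliseconds / iterations;
        Console.WriteLine("{0}: {1} ms total, {2:F6} ms/call",
            label, stopwatch.ElapsedMilliseconds, perCall);
    }

    static void Main()
    {
        const string input = "RandomStringOfLetters";
        const int iterations = 100000;

        TimeAction("Char.IsLetter", iterations,
            () => input.All(c => Char.IsLetter(c)));
        TimeAction("Regex", iterations,
            () => Regex.IsMatch(input, "^[a-z]+$", RegexOptions.IgnoreCase));
    }
}
```

Remember to build this in Release mode and run it from the command line (or with Ctrl+F5), per point 2 above.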
Reed Copsey
  • Yeah - it's very important to do both of those things, too - especially for "micro" timings (i.e. timing operations that run quickly on their own). – Reed Copsey Jul 21 '10 at 16:05
  • To emphasize what Reed is saying: there is some error in measuring the time something takes. For example, if your stopwatch is only accurate to the millisecond but the operation takes nanoseconds, you will have a very inaccurate measurement. To solve this, start the Stopwatch, run the method in a loop many times (1000+, maybe), and then stop the Stopwatch. You can then divide the total time by the iteration count to get a time per method call. – AaronLS Jul 21 '10 at 16:20

You should check out System.Diagnostics.Stopwatch!

http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx

You should run the thing many times in a loop to reduce timing errors and other uncontrollable factors.

Hope that helps.

Kieren Johnstone

I just put this together in LINQPad as an example of how I'd do it (hence the calls to Dump(); replace them with Console.WriteLine(...) if you aren't using this handy tool).

Looks like the LINQ way is slightly more than four times faster:

var stopwatch = new System.Diagnostics.Stopwatch();

// Time the LINQ / Char.IsLetter approach
stopwatch.Start();
for (int i = 0; i < 100000; i++)
{
    string myString = "RandomStringOfLetters";
    bool allLetters = myString.All(c => Char.IsLetter(c));
}
stopwatch.Stop();
stopwatch.ElapsedMilliseconds.Dump();

stopwatch.Reset();

// Time the Regex approach
stopwatch.Start();
for (int i = 0; i < 100000; i++)
{
    string s = "RandomStringOfLetters";
    bool allLetters = Regex.IsMatch(s, "^[a-z]+$", RegexOptions.IgnoreCase);
}
stopwatch.Stop();
stopwatch.ElapsedMilliseconds.Dump();

Output:

47 
196
Martin Harris

There is a System.Diagnostics.Stopwatch class that can be used.

Whatever code you test, run it once first to remove JIT costs, and then run it again for the final timings. Any individual timing can be unrepresentative due to other activity on the PC, so run many iterations and calculate the average runtime from those.

Adam Houldsworth

Use the System.Diagnostics.Stopwatch class.

Start the Stopwatch, run several thousand iterations, stop it, and check the total milliseconds that have elapsed.

MaLio

Steps to determine which is faster:

  1. Get a collection of computers (a couple of hundred should do): AMD/Intel/other, 32-bit/64-bit, ...

  2. Install every .NET framework you care about on each of them (in turn)

  3. Try each combination of optimization options for compilation (in turn)

  4. Use Stopwatch to test a large run for each

  5. Monitor memory utilization for each as that may have a larger impact on the rest of your application. Saving a few cycles at the cost of increased memory consumption and more garbage collection activity is often a poor 'optimization'.

That might give you some idea about which is faster in practice, at least for current releases of the compiler. Repeat with each new release of the compiler.

Ian Mercer
  • For a huge project, this would probably be a good idea. For a non-huge project, this seems like an insanely expensive way of not really generating any large amount of value. – Arve Systad Jul 21 '10 at 16:15
  • I was actually trying to point out that this is in fact *almost never* a good idea. Optimizing micro-second level operations is rarely the right thing to do: there are nearly always better ways to spend your time to improve overall performance. I was trying to point out that (a) optimization is much harder than it looks and (b) optimizing one aspect of your code can easily impact other areas. I'll make it clearer next time when I'm using 'reductio ad absurdum' to make a point. – Ian Mercer Jul 21 '10 at 16:56