25

ILSpy shows that String.IsNullOrEmpty is implemented in terms of String.Length. But then why is String.IsNullOrEmpty(s) faster than s.Length == 0?

For example, it's 5% faster in this benchmark:

var stopwatches = Enumerable.Range(0, 4).Select(_ => new Stopwatch()).ToArray();
var strings = "A,B,,C,DE,F,,G,H,,,,I,J,,K,L,MN,OP,Q,R,STU,V,W,X,Y,Z,".Split(',');
var testers = new Func<string, bool>[] { s => s == String.Empty, s => s.Length == 0, s => String.IsNullOrEmpty(s), s => s == "" };
int count = 0;
for (int i = 0; i < 10000; ++i) {
    stopwatches[i % 4].Start();
    for (int j = 0; j < 1000; ++j)
        count += strings.Count(testers[i % 4]);
    stopwatches[i % 4].Stop();
}

(Other benchmarks show similar results. This one minimized the effect of cruft running on my computer. Also, as an aside, the tests comparing to empty strings came out the same at about 13% slower than IsNullOrEmpty.)

Additionally, why is IsNullOrEmpty only faster on x86, whereas on x64 String.Length is about 9% faster?

Update: Test setup details: .NET 4.0 running on 64-bit Windows 7, Intel Core i5 processor, console project compiled with "Optimize code" enabled. However, "Suppress JIT optimization on module load" was also enabled (see accepted answer and comments).

With optimization fully enabled, Length is about 14% faster than IsNullOrEmpty with the delegate and other overhead removed, as in this test:

var strings = "A,B,,C,DE,F,,G,H,,,,I,J,,K,L,MN,OP,Q,R,,STU,V,,W,,X,,,Y,,Z,".Split(',');
int count = 0;
for (uint i = 0; i < 100000000; ++i)
    count += strings[i % 32].Length == 0 ? 1 : 0; // Replace Length test with String.IsNullOrEmpty
Edward Brey
  • 40,302
  • 20
  • 199
  • 253
  • 8
    Without knowing the exact situation you're in, I'm fairly confident that in most situations there are other optimisations to agonise over instead of the quickest way to check whether a string doesn't have any data in it. – Andrew Apr 28 '12 at 03:41
  • @minitech Because there are 4 testers, and he's timing them independently. – Adam Liss Apr 28 '12 at 03:46
  • How do `Empty` and `==` compare with `Length` and `IsNullOrEmpty`? – Adam Liss Apr 28 '12 at 03:47
  • Try changing s.Length == 0 to Convert.ToBoolean(s.Length) Does that increase performance? – JustinDanielson Apr 28 '12 at 04:01
  • 3
    @Andrew I agree that the difference isn't significant unless empty checks make up an unusually large percentage of execution time. I was just curious, and now I'm even more curious. – Edward Brey Apr 28 '12 at 04:32
  • @Adam Assuming you mean the tests for `== Empty` and `== ""`, the question refers to those as the "tests comparing empty strings". They were about 13% slower than `IsNullOrEmpty`. – Edward Brey Apr 28 '12 at 04:37
  • @JustinDanielson `Convert.ToBoolean(s.Length)` came out slower than `s.Length == 0`, almost exactly as slow as comparing to an empty string, interestingly. – Edward Brey Apr 28 '12 at 04:43
  • As with all micro-optimizations, the error is almost certainly in the testing code, which isn't properly benchmarking the code under test. Doing effective microbenchmarks is hard; you can quite easily get whatever results you want through code that appears at first glance to be a fair comparison. – Servy Apr 19 '13 at 13:49
  • @Servy: You're right. The error is commonly with the benchmarking code or (as it turned out in this case) the execution environment. However, there are also real micro-opt-snafus, sometimes where you least expect them. For example, in investigating this question, I stumbled upon a situation where [adding local variables makes .NET code slower](http://stackoverflow.com/q/10369421/145173), which seems to be a genuine compiler bug. – Edward Brey Apr 19 '13 at 14:15

7 Answers

25

It's because you ran your benchmark from within Visual Studio, which prevents the JIT compiler from optimizing the code. Without optimizations, this code is produced for String.IsNullOrEmpty:

00000000   push        ebp 
00000001   mov         ebp,esp 
00000003   sub         esp,8 
00000006   mov         dword ptr [ebp-8],ecx 
00000009   cmp         dword ptr ds:[00153144h],0 
00000010   je          00000017 
00000012   call        64D85BDF 
00000017   mov         ecx,dword ptr [ebp-8] 
0000001a   call        63EF7C0C 
0000001f   mov         dword ptr [ebp-4],eax 
00000022   movzx       eax,byte ptr [ebp-4] 
00000026   mov         esp,ebp 
00000028   pop         ebp 
00000029   ret 

Now compare it to the code produced for Length == 0:

00000000   push   ebp 
00000001   mov    ebp,esp 
00000003   sub    esp,8 
00000006   mov    dword ptr [ebp-8],ecx 
00000009   cmp    dword ptr ds:[001E3144h],0 
00000010   je     00000017 
00000012   call   64C95BDF 
00000017   mov    ecx,dword ptr [ebp-8] 
0000001a   cmp    dword ptr [ecx],ecx 
0000001c   call   64EAA65B 
00000021   mov    dword ptr [ebp-4],eax 
00000024   cmp    dword ptr [ebp-4],0 
00000028   sete   al 
0000002b   movzx  eax,al 
0000002e   mov    esp,ebp 
00000030   pop    ebp 
00000031   ret 

You can see that the code for Length == 0 does everything the code for String.IsNullOrEmpty does, but it additionally compares the returned length with 0 and then converts that boolean result to a boolean again (the cmp, sete, movzx sequence at the end). This extra work makes it slower than String.IsNullOrEmpty.

If you compile the program with optimizations enabled (Release mode) and run the .exe file directly from Windows, the code generated by the JIT compiler is much better. For String.IsNullOrEmpty it is:

001f0650   push    ebp
001f0651   mov     ebp,esp
001f0653   test    ecx,ecx
001f0655   je      001f0663
001f0657   cmp     dword ptr [ecx+4],0
001f065b   sete    al
001f065e   movzx   eax,al
001f0661   jmp     001f0668
001f0663   mov     eax,1
001f0668   and     eax,0FFh
001f066d   pop     ebp
001f066e   ret

and for Length == 0:

001406f0   cmp     dword ptr [ecx+4],0
001406f4   sete    al
001406f7   movzx   eax,al
001406fa   ret

With this code, the results are as expected, i.e. Length == 0 is slightly faster than String.IsNullOrEmpty.

It's also worth mentioning that using LINQ, lambda expressions, and modulo computation in your benchmark is not a good idea, because these operations are slow relative to the string check itself and make the benchmark results inaccurate.
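A minimal sketch of a benchmark along those lines, with no LINQ, no delegates, and no modulo in the timed loops. The class name, method names, and the `passes` value are made up for illustration; the test data is taken from the question. It assumes a Release build run outside the debugger, and the timings it prints will differ between machines and runtimes.

```csharp
using System;
using System.Diagnostics;

static class EmptyCheckBenchmark
{
    static void Main()
    {
        // Same test data as the question's second benchmark.
        string[] strings = "A,B,,C,DE,F,,G,H,,,,I,J,,K,L,MN,OP,Q,R,,STU,V,,W,,X,,,Y,,Z,".Split(',');
        const int passes = 3000000; // hypothetical value; tune for your machine

        // Warm-up calls so that JIT compilation is not part of the timings.
        CountWithLength(strings, 1);
        CountWithIsNullOrEmpty(strings, 1);

        Stopwatch sw = Stopwatch.StartNew();
        int a = CountWithLength(strings, passes);
        sw.Stop();
        Console.WriteLine("Length == 0:   {0} ms (count {1})", sw.ElapsedMilliseconds, a);

        sw.Restart();
        int b = CountWithIsNullOrEmpty(strings, passes);
        sw.Stop();
        Console.WriteLine("IsNullOrEmpty: {0} ms (count {1})", sw.ElapsedMilliseconds, b);
    }

    // Direct property access in nested loops: no delegate call, no modulo.
    static int CountWithLength(string[] strings, int passes)
    {
        int count = 0;
        for (int p = 0; p < passes; ++p)
            for (int i = 0; i < strings.Length; ++i)
                if (strings[i].Length == 0) ++count;
        return count;
    }

    // Same loop shape, calling String.IsNullOrEmpty directly.
    static int CountWithIsNullOrEmpty(string[] strings, int passes)
    {
        int count = 0;
        for (int p = 0; p < passes; ++p)
            for (int i = 0; i < strings.Length; ++i)
                if (String.IsNullOrEmpty(strings[i])) ++count;
        return count;
    }
}
```

Printing the counts keeps the loops from being treated as dead code and doubles as a sanity check that both variants agree.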

Ňuf
  • 6,027
  • 2
  • 23
  • 26
  • I get the same results whether I run the tests within or outside Visual Studio. In both cases, I'm building in Release mode against .NET Framework 4, and the "Optimize code" setting is on in the project file (its default). In Visual Studio, I do see the unoptimized assembly code you posted. How do you view the assembly code that gets generated when running outside Visual Studio? – Edward Brey Apr 28 '12 at 11:25
  • That's strange. I tested this benchmark on 3 different computers with different OSes (Windows Server 2008 x64, Windows XP x86) and different CPUs, and I always find that Length==0 is faster. Additionally, I turned off generation of .PDB files in Visual Studio for this project, but this probably is not an issue. Did you try another computer? I attached to the running process with [WinDbg](http://archive.msdn.microsoft.com/debugtoolswindows) to see the optimized assembly code. – Ňuf Apr 28 '12 at 12:42
  • I reran the experiments of inside vs. outside VS2010 and could not reproduce what I reported in my earlier comment. Now I see the same as you, that `Length` is a tad faster when outside VS2010. I also remembered this setting: Tools > Options > Debugging > General > Suppress JIT optimization on module load. I had forgotten to turn that off. When I did, I get the same results in and out of VS2010, with `Length` being faster. Additionally, by toggling that JIT optimization setting, I can reproduce all 4 of your assembly code listings all within VS2010. – Edward Brey Apr 28 '12 at 14:15
  • This answer is very interesting. So, does this mean that .NET generally will not in-line `String.IsNullOrEmpty()`? This is what it appears from the disassembly you provided of the optimized call to `IsNullOrEmpty()`. – reirab Oct 22 '14 at 19:14
  • Additional question, now writing end of 2019, is `if (mystring?.Length > 0)` faster than `if (!string.IsNullOrEmpty(mystring))` ? – Thomas Williams Dec 05 '19 at 08:30
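Since the surprising result traced back to a debugger attached with "Suppress JIT optimization on module load" enabled, a benchmark can guard against that environment before timing anything. The following is only a sketch: the class name and warning texts are made up, and it checks just two conditions (an attached debugger, and an assembly compiled with the JIT optimizer disabled), not every setting that can affect code generation.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class BenchmarkEnvironmentCheck
{
    static void Main()
    {
        // An attached debugger may suppress JIT optimizations, depending on its settings.
        if (Debugger.IsAttached)
            Console.WriteLine("Warning: a debugger is attached; timings may reflect unoptimized code.");

        // Debug builds mark the assembly so that the JIT optimizer is disabled.
        var debuggable = (DebuggableAttribute)Attribute.GetCustomAttribute(
            Assembly.GetExecutingAssembly(), typeof(DebuggableAttribute));
        if (debuggable != null && debuggable.IsJITOptimizerDisabled)
            Console.WriteLine("Warning: this assembly was built with JIT optimizations disabled.");

        // x86 and x64 use different JIT compilers, so report which one is in play.
        Console.WriteLine("Running as a {0}-bit process.", IntPtr.Size * 8);
    }
}
```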
4

Your benchmark does not measure String.IsNullOrEmpty vs String.Length, but rather how different lambda expressions are compiled into functions. It is not very surprising that a delegate containing just a single function call (IsNullOrEmpty) is faster than one with a function call and a comparison (Length == 0).

To compare the actual calls, write code that calls them directly without delegates.

EDIT: My rough measurements show that the delegate version with IsNullOrEmpty is slightly faster than the rest, while direct calls to the same comparisons come out in reverse order (and about twice as fast, because there is significantly less extra code) on my machine. Results are likely to vary between machines, between x86 and x64 mode, and between versions of the runtime. For practical purposes I would consider all 4 ways about the same if you need to use them in LINQ queries.

Overall, I doubt there will be a measurable difference in a real program caused by the choice between these methods, so pick the one that is most readable to you and use it. I generally prefer IsNullOrEmpty since it leaves less chance of getting ==/!= wrong in a condition.

Removing string manipulation altogether from time-critical code will likely bring a much higher benefit than picking between these choices; dropping LINQ from critical code is also an option. As always, make sure to measure overall program speed in a real-life scenario.

Alexei Levenkov
  • 98,904
  • 14
  • 127
  • 179
1

Your test is wrong somewhere. IsNullOrEmpty can't be faster by definition, since it performs an additional null comparison and then tests the Length.

So the answer can be: it's faster because of your test. However even your code shows that IsNullOrEmpty is consistently slower on my machine in both x86 and x64 modes.

Petr Abdulin
  • 33,883
  • 9
  • 62
  • 96
  • I believe IsNullOrEmpty can execute faster in the case of a null string, as the length check is not performed. Although I doubt any appreciable increase in performance would be observed, if the string is often expected to be null, this check may make more sense. – overslacked Apr 28 '12 at 05:03
  • 1
    I believe it's not valid to talk about case of `null` strings since `.Length` is not applicable at all in that case :) – Petr Abdulin Apr 28 '12 at 05:10
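To illustrate the point about null strings, here is a small sketch (the class name is made up) showing that the two checks are not interchangeable when the string is null: String.IsNullOrEmpty returns true, while reading Length throws.

```csharp
using System;

static class NullVersusEmpty
{
    static void Main()
    {
        string s = null;

        // The null check inside IsNullOrEmpty short-circuits before Length is read.
        Console.WriteLine(String.IsNullOrEmpty(s)); // prints "True"

        try
        {
            // Reading Length on a null reference never yields a result.
            Console.WriteLine(s.Length == 0);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("s.Length threw NullReferenceException");
        }
    }
}
```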
1

I believe your test is not correct:

This test shows that string.IsNullOrEmpty is always slower than s.Length==0 because it performs an additional null check:

var strings = "A,B,,C,DE,F,,G,H,,,,I,J,,K,L,MN,OP,Q,R,STU,V,W,X,Y,Z,".Split(',');
var testers = new Func<string, bool>[] { 
    s => s == String.Empty, 
    s => s.Length == 0, 
    s => String.IsNullOrEmpty(s), 
    s => s == "" ,
};
int n = testers.Length;
var stopwatches = Enumerable.Range(0, testers.Length).Select(_ => new Stopwatch()).ToArray();
int count = 0;
for(int i = 0; i < n; ++i) { // iterate testers one by one
    Stopwatch sw = stopwatches[i];
    var tester = testers[i];
    sw.Start();
    for(int j = 0; j < 10000000; ++j) // increase this count for better precision
        count += strings.Count(tester);
    sw.Stop();
}
for(int i = 0; i < testers.Length; i++)
    Console.WriteLine(stopwatches[i].ElapsedMilliseconds);

Results:

6573
5328
5488
6419

You can use s.Length==0 when you are sure that the target data does not contain null strings. In other cases I suggest you use String.IsNullOrEmpty.

DmitryG
  • 17,677
  • 1
  • 30
  • 53
  • When I structure the test this way, I get the same results, but a higher standard deviation between tests. I think it's because it is easier for other processes or OS code to influence a single tester unfairly. On average, I still end up with `IsNullOrEmpty` being faster on x86. I'm running on a 64-bit Core i5. Do you consistently find `Length` is faster? – Edward Brey Apr 28 '12 at 11:02
0

I think it is impossible for IsNullOrEmpty to be faster because, as everyone else said, it also checks for null. But faster or not, the difference is going to be so small that the additional null check, which makes your code safer, is a point in favor of using IsNullOrEmpty.

Dummy01
  • 1,985
  • 1
  • 16
  • 21
-2

In CLR via C#, chapter 10 "Properties", Jeff Richter writes:

A property method can take a long time to execute; field access always completes immediately. A common reason to use properties is to perform thread synchronization, which can stop the thread forever, and therefore, a property should not be used if thread synchronization is required. In that situation, a method is preferred. Also, if your class can be accessed remotely (for example, your class is derived from System.MarshalByRefObject), calling the property method will be very slow, and therefore, a method is preferred to a property. In my opinion, classes derived from MarshalByRefObject should never use properties.

So String.Length is a property and String.IsNullOrEmpty is a method, and the method may execute faster than the property String.Length.

Hailei
  • 42,163
  • 6
  • 44
  • 69
Marshal
  • 6,551
  • 13
  • 55
  • 91
  • 1
    A property only really exists as metadata. When you "get" the property, it calls a regular method named `get_PropertyName`, and when you "set" the property it calls a regular method named `set_PropertyName`. From the perspective of JIT and execution time, there is no difference between a property and a method. – Sam Harwell Jun 23 '12 at 22:24
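The point in the comment above can be checked with reflection. This is a small sketch (the class name is made up) that looks up the getter behind String.Length and calls it like any other method:

```csharp
using System;
using System.Reflection;

static class PropertyIsAMethod
{
    static void Main()
    {
        // The Length property is metadata pointing at an ordinary getter method.
        PropertyInfo length = typeof(string).GetProperty("Length");
        MethodInfo getter = length.GetGetMethod();

        Console.WriteLine(getter.Name);                // prints "get_Length"
        Console.WriteLine(getter.Invoke("abc", null)); // prints "3"
    }
}
```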
-4

It may be caused by the types of the variables involved. *Empty seems to use a boolean, Length an int (I guess).

Peace !

McRasta
  • 1
  • 1
  • 3
    -1 for guessing. And guessing incorrectly. While you are most certainly encouraged to share your knowledge in this forum, guessing at an answer adds undesirable noise to our discourse. – Bob Kaufman May 23 '12 at 18:15