13

I'm trying to speed up the following:

string s; // s is never null

if (s.Length != 0)
{
    // do something
}

Problem is, it appears the .Length actually counts the characters in the string, and this is way more work than I need. Anybody have an idea on how to speed this up?

Or is there a way to determine whether s[0] exists without checking the rest of the string?

Anthony Pegram
Eric
  • 1
    The perf problem you face is very likely solved somewhere else! Did you profile and find that the bottleneck is really here? –  Aug 02 '10 at 17:44
  • Out of curiosity what version of the framework are you targeting? – Conrad Frix Aug 02 '10 at 18:00
  • **NOTE** for other would-be-answerers: Eric's posted an answer that clarifies his question (kinda) below.... – Dan Puzey Aug 02 '10 at 18:55

9 Answers

24

EDIT: Now that you've provided some more context:

  • Trying to reproduce this, I failed to find a bottleneck in string.Length at all. The only way of making it faster was to comment out both the test and the body of the if block - which isn't really fair. Just commenting out the condition slowed things down, i.e. unconditionally copying the reference was slower than checking the condition.

  • As has been pointed out, using the overload of string.Split which removes empty entries for you is the real killer optimization.

  • You can go further, by avoiding creating a new char array with just a space in every time. You're always going to pass the same thing effectively, so why not take advantage of that?

  • Empty arrays are effectively immutable. You can optimize the null/empty case by always returning the same thing.

The optimized code becomes:

private static readonly char[] Delimiters = " ".ToCharArray();
private static readonly string[] EmptyArray = new string[0];

public static string[] SplitOnMultiSpaces(string text)
{
    if (string.IsNullOrEmpty(text))
    {
        return EmptyArray;
    }

    return text.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
}
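A quick usage sketch of the method above, assuming it lives in a hypothetical `SplitDemo` class (the class name and the sample inputs are made up for illustration):

```csharp
using System;

public static class SplitDemo
{
    // Delimiter and empty-result arrays are allocated once and reused on every call.
    private static readonly char[] Delimiters = { ' ' };
    private static readonly string[] EmptyArray = new string[0];

    public static string[] SplitOnMultiSpaces(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return EmptyArray;
        }

        return text.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Runs of consecutive spaces produce no empty entries.
        string[] parts = SplitOnMultiSpaces("a  b   c");
        Console.WriteLine(string.Join("|", parts)); // prints "a|b|c"

        // Null and empty input both return the same cached array instance.
        Console.WriteLine(object.ReferenceEquals(SplitOnMultiSpaces(null), SplitOnMultiSpaces(""))); // prints "True"
    }
}
```

Sharing one empty array is safe here only because a zero-length array has no elements a caller could modify.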

String.Length absolutely does not count the letters in the string. The value is stored as a field - although I seem to remember that the top bit of that field is used to remember whether or not all characters are ASCII (or used to be, anyway) to enable other optimisations. So the property access may need to do a bitmask, but it'll still be O(1) and I'd expect the JIT to inline it, too. (It's implemented as an extern, but hopefully that wouldn't affect the JIT in this case - I suspect it's a common enough operation to potentially have special support.)

If you already know that the string isn't null, then your existing test of

if (s.Length != 0)

is the best way to go if you're looking for raw performance IMO. Personally in most cases I'd write:

if (s != "")

to make it clearer that we're not so much interested in the length as a value as whether or not this is the empty string. That will be slightly slower than the length test, but I believe it's clearer. As ever, I'd go for the clearest code until you have benchmark/profiling data to indicate that this really is a bottleneck. I know your question is explicitly about finding the most efficient test, but I thought I'd mention this anyway. Do you have evidence that this is a bottleneck?

EDIT: Just to give clearer reasons for my suggestion of not using string.IsNullOrEmpty: a call to that method suggests to me that the caller is explicitly trying to deal with the case where the variable is null, otherwise they wouldn't have mentioned it. If at this point of the code it counts as a bug if the variable is null, then you shouldn't be trying to handle it as a normal case.

In this situation, the Length check is actually better in one way than the inequality test I've suggested: it acts as an implicit assertion that the variable isn't null. If you have a bug and it is null, the test will throw an exception and the bug will be detected early. If you use the inequality test (s != ""), it will treat null as being different to the empty string, so it will go into your "if" statement's body. If you use string.IsNullOrEmpty, it will treat null as being the same as empty, so it won't go into the block.
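The three behaviours for a null reference can be seen side by side in a small sketch. The class and method names (`EmptyCheckDemo`, `ByLength`, and so on) are hypothetical and exist only for this illustration:

```csharp
using System;

public static class EmptyCheckDemo
{
    // Length test: throws NullReferenceException if s is null.
    public static bool ByLength(string s) { return s.Length != 0; }

    // Inequality test: null is not equal to "", so this returns true for null.
    public static bool ByInequality(string s) { return s != ""; }

    // IsNullOrEmpty: null is treated like empty, so this returns false for null.
    public static bool ByIsNullOrEmpty(string s) { return !string.IsNullOrEmpty(s); }

    public static void Main()
    {
        Console.WriteLine(ByInequality(null));    // True: the "if" body would run
        Console.WriteLine(ByIsNullOrEmpty(null)); // False: the "if" body is skipped

        try
        {
            ByLength(null);
        }
        catch (NullReferenceException)
        {
            // The bug surfaces immediately instead of being silently handled.
            Console.WriteLine("Length check threw");
        }
    }
}
```

For non-null strings all three tests agree; they differ only in how a null reference is treated.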

Jon Skeet
  • 2
    I think the best thing about this is the last line. – msarchet Aug 02 '10 at 17:32
  • @msarchet: It's unfortunate that I edited the answer while you were writing that comment. I assume you mean "Do you have evidence that this *is* a bottleneck?" – Jon Skeet Aug 02 '10 at 17:36
  • The post is old, but could you please give some info about the "ANSI bit"? As far as I understand the length is stored as it is before the string itself. `unsafe void PeekLength(String s) { fixed (char* st = s) { try { int* l = (int*)(st - 2); Console.WriteLine("Length is " + *l); } catch(Exception e) { Console.WriteLine(e.Message); } } }` – Sergey.quixoticaxis.Ivanov Jul 17 '17 at 12:40
  • @Sergey.quixoticaxis.Ivanov: Assuming you mean ASCII rather than ANSI, I'd expect your PeekLength method to sometimes give the wrong answer, as there used to be a top bit set to say "This string is definitely all ASCII". I don't know whether that's still the case though. – Jon Skeet Jul 17 '17 at 12:52
  • @JonSkeet sure, I've meant ASCII =) Seems like it's not the case at least now on 4.7. PeekLength prints 2 for "青空" ("blue sky"). Is it possible to read about it somewhere and the reasoning behind? – Sergey.quixoticaxis.Ivanov Jul 17 '17 at 13:10
  • 1
    @Sergey.quixoticaxis.Ivanov: That looks like a bad test case given that it's not ASCII to start with... But I can't remember where I heard about it anyway. – Jon Skeet Jul 17 '17 at 13:20
  • @JonSkeet yeap, I mentioned only obvious not ASCII because my first runs were with "aaa" and "\0\0\0\0" strings. Ok, thanks, nevermind, I'll be keeping an eye on the info about it 'cause it sounds interesting. – Sergey.quixoticaxis.Ivanov Jul 17 '17 at 14:58
  • @JonSkeet I accidentally stumbled upon [sync block implementation](https://github.com/dotnet/coreclr/blob/master/src/vm/syncblk.h). For `String` it has a `BIT_SBLK_STRING_HAS_NO_HIGH_CHARS`. The length is stored at the different (also negative) offset as far as I understand. – Sergey.quixoticaxis.Ivanov Nov 01 '17 at 02:15
11

String.IsNullOrEmpty is the preferred method for checking for null or zero length strings.

Internally, it will use Length. The Length property for a string should not be calculated on the fly though.

If you're absolutely certain that the string will never be null and you have some strong objection to String.IsNullOrEmpty, the most efficient code I can think of would be:

if(s.Length > 0)
{
    // Do Something
}

Or, possibly even better:

if(s != "")
{
    // Do Something
}
Justin Niessner
5

Accessing the Length property shouldn't do a count -- .NET strings store a count inside the object.

The SSCLI/Rotor source code contains an interesting comment which suggests that String.Length is (a) efficient and (b) magic:

// Gets the length of this string
//
/// This is a EE implemented function so that the JIT can recognise is specially
/// and eliminate checks on character fetchs in a loop like:
/// for(int I = 0; I < str.Length; i++) str[i]
/// The actually code generated for this will be one instruction and will be inlined.
//
public extern int Length {
    [MethodImplAttribute(MethodImplOptions.InternalCall)]
    get;
}
Tim Robinson
3

Use the String.IsNullOrEmpty method:

if (!String.IsNullOrEmpty(yourstring))
{
  // your code
}
ankitjaininfo
  • Because you had initially posted something different which didn't compile at all. I retracted it now that you've fixed it to match what someone else posted before you. – Hut8 Aug 02 '10 at 17:50
  • Specifically this: if (![String.IsNullOrEmpty][1](yourstring)) – Hut8 Aug 02 '10 at 18:34
  • that I was trying to put hyperlink.. but then realized that code block cannot. I should have seen preview. btw, thanks for your efforts. – ankitjaininfo Aug 02 '10 at 18:40
1
String.IsNullOrWhiteSpace(s);

Returns true if s is null, empty, or consists exclusively of white-space characters.
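A minimal sketch comparing the two built-in checks on a few made-up inputs (the `WhiteSpaceDemo` class is hypothetical):

```csharp
using System;

public static class WhiteSpaceDemo
{
    public static void Main()
    {
        string[] samples = { null, "", "   ", " a " };

        foreach (string s in samples)
        {
            // The two methods differ only for the whitespace-only sample.
            Console.WriteLine("IsNullOrEmpty={0}, IsNullOrWhiteSpace={1}",
                string.IsNullOrEmpty(s), string.IsNullOrWhiteSpace(s));
        }
    }
}
```

Note that this answers a slightly different question from the original one: a string of spaces has a non-zero Length but still counts as blank here, and the method has to inspect characters to decide that.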

Suraj Rao
0

As always with performance: benchmark.
On .NET 3.5 or earlier, compare yourString.Length against String.IsNullOrEmpty(yourString).

On .NET 4 or later, test both of the above and add String.IsNullOrWhiteSpace(yourString).

Of course, if you know your string will never be empty, you could just attempt to access s[0] and handle the exception when it's not there. That's not normally good practice, but it may be closer to what you need (if s should always have a non-blank value).
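A rough sketch of that s[0] idea next to the plain length check, so the two can be benchmarked against each other. The class and method names are hypothetical, and since thrown exceptions are typically far more expensive than a property read, the exception route probably only makes sense when an empty string is genuinely exceptional:

```csharp
using System;

public static class FirstCharDemo
{
    // Exception-based variant: relies on the indexer throwing for an empty string.
    public static bool HasFirstCharViaException(string s)
    {
        try
        {
            char first = s[0]; // throws IndexOutOfRangeException when s is empty
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            return false;
        }
    }

    // Plain length check giving the same answer without any exception.
    public static bool HasFirstCharViaLength(string s)
    {
        return s.Length != 0;
    }

    public static void Main()
    {
        Console.WriteLine(HasFirstCharViaException(""));    // False
        Console.WriteLine(HasFirstCharViaException("abc")); // True
        Console.WriteLine(HasFirstCharViaLength(""));       // False
    }
}
```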

AllenG
0
for (int i = 0; i < 100; i++)
{
    System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();
    string s = "dsfasdfsdafasd";

    // Time the length check (condition matches the label printed below).
    timer.Start();
    if (s.Length != 0)
    {
    }

    timer.Stop();
    System.Diagnostics.Debug.Write(String.Format("s.Length != 0 {0} ticks       ", timer.ElapsedTicks));

    // Time the comparison against String.Empty.
    timer.Reset();
    timer.Start();
    if (s == String.Empty)
    {
    }

    timer.Stop();
    System.Diagnostics.Debug.WriteLine(String.Format("s == String.Empty {0} ticks", timer.ElapsedTicks));
}

After fixing the code, the Stopwatch shows that s.Length != 0 takes fewer ticks than s == String.Empty.

Mike
0

Based on your intent described in your answer, why don't you just try using this built-in option on Split:

s.Split(new[]{" "}, StringSplitOptions.RemoveEmptyEntries);
drharris
-1

Just use String.Split(new char[]{' '}, StringSplitOptions.RemoveEmptyEntries) and it will do it all for you.

Suraj Rao
Star