13

I came across this behavior today while using the Substring method:

static void Main(string[] args) {
    string test = "123";
    for (int i = 0; true; i++) {
        try {
            Console.WriteLine("\"{0}\".Substring({1}) is \"{2}\"", test, i, test.Substring(i));
        } catch (ArgumentOutOfRangeException) {
            Console.WriteLine("\"{0}\".Substring({1}) threw an exception.", test, i);
            break;
        }
    }
}

Output:

"123".Substring(0) is "123"
"123".Substring(1) is "23"
"123".Substring(2) is "3"
"123".Substring(3) is ""
"123".Substring(4) threw an exception.

"123".Substring(3) returns an empty string and "123".Substring(4) throws an exception. However, "123"[3] and "123"[4] are both out of bounds. This is documented on MSDN, but I'm having a hard time understanding why the Substring method is written this way. I'd expect any out-of-bounds index to either always result in an exception or always result in an empty string. Any insight?

Uli Köhler
Pete Schlette

4 Answers

15

The internal implementation of String.Substring(startIndex) looks like this:

public string Substring(int startIndex)
{
    return this.Substring(startIndex, this.Length - startIndex);
}

So you are asking for a string of zero length (that is, String.Empty). I agree that Microsoft doesn't make the rationale clear, but absent a better explanation, I think returning this result is better than throwing an exception.

Going deeper into the implementation of String.Substring(startIndex, length), we see this code:

if (length == 0)
{
    return Empty;
}

So, because a length of 0 is valid input for the second overload, the first overload returns the same result.
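
To make the delegation concrete, here is a minimal sketch of how the two overloads could fit together. SubstringSketch is a hypothetical helper written for this illustration, not the framework source, and the argument checks reflect my understanding of the documented behavior rather than a verified copy of the real code. It shows why a start index equal to the length passes validation (the requested length becomes 0), while a start index one past that fails.

```csharp
using System;

public static class SubstringSketch
{
    // Hypothetical re-implementation for illustration only; not the
    // actual framework source. The checks follow the documented behavior.
    public static string Substring(string s, int startIndex, int length)
    {
        if (startIndex < 0 || startIndex > s.Length)
            throw new ArgumentOutOfRangeException(nameof(startIndex));
        if (length < 0 || startIndex > s.Length - length)
            throw new ArgumentOutOfRangeException(nameof(length));

        // A zero-length request is valid and yields the empty string.
        if (length == 0)
            return string.Empty;

        // Copy the requested characters into a new string.
        return new string(s.ToCharArray(), startIndex, length);
    }

    // The single-argument overload asks for everything from startIndex on.
    public static string Substring(string s, int startIndex)
    {
        return Substring(s, startIndex, s.Length - startIndex);
    }

    public static void Main()
    {
        // startIndex == Length: the computed length is 0, so this is valid.
        Console.WriteLine("[" + Substring("123", 3) + "]");

        try
        {
            // startIndex > Length: rejected by the first check.
            Substring("123", 4);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Substring(\"123\", 4) threw");
        }
    }
}
```

Under this reading, index 3 is not a special case at all: it is simply the largest start index for which a non-negative length still fits inside the string.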

Steve
3

The .NET documentation for Substring clearly states that it throws an exception if the index is greater than the length of the string, which for "123" is 3.

My guess is that the reason is compatibility: matching the behavior of the C++ substr function. In C++,

test.substr(3)

will return an empty string because of null termination, which means the string "123" actually contains 4 characters (the last one being \0).

That is probably the intention behind this behavior, even though .NET strings are not null-terminated per the specification (although the implementation actually does null-terminate them).

Legionair
1

One convenience that this implementation provides is that if you had a loop that was doing something to some arbitrary strings (for example, returning the second half of the string), you wouldn't have to handle the empty string as a special case.
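
As a small sketch of that convenience, the loop below takes the second half of several strings, including the empty one. SecondHalf is a hypothetical helper introduced only for this example. For the empty string, the start index (0) equals the length (0), so the call relies on exactly the behavior discussed in the question.

```csharp
using System;

public static class SecondHalfDemo
{
    // Hypothetical helper: returns the part of s from its midpoint onward.
    // No special case is needed for "" because Substring accepts a start
    // index equal to the string's length.
    public static string SecondHalf(string s)
    {
        return s.Substring(s.Length / 2);
    }

    public static void Main()
    {
        foreach (string s in new[] { "", "a", "ab", "abc", "abcd" })
        {
            Console.WriteLine("\"{0}\" -> \"{1}\"", s, SecondHalf(s));
        }
    }
}
```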

David
1

I'm not sure why, and I can't think of a great reason either, but I suppose that if you want to check whether a Substring call is at the end of a string, returning string.Empty is less expensive than throwing an exception.

Also, I suppose you are just asking for the part of the string that follows the last character, which is empty, whereas any index beyond that is truly out of range.
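
One way to picture this is to treat the start index as a position between characters rather than as a character itself. A string of length n then has n + 1 such positions, from 0 through n. The sketch below uses a hypothetical SplitAt helper (not part of the framework) to split "123" at every position; the two pieces always join back into the original string, and the final split leaves an empty tail.

```csharp
using System;

public static class SplitPositionsDemo
{
    // Hypothetical helper: splits s at position i, where i counts the
    // gaps between characters (0 = before the first, s.Length = after the last).
    public static string[] SplitAt(string s, int i)
    {
        string head = s.Substring(0, i);
        string tail = s.Substring(i);
        return new[] { head, tail };
    }

    public static void Main()
    {
        string test = "123";

        // Note the inclusive upper bound: test.Length is a valid position.
        for (int i = 0; i <= test.Length; i++)
        {
            string[] parts = SplitAt(test, i);
            Console.WriteLine("{0}: \"{1}\" | \"{2}\"", i, parts[0], parts[1]);
        }
    }
}
```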

Charleh