Surprising Substring behavior

Question

I came across this behavior today while using the Substring method:

static void Main(string[] args) {
    string test = "123";
    for (int i = 0; true; i++) {
        try {
            Console.WriteLine("\"{0}\".Substring({1}) is \"{2}\"", test, i, test.Substring(i));
        } catch (ArgumentOutOfRangeException) {
            Console.WriteLine("\"{0}\".Substring({1}) threw an exception.", test, i);
            break;
        }
    }
}

Output:

"123".Substring(0) is "123"
"123".Substring(1) is "23"
"123".Substring(2) is "3"
"123".Substring(3) is ""
"123".Substring(4) threw an exception.

"123".Substring(3) returns an empty string and "123".Substring(4) throws an exception. However, "123"[3] and "123"[4] are both out of bounds. This is documented on MSDN, but I'm having a hard time understanding why the Substring method is written this way. I'd expect any out-of-bounds index to either always result in an exception or always result in an empty string. Any insight?

Is your question why Substring(3) returns a different result from Substring(4)? — PriestVallon, Jul 28 '12 at 22:01
Daniel: test.Substring(3) and test.Substring(4) both supply an out-of-bounds index, but they behave differently. — Pete Schlette, Jul 28 '12 at 22:21
`Substring(3)` is only out of bounds when you expect (demand) a non-empty result. — H H, Jul 28 '12 at 22:40
Also see [Unexpected behavior of Substring in C#](https://stackoverflow.com/questions/32906320/unexpected-behavior-of-substring-in-c-sharp) — Arghya C, Apr 04 '18 at 07:31

Steve · Accepted Answer · 2012-07-28T22:43:35.073

The internal implementation of String.Substring(startIndex) looks like this:

public string Substring(int startIndex)
{
    return this.Substring(startIndex, this.Length - startIndex);
}

So "123".Substring(3) asks for a string of zero characters, i.e. String.Empty. I agree that Microsoft does not make the rationale clear, but absent a better explanation, I think returning this result is better than throwing an exception.

Going deeper into the implementation of String.Substring(startIndex, length), we see this code:

if (length == 0)
{
    return Empty;
}

So, because length=0 is a valid input in the second overload, we get that result also for the first one.
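To make the boundary explicit, here is a minimal sketch of the argument checks, assuming they follow the documented rule that startIndex may equal the string's length. The class SubstringSketch and the method Sub are hypothetical names made up for illustration; this is not the framework source.

```csharp
using System;

static class SubstringSketch
{
    // Hypothetical stand-in for String.Substring(startIndex, length); not the framework source.
    public static string Sub(string s, int startIndex, int length)
    {
        // startIndex == s.Length is allowed; only values beyond it are rejected.
        if (startIndex < 0 || startIndex > s.Length)
            throw new ArgumentOutOfRangeException(nameof(startIndex));
        if (length < 0 || startIndex > s.Length - length)
            throw new ArgumentOutOfRangeException(nameof(length));
        if (length == 0)
            return string.Empty; // the early return quoted above
        return new string(s.ToCharArray(), startIndex, length);
    }

    // Single-argument overload, delegating the same way as the snippet above.
    public static string Sub(string s, int startIndex)
    {
        return Sub(s, startIndex, s.Length - startIndex);
    }

    static void Main()
    {
        Console.WriteLine("\"{0}\"", Sub("123", 3)); // empty string, no exception
        try
        {
            Sub("123", 4);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Sub(\"123\", 4) threw an exception.");
        }
    }
}
```

With startIndex equal to 3, the computed length is 0 and every check passes; with startIndex equal to 4, the first check fails before the length is ever examined.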

Legionair · Answer 2 · 2012-07-28T23:37:02.613

The documentation for .NET's Substring clearly states that it throws an exception if the index is greater than the length of the string, which for "123" is 3.

I guess the reason might be compatibility: matching the behavior of the C++ substr function. In C++,

test.substr(3)

returns an empty string because of null termination, which means the string "123" actually contains 4 characters (the last one being \0).

That is probably the intention behind this behavior, even though .NET strings are not null-terminated according to the specification (although the implementation actually does null-terminate them).

David · score 1 · Answer 3 · answered Jul 28 '12 at 22:25

One convenience of this implementation is that a loop operating on arbitrary strings (for example, returning the second half of each string) does not have to handle the empty string as a special case.
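A small sketch of that convenience, assuming "second half" means everything from the midpoint index onward. SecondHalfDemo and SecondHalf are hypothetical names used only for this illustration.

```csharp
using System;

static class SecondHalfDemo
{
    // Hypothetical helper: returns the part of s starting at its midpoint.
    // For an empty string this calls Substring(0) with Length == 0,
    // which is the "startIndex == Length" case and returns "".
    public static string SecondHalf(string s)
    {
        return s.Substring(s.Length / 2);
    }

    static void Main()
    {
        foreach (string s in new[] { "", "a", "ab", "abcde" })
        {
            Console.WriteLine("\"{0}\" -> \"{1}\"", s, SecondHalf(s));
        }
    }
}
```

No branch for the empty string is needed: the same expression covers every input.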

score 1 · Answer 4 · answered Jul 28 '12 at 22:26

I'm not sure why, and I can't think of a great reason either, but I suppose that if you want to check whether a Substring call is at the end of a string, returning string.Empty is less expensive than throwing an exception.

Also, I suppose you are just asking for the part of the string after the indexed character, which would be empty, whereas the index after that is truly out of range.
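One way to picture this is to treat the index as a split position between characters rather than as a character index. A string of length 3 then has four valid positions (0 through 3), and the position just past the last character simply leaves nothing on the right. The sketch below illustrates that reading; SplitPointDemo and SplitAt are hypothetical names.

```csharp
using System;

static class SplitPointDemo
{
    // Hypothetical helper: splits s into the text before and after position i.
    // Valid positions are 0..s.Length inclusive; anything else throws.
    public static string[] SplitAt(string s, int i)
    {
        return new[] { s.Substring(0, i), s.Substring(i) };
    }

    static void Main()
    {
        string test = "123";
        for (int i = 0; i <= test.Length; i++)
        {
            string[] parts = SplitAt(test, i);
            // The two parts always concatenate back to the original string.
            Console.WriteLine("{0}: \"{1}\" + \"{2}\"", i, parts[0], parts[1]);
        }
    }
}
```

At position 3 the right-hand part is empty, while position 4 does not exist in the string at all, which matches the exception.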
