Background
I am working with a delimited string and was using String.Split to put each substring into an array when I noticed that the last element of the array was "". This threw off my results, since I was looking for a specific substring at the last index of the array. I eventually came across this post explaining that all strings end with string.Empty.
Example
The following shows this behavior in action. When I split my sentence and write each substring to the console, we can see the last element is the empty string:
    using System;

    public class Program
    {
        static void Main(string[] args)
        {
            const string mySentence = "Hello,this,is,my,string!";

            // Split on both "," and "!" and keep empty entries.
            var wordArray = mySentence.Split(new[] {",", "!"}, StringSplitOptions.None);

            foreach (var word in wordArray)
            {
                // Make empty entries visible in the console output.
                var message = word;
                if (word == string.Empty) message = "Empty string";
                Console.WriteLine(message);
            }

            Console.ReadKey();
        }
    }
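One way to probe where the empty element comes from is to split the same sentence with and without the trailing "!" and compare the element counts. This is only a minimal sketch: the class name SplitComparison and the shortened second sentence are made up for illustration and are not part of my original program.

```csharp
using System;

public class SplitComparison
{
    public static void Main()
    {
        var separators = new[] { ",", "!" };

        // Original sentence: ends with the "!" separator.
        string[] withTrailing =
            "Hello,this,is,my,string!".Split(separators, StringSplitOptions.None);

        // Hypothetical variant: same words, no separator at the end.
        string[] withoutTrailing =
            "Hello,this,is,my,string".Split(separators, StringSplitOptions.None);

        Console.WriteLine(withTrailing.Length);    // prints 6 (last element is "")
        Console.WriteLine(withoutTrailing.Length); // prints 5
    }
}
```

Since the count only changes when the sentence ends with a separator, the empty element appears to be the text that follows that final separator.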
Question & "Fix"
I get conceptually that there are empty strings between every character, but why does String behave like this even at the end of a string? It seems confusing that "ABC" is equivalent to "ABC" + "" or "ABC" + "" + "" + "", so why not treat the string literally as only "ABC"?
There is a "fix" that works around it and gets the "true" substrings I wanted:
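    using System;
    using System.Collections.Generic;

    public class Program
    {
        static void Main(string[] args)
        {
            const string mySentence = "Hello,this,is,my,string!";
            var wordArray = mySentence.Split(new[] {",", "!"}, StringSplitOptions.None);

            // Copy into a list so an element can be removed.
            var wordList = new List<string>(wordArray);

            // Remove the last empty entry. LastIndexOf returns -1 when there is
            // none, and RemoveAt(-1) would throw, so guard against that case.
            var lastEmpty = wordList.LastIndexOf(string.Empty);
            if (lastEmpty >= 0) wordList.RemoveAt(lastEmpty);

            foreach (var word in wordList)
            {
                var message = word;
                if (word == string.Empty) message = "Empty string";
                Console.WriteLine(message);
            }

            Console.ReadKey();
        }
    }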
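For comparison, here is a sketch of a shorter variant that lets Split discard empty entries itself via StringSplitOptions.RemoveEmptyEntries. The class name RemoveEmptyDemo is hypothetical, and this assumes empty entries are never wanted anywhere in the result, which is a stronger assumption than the list-based version above makes.

```csharp
using System;

public class RemoveEmptyDemo
{
    public static void Main()
    {
        const string mySentence = "Hello,this,is,my,string!";

        // RemoveEmptyEntries drops every empty substring from the result,
        // not only the one at the end.
        string[] words = mySentence.Split(
            new[] { ",", "!" }, StringSplitOptions.RemoveEmptyEntries);

        Console.WriteLine(words.Length);            // prints 5
        Console.WriteLine(words[words.Length - 1]); // prints string
    }
}
```

One trade-off: an input such as "a,,b" would also lose its empty middle entry with this option. If only the trailing entry should go, trimming the final separator first (for example with mySentence.TrimEnd('!')) may be a closer match.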
But I still don't understand why the end of the string is treated the same way, since there is no other character following it where an empty string would be needed. Does it serve some purpose for the compiler?