I want to split a String on a space. When I split an empty string, I expect to get an array of zero strings. Instead, I get an array containing a single empty string. Why?

public class Main {
    public static void main(String[] args) {
        String x = "";
        String[] xs = x.split(" ");
        System.out.println("strings :" + xs.length); // prints 1 instead of 0
    }
}
Tom Joe

3 Answers

The single entry in the resulting array is in fact the empty string. This makes sense: the split on " " finds no match, so you simply get back the input you started with. As a general rule, when the delimiter does not match anywhere in the input, split returns a single-element array holding the original string.
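The no-match case can be checked directly. The following is a minimal sketch; the class name `SplitNoMatchDemo` and the helper `splitOnSpace` are made up for illustration and only wrap the `split(" ")` call from the question.

```java
public class SplitNoMatchDemo {
    // Hypothetical helper: splits on a single space, as in the question.
    static String[] splitOnSpace(String input) {
        return input.split(" ");
    }

    public static void main(String[] args) {
        // Neither input contains a space, so the delimiter never matches.
        for (String input : new String[] {"", "abc"}) {
            String[] parts = splitOnSpace(input);
            // With no match, the array holds exactly one element: the input itself.
            System.out.println("length=" + parts.length
                    + ", first element equals input: " + parts[0].equals(input));
        }
    }
}
```

Both inputs produce an array of length 1 whose only element is the original string, whether that string is empty or not.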

Tim Biegeleisen
  • @Eng.Fouad Yes, because the split consumes the entire string, leaving nothing behind. – Tim Biegeleisen Mar 11 '20 at 17:24
  • How about `" ".split(" ", 2)`? The result is an array containing two empty strings, even though the split still consumes the entire string. – kaya3 Mar 11 '20 at 17:45

An interesting puzzle indeed:

> "".split(" ")
String[1] { "" }
> " ".split(" ")
String[0] {  }

The question is, when you split the empty string, why does the result contain the empty string, and when you split a space, why does the result not contain anything? It seems inconsistent, but all is explained in the documentation.

The String.split(String) method "works as if by invoking the two-argument split method with the given expression and a limit argument of zero", so let's read the docs for String.split(String, int). The case of the empty string is answered by this part:

If the expression does not match any part of the input then the resulting array has just one element, namely this string.

The empty string has no part matching a space, so the output is an array containing one element, the input string, exactly as the docs say should happen.

The case of the string " " is answered by these two parts:

A zero-width match at the beginning however never produces such empty leading substring.

If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

The whole input string " " matches the splitting pattern, leaving an empty string on either side of the match. The first quoted rule only concerns zero-width matches; a space is a positive-width match, so the empty leading substring is initially kept (which is why " a".split(" ") returns an array starting with an empty string). However, because the limit parameter n = 0, trailing empty strings are discarded, and here both substrings are empty, so both count as trailing and are removed. Hence the resulting array is empty.
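The effect of the limit argument on the input " " can be observed with a short experiment. This is only a sketch; the class name `SplitLimitDemo` and the helper `pieceCount` are hypothetical names introduced for illustration, while `String.split(String, int)` is the documented two-argument method.

```java
public class SplitLimitDemo {
    // Hypothetical helper: how many pieces does splitting on a space yield for a given limit?
    static int pieceCount(String input, int limit) {
        return input.split(" ", limit).length;
    }

    public static void main(String[] args) {
        // limit 0 (what split(" ") uses): trailing empty strings are discarded
        System.out.println(pieceCount(" ", 0));   // prints 0
        // negative limit: trailing empty strings are kept
        System.out.println(pieceCount(" ", -1));  // prints 2
        // positive limit: at most that many pieces, trailing empty strings kept
        System.out.println(pieceCount(" ", 2));   // prints 2
    }
}
```

Only the zero limit produces the empty array; with any other limit the two empty strings around the match survive.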

kaya3

It appears that since the String exists and cannot be split (there are no spaces), the entire String is simply placed into the first array position, giving an array of length one. If you were to instead try

String x = " ";
String[] xs = x.split(" ");
System.out.println("strings :" + xs.length); // prints 0

It will give you the zero you are expecting.

See also: Java String split removed empty values

MrsNickalo
  • My goal is not to somehow print a zero. I want to understand the behavior when an empty string is passed. – Tom Joe Mar 11 '20 at 17:23
  • @TomJoe "places the entire String into the first array position" that's the behaviour when an empty string is passed - places the entire String (an empty String) into the first array position – MT756 Mar 11 '20 at 17:26