I'm trying to split a string that is 17 characters long, but when I print the length of the resulting array it shows 18.
String s1 = "{{ (( 4 + 5 )) }}";
String[] s2 = s1.split("");
System.out.println("length = " + s2.length);
It shows a length of 18 in Java 7 because splitting by an empty string finds a delimiter before and after every character.
{ { ( (...
^ ^ ^ ^ ^
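In Java 7, trailing empty strings are discarded. The Javadocs for split(String regex, int limit) say the following about the limit n, and the one-argument split behaves like a limit of zero: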
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
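To see what the limit changes, here is a minimal sketch (the class and method names are only illustrative, not from the question). It compares the default split with a negative limit, which keeps the trailing empty string instead of discarding it.

```java
import java.util.Arrays;

public class LimitDemo {
    // One-argument split: behaves like limit 0, so trailing empty strings are discarded.
    static String[] splitDefault(String s) {
        return s.split("");
    }

    // Negative limit: the pattern is applied as often as possible and
    // trailing empty strings are kept.
    static String[] splitKeepTrailing(String s) {
        return s.split("", -1);
    }

    public static void main(String[] args) {
        String s1 = "{{ (( 4 + 5 )) }}";
        System.out.println(Arrays.toString(splitDefault(s1)));
        System.out.println(Arrays.toString(splitKeepTrailing(s1)));
        // The second array is one element longer: it ends with the empty
        // string that follows the last character.
        System.out.println(splitKeepTrailing(s1).length - splitDefault(s1).length);
    }
}
```

The difference of one element should hold on both Java 7 and Java 8, since only the handling of the leading empty string changed between them.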
So, in Java 7, I get a length of 18, because the trailing empty string is discarded but the leading empty string is not.
Including this line
System.out.println(Arrays.toString(s2));
yields this output
[, {, {, , (, (, , 4, , +, , 5, , ), ), , }, }]
with a leading empty string.
However, in Java 8, this statement is now included in the Javadocs.
When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring.
It is not present in the Java 7 javadocs.
It looks like the behavior has been changed to eliminate the leading empty string for zero-width matches, which is the case in this question.
Java 8 output:
[{, {, , (, (, , 4, , +, , 5, , ), ), , }, }]
The leading "," right after the opening "[" of the array printout is now gone, and the length is now 17.
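The distinction the Java 8 Javadocs draw between positive-width and zero-width matches at the start can be sketched like this (class and method names are illustrative assumptions). A delimiter that consumes a character at index 0 still produces a leading empty string, while the empty pattern does not on Java 8 and later.

```java
import java.util.Arrays;

public class LeadingMatchDemo {
    // Positive-width match at index 0: "," consumes a character, so an
    // empty leading substring is included in the result.
    static String[] positiveWidth() {
        return ",a,b".split(",");
    }

    // Zero-width match at index 0: Java 8+ drops the empty leading
    // substring, while Java 7 keeps it.
    static String[] zeroWidth() {
        return "ab".split("");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(positiveWidth())); // [, a, b]
        System.out.println(Arrays.toString(zeroWidth()));     // [a, b] on Java 8+, [, a, b] on Java 7
    }
}
```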
It looks like you are using Java 7.
Before Java 8, "foo".split("") splits on each empty string, and since an empty string exists before and after each character, it effectively splits at these places (marked with |):
|f|o|o|
which at first generates an array like ["", "f", "o", "o", ""].
Now, since split also removes trailing empty strings, "foo".split("") returns the array ["", "f", "o", "o"], which, as you can see, has one empty string at the start.
You can solve this problem by splitting only at places that are not the start of the string. You can use split("(?<!^)"), a regex with a negative lookbehind (?<!...) on the start of the string, which is represented by ^.
String[] s2 = s1.split("(?<!^)"); //for "foo" this split returns ["f","o","o"]
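As a rough, self-contained version of that workaround (the class name SplitChars and the helper splitChars are made up for illustration), the lookbehind split should give one element per character on Java 7 as well as on Java 8 and later:

```java
import java.util.Arrays;

public class SplitChars {
    // Split between characters, but never at the very start of the string:
    // (?<!^) rejects the position where ^ matches, so no leading empty
    // string can be produced. The trailing empty string is still discarded
    // by the default limit of zero.
    static String[] splitChars(String s) {
        return s.split("(?<!^)");
    }

    public static void main(String[] args) {
        String[] s2 = splitChars("{{ (( 4 + 5 )) }}");
        System.out.println(Arrays.toString(s2));
        System.out.println("length = " + s2.length); // length = 17
    }
}
```

If the regex is not needed for anything else, s1.toCharArray() is a simpler way to get the individual characters, though it returns a char[] rather than a String[].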
Others can't reproduce your problem because this behaviour changed in Java 8: Why in Java 8 split sometimes removes empty strings at start of result array?