3

Below are two code snippets:

System.out.println(",,,,,".split(",").length);

and

System.out.println(",,,,, ".split(",").length);

The first snippet prints 0 and the second prints 6.

My question is: why does split not recognize "," when there is no extra space at the end of the string, yet recognize it when I add one?

Please note that I have also tried the regex "\\s*,\\s", but the result is the same.

Kushagra Misra

4 Answers

2

I don't have a doc reference for this, but empirically what I saw in my testing of String#split is that when every substring produced by the split is empty, none of them is returned in the array. So the following returns an empty array:

",,,,,".split(",")

However, if you add a space after the last comma and then do the same split, the final substring is a single space rather than an empty string. As a result, the array comes back with all six substrings, including the five empty ones in front of it:

",,,,, ".split(",")

But because there is no content between the commas, I would interpret your real requirement as wanting each individual comma as a separate element of the result. If so, you can split using lookarounds, something like this:

String input = ",,,,,";
String[] parts = input.split("(?<=,)(?=,)");
for (String part : parts) {
    System.out.println(part);
}

This outputs:

,
,
,
,
,
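
A small sketch for checking how the lookaround split behaves on other inputs. The class name `LookaroundSplit` and the helper `splitBetweenCommas` are made up for illustration; the regex is the one from the snippet above.

```java
import java.util.Arrays;

class LookaroundSplit {
    // Hypothetical helper: splits only at the zero-width positions that have
    // a comma on both sides, so no character is consumed by the split.
    static String[] splitBetweenCommas(String input) {
        return input.split("(?<=,)(?=,)");
    }

    public static void main(String[] args) {
        System.out.println(splitBetweenCommas(",,,,,").length);           // 5
        System.out.println(splitBetweenCommas(",,").length);              // 2
        System.out.println(Arrays.toString(splitBetweenCommas("a,,b")));  // [a,, ,b]
    }
}
```

Each piece keeps its comma, so a string of n commas yields n pieces rather than the n + 1 fields that the commas delimit.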


Tim Biegeleisen
  • maybe you can add in the return description of [`.split`](https://docs.oracle.com/javase/9/docs/api/java/lang/String.html#split-java.lang.String-) documentation: "_the array of strings computed by splitting this string around matches of the given regular expression_" – KarelG Aug 14 '18 at 06:32
  • Using this regex, the last comma is not considered. For example, for the string ",," the length of the array should be 3, but it shows 2 – Kushagra Misra Aug 14 '18 at 06:37
  • @KushagraMisra It is somewhat unclear what the OP wants as output at the moment, because splitting on comma would basically remove everything from the input string. I'll wait for feedback from the OP. – Tim Biegeleisen Aug 14 '18 at 06:39
  • I was just curious why it is able to split the string properly when an extra space is present but not when there is no space – Kushagra Misra Aug 14 '18 at 06:42
  • @KushagraMisra I don't have a doc reference for this, but it appears to be the case that if no actual text is matched, then the array returned is empty. When you added a space, you forced an actual match, and therefore the array came back with every _match_ (not that every match actually had a width). – Tim Biegeleisen Aug 14 '18 at 06:48
1

By default, split() in Java removes trailing empty strings from the result array. To keep them, use split(delimiter, limit) with limit set to a negative value, like this:

System.out.println(",,,,,".split(",", -1).length);
Mobility
0

Let's explore a bit more. Look at these results of split:

System.out.println(",,,,,,".split(",").length); // 0
System.out.println(",,,,,, ".split(",").length); // 7
System.out.println(",,, ,,,".split(",").length); // 4
System.out.println(" ,,,,,,".split(",").length); // 1
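
To see which pieces survive in each case, it can help to print the arrays themselves instead of only their lengths. This is a minimal sketch; the class `SplitContents` and the helper `describe` are hypothetical names, and the helper just wraps every element in quotes so that empty strings and lone spaces are visible.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class SplitContents {
    // Hypothetical helper: renders an array with each element in double
    // quotes, so "" and " " can be told apart in the output.
    static String describe(String[] parts) {
        return Arrays.stream(parts)
                .map(p -> "\"" + p + "\"")
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(describe(",,,,,,".split(",")));   // []
        System.out.println(describe(",,,,,, ".split(",")));  // ["", "", "", "", "", "", " "]
        System.out.println(describe(",,, ,,,".split(",")));  // ["", "", "", " "]
        System.out.println(describe(" ,,,,,,".split(",")));  // [" "]
    }
}
```

In every case the array ends at the last non-empty piece; empty strings before that piece are kept.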

If you are wondering why this happens, it is because of the following statement in the docs for the split method:

Trailing empty strings are therefore not included in the resulting array.

Docs: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split(java.lang.String)

If you don't want the split method to remove those trailing empty strings, use the overload that takes a limit:

public String[] split(String regex, int limit)

Docs: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split(java.lang.String,%20int)

Example:

System.out.println(",,,,,,".split(",",-1).length); // 7
System.out.println(",,,,,, ".split(",",-1).length); // 7
System.out.println(",,, ,,,".split(",",-1).length); // 7
System.out.println(" ,,,,,,".split(",",-1).length); // 7
Shivang Agarwal
0

Forget the documentation. I looked directly into the code and found the following in java.lang.String#split(java.lang.String, int):

while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
    resultSize--;
}

This shows that it is designed to remove the last element if it is empty, and to keep doing so until the last element has non-zero length.

This feature is useful. For example, if you have the string "a,b," it returns a and b in the resulting array, without the empty string "" that follows the last comma.

If you run System.out.println(", ,,,".split(",").length); it prints 2, because the while loop above keeps shrinking the result from the right until it reaches an element whose length is non-zero.
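
A rough sketch of that trimming step in isolation. The class `TrailingTrim` and the helper `dropTrailingEmpty` are hypothetical and only mirror the quoted loop; this is not the JDK source. It starts from split(",", -1), which keeps every piece, and then applies the same loop by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TrailingTrim {
    // Hypothetical helper mirroring the quoted loop: shrink the logical size
    // from the right while the last remaining element is empty.
    static List<String> dropTrailingEmpty(List<String> pieces) {
        int resultSize = pieces.size();
        while (resultSize > 0 && pieces.get(resultSize - 1).length() == 0) {
            resultSize--;
        }
        return new ArrayList<>(pieces.subList(0, resultSize));
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(", ,,,".split(",", -1));
        System.out.println(all.size());                     // 5
        System.out.println(dropTrailingEmpty(all).size());  // 2
        System.out.println(", ,,,".split(",").length);      // 2
    }
}
```

The leading empty string survives because the loop only ever looks at the last element.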

The while loop above is enclosed in if (limit == 0), so to keep every element, pass a non-zero limit. If you don't want to cap the number of pieces at all, use a negative number such as -1.
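
To compare the limit values side by side, here is a minimal sketch. The class `LimitDemo`, the helper `count`, and the sample input "a,b,," are chosen only for illustration.

```java
import java.util.Arrays;

class LimitDemo {
    // Hypothetical helper: number of pieces produced for a given limit.
    static int count(String input, int limit) {
        return input.split(",", limit).length;
    }

    public static void main(String[] args) {
        String input = "a,b,,";
        System.out.println(count(input, 0));   // 2: trailing empty strings dropped
        System.out.println(count(input, -1));  // 4: every piece kept
        System.out.println(count(input, 2));   // 2: at most two pieces, the rest stays unsplit
        System.out.println(count(input, 10));  // 4: cap not reached, nothing dropped
        System.out.println(Arrays.toString(input.split(",", 2)));
    }
}
```

A positive limit caps the number of pieces, so it only keeps everything when it is at least as large as the number of pieces; a negative limit avoids having to pick such a number.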

Kartik