The JDK provides the following overloaded split signature:

public String[] split(String regex, int limit)

I always assumed that limit is the maximum number of tokens the string will be split into.

For example:

First snippet:

 System.out.println(Arrays.toString("Andrew Carnegie:35:USA".split(":")));

out:

[Andrew Carnegie, 35, USA]

Second snippet:

System.out.println(Arrays.toString("Andrew Carnegie:35:USA".split(":",2)));

out:

[Andrew Carnegie, 35:USA]

But I noticed one more effect:

System.out.println(Arrays.toString("Andrew Carnegie:35:USA:".split(":")));

out:

[Andrew Carnegie, 35, USA]

and

 System.out.println(Arrays.toString("Andrew Carnegie:35:USA:".split(":",-1)));

out:

[Andrew Carnegie, 35, USA, ]

So an empty element is added if the string ends with the delimiter.

Where can I find specific information about this effect?

Sergey Kalinichenko
gstackoverflow
  • http://stackoverflow.com/search?q=[java]%20split%20empty – assylias Apr 21 '14 at 10:22
  • [Have a look at my answer here](http://stackoverflow.com/a/22350715/2024761). I've explained the effect along with some source code snippet :) – Rahul Apr 21 '14 at 10:26
  • possible duplicate of [Java split() method strips empty strings at the end?](http://stackoverflow.com/questions/545957/java-split-method-strips-empty-strings-at-the-end) – Pshemo Apr 21 '14 at 10:32
  • In Java 8 this mechanism also affects empty strings at start of result array if split was done on zero-width match. More info [here](http://stackoverflow.com/questions/22718744/why-does-split-in-jdk-8-sometimes-removes-empty-strings-if-they-are-at-start-of). – Pshemo Apr 21 '14 at 10:35

2 Answers

Citing Johannes Weiß:

"When calling String.split(String), it calls String.split(String, 0) and that discards trailing empty strings (as the docs say it), when calling String.split(String, n) with n < 0 it won't discard anything."
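
The equivalence described in that quote can be checked with a minimal sketch like the one below. The class name `SplitDefaultLimit` and the helper `show` are hypothetical names introduced only for this illustration; the input string `"a:b::"` is likewise an assumed example with two trailing delimiters.

```java
import java.util.Arrays;

public class SplitDefaultLimit {
    // Hypothetical helper: splits the input and formats the result for printing.
    static String show(String input, String regex, int limit) {
        return Arrays.toString(input.split(regex, limit));
    }

    public static void main(String[] args) {
        String input = "a:b::";

        // split(regex) behaves like split(regex, 0): trailing empty strings are dropped.
        System.out.println(Arrays.toString(input.split(":"))); // [a, b]
        System.out.println(show(input, ":", 0));               // [a, b]

        // A negative limit keeps every trailing empty string.
        System.out.println(show(input, ":", -1));              // [a, b, , ]
    }
}
```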

Denis Kulagin

From the JavaDoc for split:

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

Emphasis mine.

So,

  • if n is strictly positive then the resulting array will have at most n elements
  • if n is zero then the array can have any length and trailing empty strings will be discarded
  • if n is strictly negative then the array can have any length and trailing empty strings will not be discarded
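
The three cases above can be tried side by side with a short sketch, reusing the input string from the question. The class name `SplitLimitDemo` and the helper `splitWithLimit` are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Hypothetical helper: splits on ":" with the given limit.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(":", limit);
    }

    public static void main(String[] args) {
        String input = "Andrew Carnegie:35:USA:";

        // Positive limit: at most n elements; the last one holds the unsplit remainder.
        System.out.println(Arrays.toString(splitWithLimit(input, 2)));  // [Andrew Carnegie, 35:USA:]

        // Zero: any length, trailing empty strings discarded.
        System.out.println(Arrays.toString(splitWithLimit(input, 0)));  // [Andrew Carnegie, 35, USA]

        // Negative: any length, trailing empty strings kept.
        System.out.println(Arrays.toString(splitWithLimit(input, -1))); // [Andrew Carnegie, 35, USA, ]

        // A positive limit that is large enough also keeps the trailing empty string,
        // because only a limit of zero triggers the discarding.
        System.out.println(Arrays.toString(splitWithLimit(input, 10))); // [Andrew Carnegie, 35, USA, ]
    }
}
```

Note that the last case shows a positive limit is not only a cap on the array length: like a negative limit, it leaves trailing empty strings in place.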
Boris the Spider