5

I am trying to understand how the split method works and am slightly confused about it. In this example from Oracle's documentation pages,

String str = "boo:and:foo";

String[] str1 = str.split("o",2);

Output
 b
 o:and:foo

This is easy to understand: the string has been divided at the first occurrence of 'o'.

But for

String[] str1 = str.split("o",3);

Output:
b

:and:foo 

How does this output come about?

Youcef LAIDANI
Shad

7 Answers

6

What I understand from the documentation:

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

This means that, for a positive limit n, the string is cut into at most n pieces. Let's analyse the limits one by one to understand this better:

Limit 1

String[] spl1 = str.split("o", 1);

This means the result is a single string, so no cut is made on o. In this case you get your whole input back:

[boo:and:foo]
 1

Limit 2

String[] spl1 = str.split("o", 2);

This means cut once on o, so I put a break at the first o:

    boo:and:foo
-----^

in this case you will get two results :

[b,o:and:foo]
 1 2

Limit 3

String[] spl1 = str.split("o", 3);

This means cut twice: on the first o and on the second o.

    boo:and:foo
1----^^--------------2

in this case you will get three results :

[b, ,:and:foo]
 1 2  3

Limit 4

String[] spl1 = str.split("o", 4);

This means cut three times: on the first, second and third o.

     boo:and:foo
1_____^^      ^
       |___2  |___3

in this case you will get four results :

[b, ,:and:f,o]
 1 2 3      4

Limit 5

String[] spl1 = str.split("o", 5);

This means cut four times: on the first, second, third and fourth o.

     boo:and:foo
1_____^^      ^^
       |___2  ||___4
              |____3

in this case you will get five results :

[b, ,:and:f, , ]
 1 2  3     4 5
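The walkthrough above can be checked with a short program. This is only a sketch: the class name SplitLimitWalkthrough and the helper quoted are made up for illustration, and the helper exists solely to wrap each element in quotes so that the empty strings, which are invisible in the bracketed results above, become visible.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class SplitLimitWalkthrough {
    // Hypothetical helper: renders each element in double quotes so that
    // empty strings show up as "" instead of disappearing.
    static String quoted(String[] parts) {
        return Arrays.stream(parts)
                .map(p -> "\"" + p + "\"")
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        String str = "boo:and:foo";
        // Same limits as in the walkthrough: 1 through 5.
        for (int limit = 1; limit <= 5; limit++) {
            System.out.println("limit " + limit + " -> " + quoted(str.split("o", limit)));
        }
    }
}
```

For limit 3 this prints ["b", "", ":and:foo"]: the empty string in the middle is what lies between the two adjacent o characters.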

Here is a simple animation to help illustrate this:

[Animation: How the split() method actually works]

Youcef LAIDANI
1

The second parameter limits how many times the pattern is applied: for a positive limit n, it is applied at most n - 1 times.

From Java Docs:

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

Example:

1) If the limit is set to zero (str.split("o",0)), then according to the Java docs the pattern is applied as many times as possible and trailing empty strings are discarded, so the result is:

[b, , :and:f]

2) But if you set the limit to a positive value n (e.g. 1 or 2), the pattern is applied at most n - 1 times (e.g. 0 times for limit 1 and once for limit 2), so the results are:

[boo:and:foo] // for str.split("o",1), applied 0 times.

[b, o:and:foo] // for str.split("o",2), applied 1 time.

[b, , :and:foo] // for str.split("o",3), applied 2 times.
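The "applied n - 1 times" rule can be observed from the array length. A minimal sketch, assuming a positive limit (so no trailing empty strings are discarded): the number of cuts is simply the array length minus one. The class SplitCutCount and the helper cuts are hypothetical names, not part of any library.

```java
class SplitCutCount {
    // Hypothetical helper. Only meaningful for a positive limit, where the
    // result array always has exactly (number of cuts + 1) entries.
    static int cuts(String input, String regex, int positiveLimit) {
        return input.split(regex, positiveLimit).length - 1;
    }

    public static void main(String[] args) {
        String str = "boo:and:foo";
        for (int limit = 1; limit <= 3; limit++) {
            System.out.println("limit " + limit + ": pattern applied "
                    + cuts(str, "o", limit) + " time(s)");
        }
    }
}
```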

Raju Sharma
0

The second argument bounds the number of times the regex is applied to the string: a positive limit n allows at most n - 1 applications. So if the limit is 3, you get b, (empty string), :and:foo: the string is split into tokens at the first two occurrences of the pattern, and the rest is left intact. Note that the limit can be greater than the number of occurrences of the regex in the actual string.
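To illustrate the last point, here is a small sketch (the class name SplitLargeLimit is made up for this example). The input contains four o characters, so any limit of 5 or more gives the same five pieces.

```java
import java.util.Arrays;

class SplitLargeLimit {
    public static void main(String[] args) {
        String str = "boo:and:foo";            // contains four 'o' characters
        String[] exact = str.split("o", 5);    // four cuts, five pieces
        String[] large = str.split("o", 100);  // limit far above the match count

        System.out.println(large.length);                 // 5
        System.out.println(Arrays.equals(exact, large));  // true
    }
}
```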

Antho Christen
0
String[] split(String regex, int limit)

The 2nd parameter limits the number of strings returned by the split.

For example, split("anydelimiter", 3) returns an array of at most 3 strings, even if the delimiter occurs in the string more than 3 times.

If the limit is negative, the returned array contains as many substrings as possible, including trailing empty strings; if the limit is zero, the returned array contains all the substrings except the trailing empty strings.
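The difference between a negative and a zero limit only shows up when the input ends with delimiters. A short sketch using the string from the question (the class name SplitNegativeVsZero is just for illustration):

```java
class SplitNegativeVsZero {
    public static void main(String[] args) {
        String str = "boo:and:foo"; // ends with two 'o' delimiters

        String[] negative = str.split("o", -1); // keeps trailing empty strings
        String[] zero = str.split("o", 0);      // discards trailing empty strings

        System.out.println(negative.length); // 5
        System.out.println(zero.length);     // 3
    }
}
```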

Hitesh
0

The method takes two parameters: split(String regex, int limit).
regex - the delimiting regular expression;
limit - the result threshold.

Java documentation

Dumbo
0
**regex** − the delimiting regular expression

    public class SplitDemo {
        public static void main(String[] args) {
            String str = "boo:and:foo:com:boo";

            // limit 2: at most one cut, so at most two pieces
            System.out.println("Return value1: ");
            for (String retval : str.split(":", 2)) {
                System.out.println(retval);
            }
            System.out.println();

            // limit 3: at most two cuts, so at most three pieces
            System.out.println("Return value2: ");
            for (String retval : str.split(":", 3)) {
                System.out.println(retval);
            }
            System.out.println();

            // limit 0: cut at every ':' and drop trailing empty strings
            System.out.println("Return value3: ");
            for (String retval : str.split(":", 0)) {
                System.out.println(retval);
            }
            System.out.println();

            // no limit argument: behaves like limit 0
            System.out.println("Return value4: ");
            for (String retval : str.split(":")) {
                System.out.println(retval);
            }
        }
    }

Run it and check each of the four return values yourself.

0

here is the documentation for limit

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

and the source code for the split method of String class.

 public String[] split(String regex, int limit) {
        /* fastpath if the regex is a
         (1)one-char String and this character is not one of the
            RegEx's meta characters ".$|()[{^?*+\\", or
         (2)two-char String and the first char is the backslash and
            the second is not the ascii digit or ascii letter.
         */
        char ch = 0;
        if (((regex.value.length == 1 &&
             ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
             (regex.length() == 2 &&
              regex.charAt(0) == '\\' &&
              (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
              ((ch-'a')|('z'-ch)) < 0 &&
              ((ch-'A')|('Z'-ch)) < 0)) &&
            (ch < Character.MIN_HIGH_SURROGATE ||
             ch > Character.MAX_LOW_SURROGATE))
        {
            int off = 0;
            int next = 0;
            boolean limited = limit > 0;
            ArrayList<String> list = new ArrayList<>();
            while ((next = indexOf(ch, off)) != -1) {
                if (!limited || list.size() < limit - 1) {
                    list.add(substring(off, next));
                    off = next + 1;
                } else {    // last one
                    //assert (list.size() == limit - 1);
                    list.add(substring(off, value.length));
                    off = value.length;
                    break;
                }
            }
            // If no match was found, return this
            if (off == 0)
                return new String[]{this};

            // Add remaining segment
            if (!limited || list.size() < limit)
                list.add(substring(off, value.length));

            // Construct result
            int resultSize = list.size();
            if (limit == 0)
                while (resultSize > 0 && list.get(resultSize - 1).length() == 0)
                    resultSize--;
            String[] result = new String[resultSize];
            return list.subList(0, resultSize).toArray(result);
        }
        return Pattern.compile(regex).split(this, limit);
    }

see this portion of code

            boolean limited = limit > 0;
            ArrayList<String> list = new ArrayList<>();
            while ((next = indexOf(ch, off)) != -1) {
                if (!limited || list.size() < limit - 1) {
                    list.add(substring(off, next));
                    off = next + 1;
                } else {    // last one
                    //assert (list.size() == limit - 1);
                    list.add(substring(off, value.length));
                    off = value.length;
                    break;
                }
            }

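So when limit is positive, the resulting list holds at most limit entries: the loop adds at most limit - 1 pieces, and then the rest of the input becomes the last entry.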
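The same idea can be tried in isolation. The following is a simplified sketch written for illustration, not the JDK source: it handles only a single literal delimiter character, and the names CharSplitSketch and splitOnChar are hypothetical. It follows the loop shown above (cut until limit - 1 pieces exist, then keep the rest as the last entry) and compares its output with String.split.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CharSplitSketch {
    // Simplified illustration: splits s on one literal character, using the
    // same meaning of limit as String.split(String, int).
    static String[] splitOnChar(String s, char ch, int limit) {
        boolean limited = limit > 0;
        List<String> list = new ArrayList<>();
        int off = 0;
        int next;
        while ((next = s.indexOf(ch, off)) != -1) {
            if (limited && list.size() == limit - 1) {
                break; // limit - 1 cuts made; the rest becomes the last entry
            }
            list.add(s.substring(off, next));
            off = next + 1;
        }
        if (off == 0) {
            return new String[] {s}; // no cut was made: return the input as is
        }
        list.add(s.substring(off)); // remaining input after the last cut

        int size = list.size();
        if (limit == 0) {
            // A zero limit discards trailing empty strings.
            while (size > 0 && list.get(size - 1).isEmpty()) {
                size--;
            }
        }
        return list.subList(0, size).toArray(new String[0]);
    }

    public static void main(String[] args) {
        String str = "boo:and:foo";
        for (int limit = -1; limit <= 6; limit++) {
            String[] sketch = splitOnChar(str, 'o', limit);
            boolean same = Arrays.equals(sketch, str.split("o", limit));
            System.out.println("limit " + limit + ": " + Arrays.toString(sketch)
                    + " matches String.split: " + same);
        }
    }
}
```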