NOTE: while the solution using split may work (last tested on Java 17), it relies on a bug: look-behind in Java is supposed to have an obvious maximum length. That limitation should, in theory, prevent us from using + inside it, but somehow \G at the start lets us use + here. If this bug is fixed in the future, the split approach will stop working.
A safer approach is to use Matcher#find, like this:
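// imports: java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
String data = "0,0,1,2,4,5,3,4,6";
Pattern p = Pattern.compile("\\d+,\\d+,\\d+"); // no look-behind needed
Matcher m = p.matcher(data);
List<String> parts = new ArrayList<>();
while (m.find()) {
    parts.add(m.group()); // each match is one group of three numbers
}
String[] result = parts.toArray(new String[0]);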
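The Matcher#find idea above can be wrapped in a small helper for other group sizes. This is only a sketch: the class name GroupsWithFind and the method groupsOf are made up for illustration, and it assumes n is at least 1 and that the items are plain digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupsWithFind {
    // Hypothetical helper: collects every run of exactly n comma-separated numbers.
    static List<String> groupsOf(String data, int n) {
        // One number followed by (n - 1) repetitions of ",number"; no look-behind involved.
        Pattern p = Pattern.compile("\\d+(?:,\\d+){" + (n - 1) + "}");
        Matcher m = p.matcher(data);
        List<String> parts = new ArrayList<>();
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(groupsOf("0,0,1,2,4,5,3,4,6", 3)); // [0,0,1, 2,4,5, 3,4,6]
        System.out.println(groupsOf("1,2,3,4", 3));           // [1,2,3]
    }
}
```

Note that, unlike split, this skips a trailing group that has fewer than n numbers (the 4 in the second call).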
You can try the split method with this regex: (?<=\\G\\d+,\\d+,\\d+),
Demo
String data = "0,0,1,2,4,5,3,4,6";
String[] array = data.split("(?<=\\G\\d+,\\d+,\\d+),"); // Magic :)
// to reveal the magic, see the explanation below
for (String s : array) {
    System.out.println(s);
}
output:
0,0,1
2,4,5
3,4,6
Explanation
- \\d means one digit, same as [0-9], like 0 or 3
- \\d+ means one or more digits, like 1 or 23
- \\d+, means one or more digits followed by a comma, like 1, or 234,
- \\d+,\\d+,\\d+ accepts three numbers with commas between them, like 12,3,456
- \\G means the end of the previous match or, if there is none (on first use), the start of the string
- (?<=...), is a positive look-behind followed by a comma: it matches a comma , that is preceded by the string described in (?<=...)
- (?<=\\G\\d+,\\d+,\\d+), therefore tries to find a comma that has three numbers before it, where those numbers are preceded either by the start of the string (like ^0,0,1 in your example) or by the previously matched comma (like 2,4,5 and 3,4,6).
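To see what \G contributes on its own, here is a small sketch that uses Matcher#find instead of split, so it does not depend on the look-behind behaviour. The class name ContiguousMatches, the helper findAll and the sample input are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContiguousMatches {
    // Hypothetical helper: returns every match of the regex, in order.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "1,22,x,3,";
        // Without \G every "digits plus comma" token is found, wherever it sits.
        System.out.println(findAll("\\d+,", input));    // [1,, 22,, 3,]
        // With \G each match must begin exactly where the previous one ended,
        // so the chain breaks at "x" and "3," is never reached.
        System.out.println(findAll("\\G\\d+,", input)); // [1,, 22,]
    }
}
```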
Also, in case you want to match characters other than digits, you can use a different character class, such as:
- \\w which matches alphabetic characters, digits and _
- \\S which matches everything that is not whitespace
- [^,] which matches everything that is not a comma
- ... and so on. More info in the Pattern documentation
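A quick way to compare these character classes is to check which tokens each one accepts in full. The following is a minimal sketch; the class name TokenClasses, the helper isToken and the sample tokens are invented for illustration.

```java
class TokenClasses {
    // Hypothetical helper: true if the whole token consists of one or more
    // characters from the given character class.
    static boolean isToken(String charClass, String token) {
        return token.matches(charClass + "+");
    }

    public static void main(String[] args) {
        System.out.println(isToken("\\w", "x_1"));  // true: letters, digits and _
        System.out.println(isToken("\\w", "a-b"));  // false: '-' is not a word character
        System.out.println(isToken("\\S", "a-b"));  // true: no whitespace inside
        System.out.println(isToken("\\S", "a b"));  // false: contains a space
        System.out.println(isToken("[^,]", "a b")); // true: anything except a comma
        System.out.println(isToken("[^,]", "a,b")); // false: contains a comma
    }
}
```

For comma-separated data, [^,] is the most permissive choice, since it accepts any item that does not itself contain a comma.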
By the way, this form works with split on every 3rd, 5th, 7th (and any other odd-numbered) comma; for example, split("(?<=\\G\\w+,\\w+,\\w+,\\w+,\\w+),") splits on every 5th comma.
To split on every 2nd, 4th, 6th, 8th (and any other even-numbered) comma, you need to replace + with {1,maxLengthOfNumber}; for example, split("(?<=\\G\\w{1,3},\\w{1,3},\\w{1,3},\\w{1,3}),") splits on every 4th comma when numbers have at most 3 digits (0, 00, 12, 000, 123, 412, 999).
To split on every 2nd comma you can also use the regex split("(?<!\\G\\d+),"), which is based on my previous answer.
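If the odd/even distinction and the length limit feel fragile, one regex-light alternative (a different technique from the look-behind approach in this answer) is to split on every comma and then rejoin the items n at a time. This is a sketch under assumptions: the class name SplitAndRejoin and the method everyNth are made up, and n is assumed to be at least 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitAndRejoin {
    // Hypothetical helper: groups comma-separated items n at a time,
    // which has the same effect as splitting on every nth comma.
    static List<String> everyNth(String data, int n) {
        String[] items = data.split(",");
        List<String> groups = new ArrayList<>();
        for (int i = 0; i < items.length; i += n) {
            int end = Math.min(i + n, items.length); // the last group may be shorter
            groups.add(String.join(",", Arrays.copyOfRange(items, i, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(everyNth("0,0,1,2,4,5,3,4,6", 2)); // [0,0, 1,2, 4,5, 3,4, 6]
        System.out.println(everyNth("0,0,1,2,4,5,3,4,6", 4)); // [0,0,1,2, 4,5,3,4, 6]
    }
}
```

This works the same way for odd and even n and places no limit on item length, at the cost of an extra pass over the items.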