93

In JavaScript, this is how we can split a string at every third character:

"foobarspam".match(/.{1,3}/g)

I am trying to figure out how to do this in Java. Any pointers?

Vijay Dev

9 Answers

152

You could do it like this:

String s = "1234567890";
System.out.println(java.util.Arrays.toString(s.split("(?<=\\G...)")));

which produces:

[123, 456, 789, 0]

The regex (?<=\G...) matches the empty position that is preceded ((?<= )) by the end of the previous match (\G) followed by three characters (...).
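For comparison, the JavaScript call in the question maps fairly directly onto an explicit Matcher loop, which avoids the look-behind altogether. This is only a minimal sketch: the class name MatcherChunks and the helper chunksOf are made up for illustration, and the DOTALL flag is an added assumption so that '.' also covers line terminators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherChunks {
    // Hypothetical helper (not a JDK method): collects successive matches of .{1,size}.
    static List<String> chunksOf(String input, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // DOTALL is an assumption made here so that '.' also matches line terminators.
        Pattern pattern = Pattern.compile(".{1," + size + "}", Pattern.DOTALL);
        Matcher matcher = pattern.matcher(input);
        List<String> chunks = new ArrayList<>();
        while (matcher.find()) {
            chunks.add(matcher.group());
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(chunksOf("1234567890", 3)); // prints [123, 456, 789, 0]
    }
}
```

Unlike split, this returns an empty list for an empty input string.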

Bart Kiers
  • Brrr... scary :) Java's regex flavor is a bit alien to me. – Vinko Vrsalovic Feb 19 '10 at 15:40
  • 3
    :) I'd probably go for Simon's solution as well: my co-workers might not like it if I start adding my regex-trickery to the code base. – Bart Kiers Feb 19 '10 at 15:43
  • Yeah, it's definitely scary. Scary, but brilliant! +1. – Vijay Dev Feb 19 '10 at 15:48
  • 18
    I'd hate to think someone voted this answer down simply because they don't like regular expressions. – William Brendel Feb 19 '10 at 15:49
  • I would stick to your's and Kenny's advices and go with a non-regex solution. Thanks for the help! – Vijay Dev Feb 19 '10 at 15:49
  • 1
    No problem, and (probably) a wise decision. – Bart Kiers Feb 19 '10 at 15:50
  • 63
    mad props for supreme regex mojo, but as a reader of this code, I'd hunt you down and egg your house. :) – Kevin Bourrillion Feb 19 '10 at 17:54
  • 4
    As long as you call this via a correctly named function (ie splitIntoParts) and don't directly embed that line in your code, it's all good. Otherwise, let the hunting begin :) – GreenieMeanie Feb 19 '10 at 17:55
  • 3
    Part of what makes this trick so scary is that it won't work in all languages. For example, JavaScript doesn't support `\G`, and Python won't split on a regex that matches zero characters. But then, if Java had a "get all matches" method like every other language does, you wouldn't have had to invent this trick in the first place, @Bart. ;) – Alan Moore Feb 20 '10 at 00:24
  • This regex looks more like the brainf**k language, but obviously it is great answer – Krzysztof Cichocki Jun 09 '17 at 05:59
  • 9
    I copy/pasted this into my Android Studio project and I get `[123, 4567890]` as result :( – Evren Yurtesen Apr 11 '18 at 19:57
  • @EvrenYurtesen no idea why that is, but here's an online demo that shows it works: https://repl.it/repls/TalkativeAmusingGlueware – Bart Kiers Apr 11 '18 at 20:10
  • @BartKiers Apparently android is broken and nobody can be bothered to report or fix it :) Maybe Oracle should sue Google for making incompatible Java. Here is somebody else having same problem without a solution: https://stackoverflow.com/questions/36030109/regex-for-string-splitting-not-working-properly – Evren Yurtesen Apr 12 '18 at 05:26
  • how can we split by dynamic number of characters instead of 3 in this case. – rafa Jun 13 '19 at 02:33
  • 2
    @rafa `s.split("(?<=\\G.{" + number + "})")` – Bart Kiers Jun 13 '19 at 07:22
  • I need a way to account for line terminators. Is that possible with the way that Java handles look-behinds? I attempted my own pattern, but it failed with the exception: `Exception in thread "main" java.util.regex.PatternSyntaxException: Look-behind group does not have an obvious maximum length near index 14 (?<=\G(.|$){10})` – Cardinal System Aug 22 '19 at 15:52
  • @CardinalSystem the $ does not match a line break, it matches the end of the input string, or the end of a line (not the line break itself!). Try this: `(?<=\G[\s\S]{10})`. The `[\s\S]` matches any character, so also line breaks. – Bart Kiers Aug 22 '19 at 23:33
93

Java does not provide very full-featured splitting utilities, so the Guava libraries do:

Iterable<String> pieces = Splitter.fixedLength(3).split(string);

Check out the Javadoc for Splitter; it's very powerful.

Pang
Kevin Bourrillion
  • 8
    +1 This is the correct answer (also known as: *know and use the libraries*) – Jonik Feb 24 '10 at 19:57
  • 4
    I would take this answer over the regex...just because it's more maintainable (e.g. the fact that less people know about RegEx than ppl being able to read "readable" code.) – sivabudh Mar 02 '10 at 00:50
  • 4
    only good if you already have Guava dependency. Otherwise, you need to add another dependency - something you should not do without checking with coworkers/system architect first. – foo Jul 22 '17 at 18:29
  • 1
    Adding a full library so you can just use one method is not the best practice in most cases, also adding a library is always a big decision in an enterprise environment. – GaboSampaio Nov 25 '19 at 18:21
56
import java.util.ArrayList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        for (String part : getParts("foobarspam", 3)) {
            System.out.println(part);
        }
    }
    private static List<String> getParts(String string, int partitionSize) {
        List<String> parts = new ArrayList<String>();
        int len = string.length();
        for (int i=0; i<len; i+=partitionSize)
        {
            parts.add(string.substring(i, Math.min(len, i + partitionSize)));
        }
        return parts;
    }
}
Simon Nickerson
  • If you keep a collection of substrings that cover the entire original string, the new String method will actually waste (n-1)*sizeof(int). The new Strings' char arrays will take the same memory, but each one will have a separate length field. That said, if any substrings are later discarded, new String could reduce memory. I wouldn't worry either way unless the original string is very big. – ILMTitan Feb 19 '10 at 20:58
  • @DenisTulskiy could you elaborate? The `substring` method is actually smart enough to use the parent string's `char[]` for the data; see [this answer](http://stackoverflow.com/a/2163570/732016) for more details. – wchargin Jun 04 '13 at 23:08
  • 2
    @WChargin: hmm, you're right, I have no idea why I wrote that comment. I'll delete it. Thanks. – Denis Tulskiy Jun 05 '13 at 02:54
  • I would say this answer as correct as the regex one only separates once. – Aarush Kumar Jan 27 '22 at 11:23
12

As an addition to Bart Kiers' answer: instead of the three dots ... in the regular expression, which stand for three characters, you can write .{3}, which has the same meaning.

Then the code would look like the following:

String bitstream = "00101010001001010100101010100101010101001010100001010101010010101";
System.out.println(java.util.Arrays.toString(bitstream.split("(?<=\\G.{3})")));

This makes the chunk length easier to change, and it becomes reasonable to wrap the call in a function that takes the length as a parameter. It could look like the following:

public static String[] splitAfterNChars(String input, int splitLen){
    return input.split(String.format("(?<=\\G.{%1$d})", splitLen));
}

An example in IdeOne: http://ideone.com/rNlTj5
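One caveat raised in the comments on Bart Kiers' answer is that '.' does not match line terminators, so input containing line breaks is not split as expected. A possible variant, following the `[\s\S]` suggestion from those comments, is sketched below. The names SplitAnyChar and splitAfterNCharsAnyChar are hypothetical. The comments also report different results for this kind of look-behind on Android, so treat the behaviour as verified only on a standard JVM.

```java
import java.util.Arrays;

class SplitAnyChar {
    // Hypothetical variant of splitAfterNChars: [\s\S] matches any character,
    // including line terminators, so line breaks count towards the chunk length.
    static String[] splitAfterNCharsAnyChar(String input, int splitLen) {
        if (splitLen <= 0) {
            throw new IllegalArgumentException("splitLen must be positive");
        }
        return input.split("(?<=\\G[\\s\\S]{" + splitLen + "})");
    }

    public static void main(String[] args) {
        String[] parts = splitAfterNCharsAnyChar("ab\ncd\nef", 3);
        // Each of the first two parts ends with the line break it consumed.
        System.out.println(parts.length + " parts, lengths: "
                + Arrays.toString(Arrays.stream(parts).mapToInt(String::length).toArray()));
    }
}
```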

Community
Frodo
  • it is better solution, could you please tell about Regex format? – mi_mo Nov 23 '22 at 09:44
  • As I used the same solution that was already explained from Bart Kiers, I can refer to his answer. The `%1$d` will be replaced with the decimal value of the variable `splitLen`. Otherwise [regex101.com](https://regex101.com/) could be also very helpful for you. – Frodo Nov 23 '22 at 12:59
4

Late Entry.

The following is a succinct implementation using Java 8 streams:

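// needs: java.util.Collection, java.util.TreeMap,
//        java.util.concurrent.atomic.AtomicInteger, java.util.stream.Collectors
String foobarspam = "foobarspam";
AtomicInteger splitCounter = new AtomicInteger(0);
// The counter is stateful, so the stream must stay sequential.
// TreeMap::new keeps the chunks in index order; a plain HashMap does not guarantee that.
Collection<String> splittedStrings = foobarspam
                                    .chars()
                                    .mapToObj(_char -> String.valueOf((char)_char))
                                    .collect(Collectors.groupingBy(stringChar -> splitCounter.getAndIncrement() / 3
                                                                ,TreeMap::new
                                                                ,Collectors.joining()))
                                    .values();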

Output:

[foo, bar, spa, m]
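If the mutable counter is a concern, the same result can be computed by streaming over chunk indexes rather than over characters. This is a sketch of an alternative technique, not the approach shown above; StreamChunks and chunk are illustrative names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamChunks {
    // Hypothetical helper: one substring per chunk index, with no shared mutable state.
    static List<String> chunk(String s, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        int count = (s.length() + size - 1) / size; // number of chunks, rounded up
        return IntStream.range(0, count)
                .mapToObj(i -> s.substring(i * size, Math.min(s.length(), (i + 1) * size)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(chunk("foobarspam", 3)); // prints [foo, bar, spa, m]
    }
}
```

Because each chunk depends only on its index, the result does not depend on the order in which elements are processed.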
Pankaj Singhal
1

This is a late answer, but I am putting it out there anyway for any new programmers to see:

If you do not want to use regular expressions and do not wish to rely on a third-party library, you can use this method instead, which takes between 89920 and 100113 nanoseconds on a 2.80 GHz CPU (less than a millisecond). It's not as pretty as Simon Nickerson's example, but it works:

   /**
     * Divides the given string into substrings each consisting of the provided
     * length(s).
     * 
     * @param string
     *            the string to split.
     * @param defaultLength
     *            the default length used for any extra substrings. If set to
     *            <code>0</code>, the last substring will start at the sum of
     *            <code>lengths</code> and end at the end of <code>string</code>.
     * @param lengths
     *            the lengths of each substring in order. If any substring is not
     *            provided a length, it will use <code>defaultLength</code>.
     * @return the array of strings computed by splitting this string into the given
     *         substring lengths.
     */
    public static String[] divideString(String string, int defaultLength, int... lengths) {
        java.util.ArrayList<String> parts = new java.util.ArrayList<String>();

        // Consume the explicitly requested lengths first. With no explicit
        // lengths this loop is skipped and only defaultLength is used.
        for (int i = 0, temp; i < lengths.length; i++) {
            temp = lengths[i];
            if (string.length() < temp) {
                parts.add(string);
                string = ""; // fully consumed, so the loop below must not add it again
                break;
            }
            parts.add(string.substring(0, temp));
            string = string.substring(temp);
        }
        // Split whatever is left into pieces of defaultLength characters.
        while (string.length() > 0) {
            // A short remainder, or a non-positive defaultLength, ends the split.
            if (string.length() < defaultLength || defaultLength <= 0) {
                parts.add(string);
                break;
            }
            parts.add(string.substring(0, defaultLength));
            string = string.substring(defaultLength);
        }

        return parts.toArray(new String[parts.size()]);
    }
Cardinal System
1

Using plain Java (9 or later, since Scanner.findAll was added in Java 9):

    String s = "1234567890";
    List<String> list = new Scanner(s).findAll("...").map(MatchResult::group).collect(Collectors.toList());
    System.out.printf("%s%n", list);

Produces the output:

[123, 456, 789]

Note that this discards leftover characters (0 in this case). Using the pattern ".{1,3}" instead keeps the shorter final piece.

vishal
0

You can also split a string at every n-th character and store each piece at its own index of a List.

Here I declare a list of Strings named Sequence:

List<String> Sequence;

Then I split the String "KILOSO" every 2 characters, so 'KI', 'LO' and 'SO' end up at separate indexes of the list called Sequence:

String S = "KILOSO";

Sequence = Arrays.asList(S.split("(?<=\\G..)"));

Printing the list:

System.out.print(Sequence);

should give:

[KI, LO, SO]

To verify, I can write:

System.out.print(Sequence.get(1));

which prints:

LO

0

I recently encountered this issue, and here is the solution I came up with:

final int LENGTH = 10;
String test = "Here is a very long description, it is going to be past 10";

Map<Integer,StringBuilder> stringBuilderMap = new HashMap<>();
for ( int i = 0; i < test.length(); i++ ) {
    int position = i / LENGTH; // 0<=i<10 then 0, 10<=i<20 then 1, 20<=i<30 then 2, etc.

    StringBuilder currentSb = stringBuilderMap.computeIfAbsent( position, pos -> new StringBuilder() ); // find sb, or create one if not present
    currentSb.append( test.charAt( i ) ); // add the current char to our sb
}

List<String> comments = stringBuilderMap.entrySet().stream()
        .sorted( Comparator.comparing( Map.Entry::getKey ) )
        .map( entrySet -> entrySet.getValue().toString() )
        .collect( Collectors.toList() );
//done



// here you can see the data
comments.forEach( cmt -> System.out.println( String.format( "'%s' ... length= %d", cmt, cmt.length() ) ) );
// PRINTS:
// 'Here is a ' ... length= 10
// 'very long ' ... length= 10
// 'descriptio' ... length= 10
// 'n, it is g' ... length= 10
// 'oing to be' ... length= 10
// ' past 10' ... length= 8

// make sure they are equal
String joinedString = String.join( "", comments );
System.out.println( "\nOriginal strings are equal " + joinedString.equals( test ) );
// PRINTS: Original strings are equal true
RobOhRob