2

I would like some guidance on how to split a string into N separate strings based on an arithmetic operation; for example, string.length()/300.

I am aware of ways to do it with delimiters such as

testString.split(",");

but how does one use greedy/reluctant/possessive quantifiers with the split method?


Update: As requested, an example similar to what I am looking to achieve:

String X = "32028783836295C75546F7272656E745C756E742E657865000032002E002E005C0";

Resulting in X/3 (more or less... done by hand)

X[0] = 32028783836295C75546F

X[1] = 6E745C756E742E6578650

X[2] = 65000032002E002E005C0

Don't worry about explaining how to put it into the array; I have no problem with that. I only need to know how to split without a delimiter, using an arithmetic operation instead.

Carlos

5 Answers

10

You could do that by splitting on (?<=\G.{5}) whereby the string aaaaabbbbbccccceeeeefff would be split into the following parts:

aaaaa
bbbbb
ccccc
eeeee
fff

The \G matches the (zero-width) position where the previous match ended. For the first match, \G is the beginning of the string. Note that by default the . meta char does not match line breaks, so if you want it to match every character, enable DOT-ALL: (?s)(?<=\G.{5}).

A demo:

class Main {
  public static void main(String[] args) {
    int N = 5;
    String text = "aaaaabbbbbccccceeeeefff";
    String[] tokens = text.split("(?<=\\G.{" + N + "})");
    for(String t : tokens) {
      System.out.println(t);
    }
  }
}

which can be tested online here: http://ideone.com/q6dVB
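To tie this back to the "split into N parts" wording of the question, here is a minimal sketch that derives the chunk length from the number of parts and then reuses the same regex. The class name RegexChunks and the method splitIntoParts are made up for illustration, and the ceiling division is my assumption about how a remainder should be handled (the last piece is simply shorter).

```java
import java.util.Arrays;

class RegexChunks {
    // Hypothetical helper: splits text into at most `parts` pieces
    // by splitting after every `size` characters with the \G look-behind.
    static String[] splitIntoParts(String text, int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        if (text.isEmpty()) {
            return new String[0];
        }
        // Ceiling division, so the number of pieces never exceeds `parts`
        int size = (text.length() + parts - 1) / parts;
        // (?s) lets . match line breaks as well
        return text.split("(?s)(?<=\\G.{" + size + "})");
    }

    public static void main(String[] args) {
        // 23 characters into 5 parts gives a chunk length of 5
        System.out.println(Arrays.toString(splitIntoParts("aaaaabbbbbccccceeeeefff", 5)));
    }
}
```

Note that the pieces are not balanced: all but the last have the same length, and for some inputs fewer than `parts` pieces come back.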

EDIT

Since you asked for documentation on regex, here are the specific tutorials for the topics the suggested regex contains:

Bart Kiers
  • thanks again, hey! great site by the way to give examples. Take it easy – Carlos Nov 08 '10 at 22:27
  • @Bart; if i may ask, could you let me know of any website u may know where i could learn this further, u seem to be very familiar with it, from ur answer and question :) http://stackoverflow.com/questions/1536915/regex-look-behind-without-obvious-maximum-length-in-java – Carlos Nov 08 '10 at 22:31
  • 1
    @Carlucho, if you're real serious, you need the book [Mastering Regular Expressions](http://oreilly.com/catalog/9781565922570) on your book-shelf. It is **the** book on regular expressions. As for online references, there are probably a lot of decent ones around, but this one is definitely in the top 5: http://www.regular-expressions.info/tutorial.html – Bart Kiers Nov 08 '10 at 22:35
  • @Bart Kiers, buddy could you explain me a little further what each of the characters does in this regex. Thanks – Carlos Nov 13 '10 at 16:54
4

If there's a fixed length that you want each String to be, you can use Guava's Splitter:

int length = string.length() / 300;
Iterable<String> splitStrings = Splitter.fixedLength(length).split(string);

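Each String in splitStrings, with the possible exception of the last, will have a length of length. The last may have a length between 1 and length. Note that fixedLength requires a positive length, so the snippet above assumes the string has at least 300 characters.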

Note that unlike String.split, which first builds an ArrayList<String> and then uses toArray() on that to produce the final String[] result, Guava's Splitter is lazy and doesn't do anything with the input string when split is called. The actual splitting and returning of strings is done as you iterate through the resulting Iterable. This allows you to iterate over the results without allocating a data structure to store them all, or to copy them into any kind of Collection you want without going through the intermediate ArrayList and String[]. Depending on what you want to do with the results, this can be considerably more efficient. It's also much clearer what you're doing than with a regex.
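To make the "lazy" point concrete without needing Guava on the classpath, here is a rough plain-Java sketch of the same idea. LazyChunks and its fixedLength method are hypothetical names of mine and are not Guava's implementation; they only illustrate that no piece is created until the iterator is advanced.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class LazyChunks {
    // Hypothetical helper (not part of Guava): yields fixed-length pieces on demand.
    static Iterable<String> fixedLength(final String text, final int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        // Nothing is split here; each iterator() call starts a fresh pass
        return () -> new Iterator<String>() {
            private int pos = 0; // start index of the next piece

            @Override
            public boolean hasNext() {
                return pos < text.length();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // The last piece may be shorter than `length`
                int end = Math.min(pos + length, text.length());
                String piece = text.substring(pos, end);
                pos = end;
                return piece;
            }
        };
    }

    public static void main(String[] args) {
        for (String piece : fixedLength("aaaaabbbbbccccceeeeefff", 5)) {
            System.out.println(piece);
        }
    }
}
```

Breaking out of the loop early means the remaining pieces are never built, which is the practical benefit of the lazy approach.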

ColinD
2

How about plain old String.substring? On JDKs before Java 7 update 6 it's memory friendly (as it reuses the original char array); newer versions copy the characters into a new array instead.
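A possible sketch of the substring approach that always returns exactly N pieces, spreading any remainder over the first few pieces so the lengths differ by at most one. The names EvenSplit and splitEvenly are made up, and balancing the remainder this way is my assumption; the question does not say what should happen when the length is not divisible by N.

```java
import java.util.Arrays;

class EvenSplit {
    // Hypothetical helper: returns exactly `parts` pieces whose lengths differ by at most 1.
    static String[] splitEvenly(String text, int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        String[] result = new String[parts];
        int base = text.length() / parts;   // minimum length of every piece
        int extra = text.length() % parts;  // this many pieces get one extra character
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int end = start + base + (i < extra ? 1 : 0);
            result[i] = text.substring(start, end);
            start = end;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEvenly("abcdefghij", 3)));
    }
}
```

If the string is shorter than the number of parts, the trailing pieces are empty strings rather than missing.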

Landei
1

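Dunno, you'll probably need a method that takes a string and an int number of parts and returns an array of strings. Something like this:

import java.util.ArrayList;
import java.util.List;

public String[] splitInto(String splitString, int parts)
{
   // Chunk length; at least 1 so the loop always advances
   int dlength = Math.max(1, splitString.length() / parts);
   List<String> retVal = new ArrayList<String>();
   for (int i = 0; i < splitString.length(); i += dlength)
   {
      // Clamp the end index so the last chunk cannot run past the string;
      // a remainder ends up in one extra, shorter chunk
      int end = Math.min(i + dlength, splitString.length());
      retVal.add(splitString.substring(i, end));
   }
   return retVal.toArray(new String[0]);
}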
Daniel Fath
1

well, I think this is probably as efficient a way to do this as any other.

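int N = 300;
int sublen = testString.length() / N;  // 0 if the string is shorter than N
String[] subs = new String[N];
for (int i = 0; i < N; i++) {
  // index by piece number, not by character offset;
  // the last piece also takes any remainder characters
  int end = (i == N - 1) ? testString.length() : (i + 1) * sublen;
  subs[i] = testString.substring(i * sublen, end);
}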

You can do it faster if you need the items as char[] arrays rather than as individual Strings, depending on how you need to use the results, e.g. by using testString.toCharArray().
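For the char[] variant, one way it might look is sketched below, using toCharArray() once and Arrays.copyOfRange for each piece. CharChunks and its split method are illustrative names, and I have not measured whether this is actually faster than substring for any given input.

```java
import java.util.Arrays;

class CharChunks {
    // Hypothetical helper: cuts the string's characters into pieces of `size` chars.
    static char[][] split(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        char[] chars = text.toCharArray();
        // Ceiling division: a remainder becomes one shorter final piece
        int count = (chars.length + size - 1) / size;
        char[][] pieces = new char[count][];
        for (int i = 0; i < count; i++) {
            int from = i * size;
            int to = Math.min(from + size, chars.length);
            pieces[i] = Arrays.copyOfRange(chars, from, to);
        }
        return pieces;
    }

    public static void main(String[] args) {
        for (char[] piece : split("aaaaabbbbbccccceeeeefff", 5)) {
            System.out.println(new String(piece));
        }
    }
}
```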

Sanjay Manohar