I have some Strings consisting of only digits, and I want to split each of them whenever the character changes.

For example:

  • "11101100112021120" goes to: {"111", "11", "11", "2", "2", "11", "2"}
  • "222222222" goes to {"222222222"}
  • "222222122" goes to {"222222", "1", "22"}
  • "000000000" goes to {}
  • "0000100000" goes to {"1"}
  • "11121222212112133321" goes to {"111", "2", "1", "2222", "1", "2", "11", "2", "1", "333", "2", "1"}

I want a nice way to do this.

I know two ways to go about this. One is brute force: walk the string and build the result section by section. The other is to collapse every run of 0's into a single 0, insert a 0 wherever the character changes, and then split on 0. Both of those ways just look dumb. If anyone has an idea for a better/prettier way to do this, regex or logic, it'd be nice.

David Cain
Justin Warner
  • Btw, Google links are appreciated. I couldn't find any search terms that worked. Also, I'm saving these in an ArrayList or any other easy way access them. – Justin Warner Feb 26 '13 at 23:39
  • You should change your requirements to _I have some Strings consisting of only digits, and I want to split it whenever the character changes_ **and on the digit 0**, to match your examples. – jlordo Feb 26 '13 at 23:51

2 Answers

This seems to work as you expect:

data.split("0+|(?<=([1-9]))(?=[1-9])(?!\\1)");
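For anyone reading the pattern for the first time, here is a sketch of the same regex written out piece by piece with `Pattern.COMMENTS`, which lets whitespace and `#` comments sit inside the pattern. The class name `AnnotatedSplitPattern` and the constant `RUN_BOUNDARY` are made up for illustration; only the regex itself comes from the line above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class AnnotatedSplitPattern {
    // Same regex as above, spread over several lines.
    // In COMMENTS mode, whitespace and everything after '#' on a line is ignored.
    static final Pattern RUN_BOUNDARY = Pattern.compile(
        "0+              # a run of zeros, consumed as a separator\n" +
        "|               # or a zero-width position where\n" +
        "(?<=([1-9]))    # the previous character is a non-zero digit, saved as group 1\n" +
        "(?=[1-9])       # the next character is a non-zero digit\n" +
        "(?!\\1)         # and that next character differs from group 1",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        // Pattern.split(input) behaves like String.split(regex) with the same pattern.
        System.out.println(Arrays.toString(RUN_BOUNDARY.split("11101100112021120")));
        // prints [111, 11, 11, 2, 2, 11, 2]
    }
}
```

Compiling the pattern once like this also avoids recompiling the regex on every `split` call if you process many strings.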

Test:

String[] tests = { "11101100112021120", "222222222", "222222122",
        "000000000", "0000100000", "11121222212112133321" };

for (String data : tests) {
    System.out.println(data + " ->" + Arrays.toString(data.split("0+|(?<=([1-9]))(?=[1-9])(?!\\1)")));
    System.out.println("-----------------------");
}

output:

11101100112021120 ->[111, 11, 11, 2, 2, 11, 2]
-----------------------
222222222 ->[222222222]
-----------------------
222222122 ->[222222, 1, 22]
-----------------------
000000000 ->[]
-----------------------
0000100000 ->[, 1]     // <-- only problem - empty first element 
-----------------------
11121222212112133321 ->[111, 2, 1, 2222, 1, 2, 11, 2, 1, 333, 2, 1]
-----------------------

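Unfortunately, leading zeros make the resulting array start with an extra empty String. To get rid of it, remove those zeros before splitting with data.replaceFirst("^0+(?=[^0])", ""). The lookahead leaves a string made only of zeros untouched, so it still splits to an empty array.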
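Putting the two steps together could look something like this. It is only a minimal sketch: the class name `SplitDigitRuns` and the helper `splitRuns` are hypothetical names, and the two regexes are the ones from this answer.

```java
import java.util.Arrays;

class SplitDigitRuns {
    // Split on runs of zeros, or between two different non-zero digits.
    private static final String SPLIT_REGEX = "0+|(?<=([1-9]))(?=[1-9])(?!\\1)";

    // Hypothetical helper: strips leading zeros, then splits.
    static String[] splitRuns(String data) {
        // Only strip leading zeros when a non-zero character follows them,
        // so an all-zero string is left alone and still splits to an empty array.
        String trimmed = data.replaceFirst("^0+(?=[^0])", "");
        return trimmed.split(SPLIT_REGEX);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitRuns("0000100000"))); // prints [1]
        System.out.println(splitRuns("000000000").length);            // prints 0
    }
}
```

Checking `length` rather than the `Arrays.toString()` output makes it easy to tell an empty array apart from an array holding one empty String.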

Pshemo
  • Note that for a string of all zeroes, doing `replaceFirst("^0+", "")` will cause the split to produce an array with one empty string. Try `replaceFirst("^0+(?=[^0])", "")` instead to only do the replace when the string starts with zeroes, but isn't *all* zeroes. – matts Feb 27 '13 at 00:18
  • @matts Nice catch. It's hard to see if result of Arrays.toString() -> `[]` is empty array, or array with one empty String :) – Pshemo Feb 27 '13 at 00:30

Try

 str.split( "0+|(?<=(\\d))(?!\\1)" )

For strings containing zeros, you will then have to iterate through the array and remove any empty elements.
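The clean-up step might look like the sketch below, which collects the non-empty pieces into an `ArrayList` (the asker mentioned storing them in one). The class name `SplitAndFilter` and the method `splitRuns` are invented for the example; the regex is the one from this answer.

```java
import java.util.ArrayList;
import java.util.List;

class SplitAndFilter {
    // Hypothetical helper: splits with the regex above and drops empty elements.
    static List<String> splitRuns(String str) {
        List<String> runs = new ArrayList<>();
        for (String part : str.split("0+|(?<=(\\d))(?!\\1)")) {
            // Zeros leave empty strings behind in the array; skip them.
            if (!part.isEmpty()) {
                runs.add(part);
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(splitRuns("11101100112021120")); // prints [111, 11, 11, 2, 2, 11, 2]
        System.out.println(splitRuns("0000100000"));        // prints [1]
        System.out.println(splitRuns("000000000"));         // prints []
    }
}
```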

MikeM
  • This works as said, you will have null/empty elements, which is fine, however Pshemo's does not produce empty elements, but thank you so much! – Justin Warner Feb 27 '13 at 00:10