0

Is there a better(Faster) way to split a binary string into an Array?

My code That loops and substring every 8 characters in one element.

binary = my binary string(Huge) : "1010101011111000001111100001110110101010101"
int index = 0;

while (index < binary.length()) {
   int num = binaryToInteger(binary.substring(index, Math.min(index + 8,binary.length())));
   l.add( num);              
   temp = temp+ String.valueOf(num);
   index += 8;
}

What I am trying to do is to split my binary string into pieces of 8 characters 10101010 and then get the int value of the 8 characters and will store that in arraylist witch in this case was l

My code is working but is very time consuming.. Is there a faster way of getting this done?

ryanyuyu
  • 6,366
  • 10
  • 48
  • 53
Renier
  • 1,738
  • 3
  • 24
  • 51
  • 1
    Your code is fine, but What is temp? – Davide Lorenzo MARINO Oct 08 '15 at 12:46
  • In any case, you will have to read and traverse through the whole binary string. – vish4071 Oct 08 '15 at 12:46
  • http://stackoverflow.com/questions/12295711/split-a-string-at-every-nth-position/12295805#12295805 – Adil Shaikh Oct 08 '15 at 12:47
  • `temp = temp+ String.valueOf(num);` is the most problematic thing here. Use a StringBuilder. Otherwise, you're creating a whole lot of temporary String objects that need to be copied at each iteration, and garbage-collected. – JB Nizet Oct 08 '15 at 12:49
  • Why do you need `temp`? You have all the data already in `l`. Maybe that `temp` is what slows things down. The rest looks okay. If your binary String is really "huge", you could avoid reading it into memory in the first place and use a `Reader` instead. – Thilo Oct 08 '15 at 12:55
  • You are right i dont need temp, I just checked something. – Renier Oct 08 '15 at 12:56

5 Answers5

2

It's easy using regex:

binary.split("(?<=\\G.{8})");

However, it creates an array of strings. I don't get your will of creating an array of integers, since binary strings don't fit into this type (they can start with "0" and they can be really long).

Andrea
  • 6,032
  • 2
  • 28
  • 55
  • If the binary String is "huge", I would avoid materializing the splitted array. Looping over substrings seems better. Ideally, the String would never be created in memory in the first place. Where does it come from? Cannot it be accessed piecemeal with a Reader? – Thilo Oct 08 '15 at 12:57
  • @Thilo: I agree about the displaying issue. I modified the code maintaining the "core" functionality. – Andrea Oct 08 '15 at 13:00
  • 1
    Wow Thanks This increased the speed allot!! – Renier Oct 08 '15 at 13:13
  • @Renier: That's the power of regex! :-D Glad to have helped. – Andrea Oct 08 '15 at 13:16
0

I think there are mutiple options using Regex, substring, split etc available in java or Google Guavas - Splitter.fixedLength().

Splitter.fixedLength(8).split("1010101011111000001111100001110110101010101");

This Split a string, at every nth position clearly explain the performance of various functions.

Community
  • 1
  • 1
Ankur Singhal
  • 26,012
  • 16
  • 82
  • 116
0

It would probably faster using toCharArray:

    Long time = System.currentTimeMillis();
    List<Integer> l = new ArrayList<>();
    int index = 0;
    String binary =
    "1010101011111000001111100001110110101";
    char[] binaryChars = binary.toCharArray();

    while (index < binaryChars.length) {
        int num = 0;
        for (int offset = 0; offset < Math.min(8, binary.length() - index); offset++) {
            int bo = index + offset;
            if (binaryChars[bo] == '1') {
                num += Math.pow(2, offset + 1);
            }
        }
        l.add(num);
        index += 8;
    }

    System.out.println(System.currentTimeMillis() - time);
Jaec
  • 390
  • 2
  • 8
  • 18
  • Why do you think, creating a copy using `toCharArray()` is faster? You can do the same by using `binary.length()` instead of `binaryChars.length` and `binary.charAt(bo)` instead of `binaryChars[bo]` and spare the entire copying operation. Besides that, if you care for performance, you shouldn’t use `Math.pow` (hint: you know what `1 << (offset+1)` does?)… – Holger Oct 08 '15 at 13:49
  • @Holger: In respect to `Math.pow` see: http://stackoverflow.com/a/29146005/221135 – Jaec Oct 08 '15 at 14:36
  • @Holger: But I think you have a point with the former, since `charAt` is an O(1) operation. – Jaec Oct 08 '15 at 14:44
  • Even an intrinsic `pow` is still way more expensive than an *integer* shift operation. The rest of that answer does not apply as here, it’s the *base* which is `2`, not the *exponent*. And while transforming `x²` to `x*x` is valid for `double` values, transforming `2^x` to `1< – Holger Oct 08 '15 at 14:47
  • What makes it strange is, that you are actually performing a bit shifting operation *on a conceptional level* as the whole operation is about bits. So writing that bit shift as a `pow` is actually making it harder to read in this case… – Holger Oct 08 '15 at 14:49
0

Since you want to split into groups of eight bits, I guess, you want to decode bytes rather than ints. This is easier than you might think:

String binary = "1010101011111000001111100001110110101010101";
byte[] result=new BigInteger(binary, 2).toByteArray();
Holger
  • 285,553
  • 42
  • 434
  • 765
-2

Maybe you can try making a for-each loop to go through each character in the string, combine them into 8-bit values and convert into bytes. I don't know if that will be faster, just a suggestion.

for (char c : binary.toCharArray() ) { do stuff }
Myzreal
  • 351
  • 4
  • 16
  • 1
    That gets only one character at a time. We want eight in each iteration. – Thilo Oct 08 '15 at 12:49
  • Yes, which is why I mentioned the "combine them into 8-bit values" part. It's not hard to hold a counter to do that. Or make a regular for loop that increments by 8 each loop. – Myzreal Oct 08 '15 at 12:51
  • ... which is exactly what the original code in the question does. – Thilo Oct 08 '15 at 12:51
  • It's not even similar actually, because the original uses multiple substring calls. This is a different approach. As mentioned, I do not know if it will be faster from the original, it's a suggestion. – Myzreal Oct 08 '15 at 12:54