Cost to XOR byte array seems to be very very low, lower than arraycopy

Question

I am working on an encryption class, mostly as an academic exercise and I have been experimenting to get the highest possible speed. I've found something strange in that XOR'ing a byte array has a very low cost, but using arraycopy on a byte array of the same size is more expensive. I figured it must be some JIT voodoo and was wondering if anyone might explain it.

Relevant Code:

private byte[] cryptBlock(){
    byte[] iv = Arrays.copyOf(IV, IV.length);
    iv[blocNo % BLOCKSIZE] += blockNo + 1;
    iv = Misc.cleanXOR(key, iv); //A
    iv = Arrays.copyOf(iv, BLOCKSIZE + iv.length); //B
    System.arraycopy(key, 0, iv, BLOCKSIZE, BLOCKSIZE); //C
    return Misc.cleanXOR(buffer, mD.digest(iv));
}
public static byte[] cleanXOR(byte[] a, byte[] b){
    byte[] c = new byte[a.length];
    int i=0;
    for (byte d : a)
        c[i] = (byte) (d ^ b[i++]);
    return c;
}

cryptBlock is called once every 32 bytes, I'm encrypting a 1MB byte array a couple times and averaging the time to get speed.

Commenting out line A, but not line B or C runs in the same amount of time (20MB/s) it takes as commenting out none of the lines, despite doing an XOR on about 3125000 blocks of 32 bytes.

Commenting out lines B and C but not A runs at 35MB/s

Commenting out all the lines(A, B and C) runs at 37MB/s

Can anyone explain this?

EDIT: I wrote a little arraycopy implementation to compare speeds, it runs about as fast in my code as System.arraycopy.

public static void arraycopy(byte[] source, int srcPos, byte[] dest, int destPos, int length){
        for(int i = 0; i < length; i++){
            dest[i + destPos] = source[i + srcPos];
        }
    }

Due to the complexity of the execution environment, there are many pitfalls to benchmarking Java code. Make sure to read http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java — NPE, Jan 12 '14 at 08:45
The first two lines in cryptBlock are confusing me. Arrays.copyOf creates a copy of the entire IV array, then you do copy BLOCKSIZE nr of elements from IV to iv. But iv already contains the exact same data as IV did? — Rickard, Jan 15 '14 at 15:58
@Rickard ah, didn't notice that. It was a leftover from some tweaking I was doing earlier and I forgot to take it out. Removed the line from the question. — kag0, Jan 15 '14 at 20:11

score 1 · Answer 1 · edited May 23 '17 at 11:49

1

I'm encrypting a 1MB byte array a couple times...

Due to the complexity of the Java execution environment, there are many pitfalls to benchmarking Java code.

Simply running the code a couple of times and timing it doesn't sounds like an adequate benchmarking technique.

Before you draw any conclusions from your experiments, make sure to read How do I write a correct micro-benchmark in Java? and follow the recommendations therein.

edited May 23 '17 at 11:49

Community

1
1

answered Jan 12 '14 at 08:47

NPE

486,780
108
951
1,012

What I'm running is pretty simple, just a few classes and everything is straightforward. I did add some things to my test from your link which has shown me that the garbage collector runs more when I'm calling arraycopy and everything goes a bit faster now that the warm up is in place. Just XOR is still faster than appending those two arrays though. – kag0 Jan 12 '14 at 09:07
Looking into the garbage collector more, the versions of the code that use arraycopy run the gc 5 or 6 more times when encrypting the array than just XOR. Does that sound like a reasonable explanation for the increased speed, and why does the gc run more, I'm making one new array and discarding one array either way? – kag0 Jan 12 '14 at 17:16

score 0 · Accepted Answer · answered Jan 15 '14 at 21:09

I am a complete idiot with my brain turned off. The speed difference has nothing to do with XOR vs arraycopy, and everything to do with the size of the array that goes into the messageDigest at the end of the method. Obviously hashing a BLOCKSIZE array will be faster than a 2*BLOCKSIZE array.

Cost to XOR byte array seems to be very very low, lower than arraycopy

2 Answers2