I have a simple application that (for now) simulates error correction in a large array.
This part generates the data: each 255-byte block holds 239 data bytes followed by 16 bytes of Reed-Solomon parity, and one byte per block is then deliberately corrupted.
    ReedSolomonEncoder encoder = new ReedSolomonEncoder(QR_CODE_FIELD_256);
    int[][] data = new int[params.getNumBlocks()][255];
    int[][] original = new int[params.getNumBlocks()][];
    int value = 0;
    for (int i = 0; i < params.getNumBlocks(); i++) {
        int[] block = data[i];
        // Fill the 239 data bytes; the last 16 slots are left for parity
        for (int j = 0; j < 239; j++) {
            value = (value + 1) % 256;
            block[j] = value;
        }
        encoder.encode(block, 16);
        original[i] = Arrays.copyOf(block, block.length);
        // Corrupt a byte
        block[50] += 1;
    }
This is my kernel:
    public class RsKernel implements Kernel {
        private final int[] block;

        public RsKernel(int[] block) {
            this.block = block;
        }

        @Override
        public void gpuMethod() {
            block[50] -= 1;
        }
    }
It merely reverts the corrupted byte in each block by hand; it does not perform actual Reed-Solomon error correction.
I run the kernels with the following code:
    ArrayList<Kernel> kernels = new ArrayList<>(params.getNumBlocks());
    for (int[] block : data) {
        kernels.add(new RsKernel(block));
    }
    new Rootbeer().run(kernels);
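One experiment that might show whether the limit applies per launch is to submit the kernels in smaller batches. The following is only a sketch in plain Java: `BatchRunner` and `runInBatches` are hypothetical names of my own, not part of Rootbeer, and I have not confirmed that batching avoids the problem.

```java
import java.util.List;
import java.util.function.Consumer;

final class BatchRunner {
    /**
     * Splits items into consecutive batches of at most batchSize elements and
     * passes each batch to the launcher. Returns the number of launches made.
     * (Hypothetical helper; not part of any library.)
     */
    public static <T> int runInBatches(List<T> items, int batchSize, Consumer<List<T>> launcher) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int launches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // subList returns a view of the original list, not a copy
            launcher.accept(items.subList(start, end));
            launches++;
        }
        return launches;
    }
}
```

With the code above it could be called as `BatchRunner.runInBatches(kernels, 8192, batch -> new Rootbeer().run(batch));`. This assumes `run` accepts a `subList` view; if it does not, each batch could be copied into a new `ArrayList` first.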
And I verify decoding with JUnit's assertArrayEquals:
Assert.assertArrayEquals(original, data);
The curious part is that with up to 8192 blocks (one kernel per block; what a suspiciously convenient number), the data is reported as correctly decoded, but with 8193 blocks or more it is not:
Exception in thread "main" arrays first differed at element [8192][50]; expected:<51> but was:<52>
at org.junit.Assert.internalArrayEquals(Assert.java:437)
at org.junit.Assert.internalArrayEquals(Assert.java:428)
at org.junit.Assert.assertArrayEquals(Assert.java:167)
at org.junit.Assert.assertArrayEquals(Assert.java:184)
at com.amphinicy.blink.rootbeer.RootBeerDemo.main(Jasmin)
What could cause this behaviour?
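Since assertArrayEquals stops at the first difference, a fuller report of which blocks were left uncorrected may help narrow this down. Below is a minimal sketch in plain Java (no Rootbeer or ZXing). `MismatchReport` and `mismatchedBlocks` are hypothetical names, and the data in `main` is simulated on the assumption that only the first 8192 blocks get reverted, mirroring what I observe.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class MismatchReport {
    /** Returns the indices of all blocks whose contents differ between expected and actual. */
    public static int[] mismatchedBlocks(int[][] expected, int[][] actual) {
        return IntStream.range(0, expected.length)
                .filter(i -> !Arrays.equals(expected[i], actual[i]))
                .toArray();
    }

    public static void main(String[] args) {
        int numBlocks = 8200;
        int processed = 8192; // assumption: only this many kernels took effect
        int[][] original = new int[numBlocks][255];
        int[][] data = new int[numBlocks][255];
        for (int i = 0; i < numBlocks; i++) {
            // Value at index 50 under the generation scheme above: (i * 239 + 51) % 256
            original[i][50] = (i * 239 + 51) % 256;
            // Blocks beyond the processed range keep their +1 corruption
            data[i][50] = original[i][50] + (i < processed ? 0 : 1);
        }
        int[] bad = mismatchedBlocks(original, data);
        System.out.println("mismatched blocks: " + bad.length
                + ", first: " + (bad.length > 0 ? bad[0] : -1));
    }
}
```

Running the same helper on the real `original` and `data` arrays would show whether every block from 8192 onwards is affected or only some of them.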
Here is the output of java -jar rootbeer-1.1.14.jar -printdeviceinfo:
device count: 1
device: GeForce GT 525M
compute_capability: 2.1
total_global_memory: 1073414144 bytes
num_multiprocessors: 2
max_threads_per_multiprocessor: 1536
clock_rate: 1200000 Hz