
I'm studying Aparapi (https://code.google.com/p/aparapi/) and am seeing strange behaviour in one of the included samples. The sample is the first one, "add". It builds and runs fine. I also added the following code to check whether the GPU is really used:

if(!kernel.getExecutionMode().equals(Kernel.EXECUTION_MODE.GPU)){
    System.out.println("Kernel did not execute on the GPU!");
}

and it works fine. But if I change the size of the array from 512 to a number greater than 999 (for example 1000), I get the following output:

!!!!!!! clEnqueueNDRangeKernel() failed invalid work group size
after clEnqueueNDRangeKernel, globalSize[0] = 1000, localSize[0] = 128
Apr 18, 2013 1:31:01 PM com.amd.aparapi.KernelRunner executeOpenCL
WARNING: ### CL exec seems to have failed. Trying to revert to Java ###
JTP

Kernel did not execute on the GPU!

Here's my code:

  final int size = 1000;

  final float[] a = new float[size];
  final float[] b = new float[size];

  for (int i = 0; i < size; i++) {
     a[i] = (float)(Math.random()*100);
     b[i] = (float)(Math.random()*100);
  }

  final float[] sum = new float[size];

  Kernel kernel = new Kernel(){
     @Override public void run() {
        int gid = getGlobalId();
        sum[gid] = a[gid] + b[gid];
     }
  };

  Range range = Range.create(size);
  kernel.execute(range);

  System.out.println(kernel.getExecutionMode());
  if (!kernel.getExecutionMode().equals(Kernel.EXECUTION_MODE.GPU)){
     System.out.println("Kernel did not execute on the GPU!");
  }

  kernel.dispose();

}

I tried specifying the size using

Range range = Range.create(size, 128);

as suggested in a Google group, but nothing changed.

I'm currently running on Mac OS X 10.8 with Java 1.6.0_43. Aparapi version is the latest (2012-01-23).

Am I missing something? Any ideas?

Thanks in advance

besil

1 Answer


Aparapi inherits a 'Grid Style' of implementation from OpenCL. When you specify a range of execution (say 1024), OpenCL will break this 'range' into groups of equal size, for example 4 groups of 256 or 8 groups of 128.

The group size must be a factor of range (so assert(range%groupSize==0)).
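To make the divisibility rule concrete, here is a minimal plain-Java sketch. It makes no Aparapi or OpenCL calls, and the class and method names (`GroupCheck`, `groupCount`) are made up for illustration; it only checks whether a group size divides a range and how many groups would result.

```java
// Hypothetical helper, not part of Aparapi: mirrors the OpenCL rule that the
// group (local) size must divide the range (global size) evenly.
class GroupCheck {

    /** Returns the number of groups, or -1 if groupSize does not divide range. */
    static int groupCount(int range, int groupSize) {
        if (groupSize <= 0 || range % groupSize != 0) {
            return -1; // this combination would be rejected
        }
        return range / groupSize;
    }

    public static void main(String[] args) {
        System.out.println(groupCount(1024, 256)); // 4 groups of 256
        System.out.println(groupCount(1024, 128)); // 8 groups of 128
        System.out.println(groupCount(1000, 128)); // -1, since 1000 % 128 == 104
    }
}
```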

By default Aparapi internally selects the group size.

But you are choosing to fully specify both the range and the group size by using

Range r = Range.create(n, 128);

You are responsible for ensuring that n%128==0.

From the error, it looks like you chose Range.create(1000, 128).

Sadly 1000 % 128 != 0 so this range will fail.
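If a fixed group size of 128 is really wanted, one common OpenCL workaround is to round the range up to the next multiple of the group size and let the kernel ignore the extra ids. This is not something the question or the sample prescribes, and `PaddedRange` and `roundUp` below are hypothetical names; the sketch only shows the arithmetic.

```java
// Hypothetical helper, not part of Aparapi: pads a range up to the next
// multiple of the chosen group size.
class PaddedRange {

    /** Smallest multiple of groupSize that is >= n (both assumed positive). */
    static int roundUp(int n, int groupSize) {
        return ((n + groupSize - 1) / groupSize) * groupSize;
    }

    public static void main(String[] args) {
        int size = 1000;
        int padded = roundUp(size, 128);
        System.out.println(padded);       // 1024
        System.out.println(padded % 128); // 0, so 128 is now a valid group size
    }
}
```

With a padded range of 1024, the kernel body would need a guard such as `if (gid < size)` so that ids 1000 to 1023 do not index past the end of the arrays.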

If you specify

Range r = Range.create(n);

Aparapi will choose a valid group size by picking the largest factor of n that the device can use as a group size.

Try dropping the 128 as the second arg.
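One way to picture how a valid group size might be chosen for a single-argument range is the plain-Java sketch below. It is only an illustration of the idea, not Aparapi's actual code: `largestFactorAtMost` is a made-up name, and the limit of 256 is an assumption, since the real limit is reported by the OpenCL device.

```java
// Illustration only, not Aparapi's implementation: pick the largest factor
// of n that does not exceed an assumed maximum group size.
class GroupSizeSketch {

    static int largestFactorAtMost(int n, int maxGroupSize) {
        // Walk down from the largest allowed size until one divides n evenly.
        for (int candidate = Math.min(n, maxGroupSize); candidate > 1; candidate--) {
            if (n % candidate == 0) {
                return candidate;
            }
        }
        return 1; // a group size of 1 always divides n
    }

    public static void main(String[] args) {
        int assumedMax = 256; // assumption; query the device for the real value
        System.out.println(largestFactorAtMost(999, assumedMax));  // 111 (9 groups)
        System.out.println(largestFactorAtMost(1000, assumedMax)); // 250 (4 groups)
        System.out.println(largestFactorAtMost(1024, assumedMax)); // 256 (4 groups)
    }
}
```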

Gary

gfrost
  • Thanks for the answer. I dropped the 128 in Range.create() and everything works fine when 'size' is a multiple of 128. You said Aparapi internally selects the group size, which must be a factor of the range. Then why does the code work with size=999? 999%128=103, but it executes on the GPU without errors. Do I always have to use a size that is a multiple of my localsize? How can I discover the localsize value? Thanks a lot – besil Apr 20 '13 at 15:19
  • Just use the single-arg version, Range.create(size). Aparapi will choose the largest usable factor for you. For 999 this should be something like 9*111. By using the Range.create(X, 128) factory method you are choosing to take control and forcing Aparapi to use localsize 128. Here Aparapi 'honors' your request, but OpenCL (rightly) rejects it at runtime and fails to execute. Beware that on Mac OS X there have been various reports of the device (GPU) reporting incorrect group sizes. I am not sure you are hitting this. I have Mac OS X and your code works fine for me. – gfrost Apr 21 '13 at 13:37