I have an OpenCL kernel:

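__kernel
void add(__global float* A, const int inputSize)
{
   int threadId = get_local_id(0);
   int blockSize = get_local_size(0);
   int groupId = get_group_id(0);
   // Each work-group covers 2 * blockSize elements; this work-item reads i and i + blockSize.
   int i = 2 * groupId * blockSize + threadId;

   // A holds floats, so print with %f; passing a float to %d is undefined.
   if (i < inputSize && i + blockSize < inputSize)
       printf("%f %f\n", A[i], A[i + blockSize]);

   .....Doing some more things.....
}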


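Host side code:

int main()
{
 ........

    // Main kernel call.
    // clEnqueueNDRangeKernel takes the work sizes as const size_t *,
    // so these variables must be size_t rather than int.
    size_t global_item_size = 4;
    size_t local_item_size = 2;
    clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL, &global_item_size, &local_item_size, 0, NULL, NULL);

.......
}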

So 2 work-groups are launched, each with 2 work-items (threads).

Each thread processes two elements of array A, at indices i and i + blockSize.

inputSize is 8.
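To make the index mapping concrete, here is a small host-side sketch in plain C that mirrors the kernel's arithmetic. The helper names `first_index` and `second_index` are hypothetical and not part of the original code; they only reproduce `i = 2 * groupId * blockSize + threadId` and `i + blockSize`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: first index a work-item reads, mirroring the
   kernel's i = 2 * groupId * blockSize + threadId. */
size_t first_index(size_t group_id, size_t block_size, size_t thread_id)
{
    assert(thread_id < block_size); /* a local id is always below the local size */
    return 2 * group_id * block_size + thread_id;
}

/* Hypothetical helper: second index the same work-item reads, i + blockSize. */
size_t second_index(size_t group_id, size_t block_size, size_t thread_id)
{
    return first_index(group_id, block_size, thread_id) + block_size;
}
```

With blockSize = 2, group 0 reads the pairs (0, 2) and (1, 3), and group 1 reads (4, 6) and (5, 7), so the four threads cover indices 0 through 7 exactly once for inputSize = 8.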

The issue is that this kernel runs without errors and gives correct results in debug mode. The printf statement in the kernel prints the right values, and if I copy the values back to the CPU I can print them correctly there too.

As soon as I switch to release mode, all I get is 0's in my arrays. If I print A in the kernel, it prints all 0's.

I am not sure what is wrong in release mode. It is definitely not a synchronization or indexing issue, since I am just printing the input array as soon as the kernel starts. Has anyone faced a similar issue?

Thanks in advance.

  • Have you checked the answers at https://stackoverflow.com/q/5782388/48660 ? Also, are you checking that your kernel compiles and builds successfully? All zeroes suggests that your kernel is never even running. – pmdj Jun 20 '20 at 18:50
  • Yes, I am checking that the kernel builds properly, and in debug mode I can execute the kernel and get the correct result. Also, as per the link you posted, I might need to enable an extension, but then how does printf work in debug mode? – Paritosh Kulkarni Jun 20 '20 at 18:56
  • To be clear, how are you toggling debug and release mode? Are we talking about compiler options for the *host* or when building the *kernel*? – pmdj Jun 20 '20 at 18:57
  • Yes right. Just like setting release/debug mode in visual studio. – Paritosh Kulkarni Jun 20 '20 at 18:59
  • It sounds like you probably have undefined behaviour in your host program, and your issue is nothing to do with OpenCL or the kernel code. I recommend enabling diagnostic options such as UBSan in your compiler, or running with runtime diagnostic VMs such as valgrind. – pmdj Jun 20 '20 at 19:03
  • You mean issue is in host side code? – Paritosh Kulkarni Jun 20 '20 at 19:04
  • Yes; given what you’ve posted, I don’t have much to go on but it sounds like it. Memory corruption somewhere, or something like that. At least that’s where I’d start looking for the problem. Find out where debug and release mode start diverging for example. There should be no difference, so if there is that’s usually because of some code whose behaviour is undefined. But yeah, that’s a hunch, not a definite statement. – pmdj Jun 20 '20 at 19:39
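
One host-side candidate for the kind of undefined behaviour suggested above, offered as a guess rather than a diagnosis: `clEnqueueNDRangeKernel` takes its global and local work sizes as `const size_t *`. If a host program passes the address of an `int` instead, then on a platform where `size_t` is wider than `int` the runtime reads more bytes than the `int` holds. The sketch below uses a hypothetical helper, `read_as_size_t`, to model that read safely with `memcpy` over an array of two ints. With two separate local variables, what sits in the extra bytes depends on the stack layout the compiler chooses, which is the sort of thing that can differ between debug and release builds.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the value a callee would see if it read
   sizeof(size_t) bytes starting at an int, which is what happens when an
   int* is passed where a const size_t* is expected. `ints` must point to
   `count` ints; no bytes past the array are read. */
size_t read_as_size_t(const int *ints, size_t count)
{
    size_t value = 0;
    size_t available = count * sizeof(int);
    size_t n = available < sizeof(size_t) ? available : sizeof(size_t);
    memcpy(&value, ints, n); /* copy the raw bytes, as the callee would read them */
    return value;
}
```

Calling `read_as_size_t` on `{4, 2}` returns 4 only when `size_t` and `int` have the same size. When `size_t` is wider, the bytes of the neighbouring 2 are folded into the result, so the work size the runtime sees is no longer 4.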
