No, there is no guaranty that address spaces match. For the basic types (float, int,…) you have alignment requirement (section 6.1.5 of the standard) and you have to use the cl_type name of the OpenCL implementation (when programming in C, pyopencl does the job under the hood I’d say).
For the pointers it’s even simpler due to this mismatch. The very beginning of section 6.9 of the standard v 1.2 (it’s section 6.8 for version 1.1) states:
Arguments to kernel functions declared in a program that are pointers
must be declared with the __global, __constant or __local qualifier.
And in the point p.:
Arguments to kernel functions that are declared to be a struct or
union do not allow OpenCL objects to be passed as elements of the
struct or union.
Note also the point d.:
Variable length arrays and structures with flexible (or unsized)
arrays are not supported.
So, no way to make you kernel runs as described in your question and that's why you haven’t been able to find some examples of OpenCl struct have pointers as members.
I still can propose a workaround that takes advantage of the fact that the kernel is compiled in JIT. It still requires that you pack you data properly and that you pay attention to the alignment and finally that the size doesn’t change during the execution of the program. I honestly would go for a kernel taking 3 buffers as arguments, but anyhow, there it is.
The idea is to use the preprocessor option –D as in the following example in python:
Kernel:
typedef struct {
uint a[SIZE];
uint b[SIZE];
uint c[SIZE];
} SumParameters;
kernel void foo(global SumParameters *params){
int idx = get_global_id(0);
params->c[idx] = params->a[idx] + params->b[idx];
}
Host code:
import numpy as np
import pyopencl as cl
def bar():
mf = cl.mem_flags
ctx = cl.create_some_context()
queue = cl.CommandQueue(self.ctx)
prog_f = open('kernels.cl', 'r')
#a = (1, 2, 3), b = (4, 5, 6)
ary = np.array([(1, 2, 3), (4, 5, 6), (0, 0, 0)], dtype='uint32, uint32, uint32')
cl_ary = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=ary)
#Here should compute the size, but hardcoded for the example
size = 3
#The important part follows using -D option
prog = cl.Program(ctx, prog_f.read()).build(options="-D SIZE={0}".format(size))
prog.foo(queue, (size,), None, cl_ary)
result = np.zeros_like(ary)
cl.enqueue_copy(queue, result, cl_ary).wait()
print result
And the result:
[(1L, 2L, 3L) (4L, 5L, 6L) (5L, 7L, 9L)]