
I'm trying out some OpenCL and wondered if there is a way to pass functions as a parameter to a kernel or what is the closest available match for it (using OpenCL 1.2).

As an example consider a simple Monte Carlo integration like this:

/* this is 1/(2^32) */
#define MULTI (2.3283064365386962890625e-10)

/* for more information see: https://arxiv.org/pdf/2004.06278v2.pdf*/
uint
squares(ulong ctr, ulong key)
{
  ulong x, y, z;
  y = x = ctr * key;
  z = y + key;
  x = x * x + y;
  x = (x >> 32) | (x << 32);                /* round 1 */
  x = x * x + z; x = (x >> 32) | (x << 32); /* round 2 */
  return (x * x + y) >> 32;                 /* round 3 */
}

void
kernel
reduce(ulong  key,
       float  low,
       float  high,
       global float* partialSums,
       local  float* localSums)
{
  uint lid = get_local_id(0);

  float rand = squares(get_global_id(0), key) * MULTI;
  localSums[lid] = f((rand * (high - low)) + low);

  for (uint stride =  get_local_size(0) / 2; stride > 0; stride /= 2) {
    barrier(CLK_LOCAL_MEM_FENCE);

    if (lid < stride)
      localSums[lid] += localSums[lid + stride];
  }

  if (lid == 0)
    partialSums[get_group_id(0)] = localSums[0];
}

I found "Passing a function as an argument in OpenCL", which tells me that passing function pointers won't work. So I guess what would work is generating the kernel source with f defined at runtime and then compiling it (has this been done before? If so, where do I find it?). Maybe this kind of problem is easier to solve not using OpenCL but using SYCL (which I know virtually nothing about)?

I'm relatively new to this, so if this kind of problem is solved in a completely different manner, please let me know.

Malte

2 Answers


generating the kernel source with f defined at runtime and then compiling it

Yes, it can be done. You could generate the whole source from scratch and then use the classic clCreateProgramWithSource + clBuildProgram.
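A minimal host-side sketch of the source-generation step, in plain C, assuming f takes and returns a single float as in the question's kernel. The helper name make_program_source, the F_TEMPLATE macro and the parameter names are hypothetical. The helper only prepends a definition of f to the static program text, so that f is defined before the kernel that calls it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical template: a definition of f, followed by the static source. */
#define F_TEMPLATE "float f(float x) { return %s; }\n%s"

/* Builds "float f(float x) { return <f_expr>; }\n<static_src>".
   Returns a malloc'ed string (caller frees), or NULL on failure. */
char *make_program_source(const char *f_expr, const char *static_src)
{
    assert(f_expr != NULL && static_src != NULL);

    /* First pass: ask snprintf how many characters are needed. */
    int n = snprintf(NULL, 0, F_TEMPLATE, f_expr, static_src);
    if (n < 0)
        return NULL;

    char *src = malloc((size_t)n + 1);
    if (src == NULL)
        return NULL;

    /* Second pass: write the text into the buffer. */
    snprintf(src, (size_t)n + 1, F_TEMPLATE, f_expr, static_src);
    return src;
}
```

The returned string would then be passed to clCreateProgramWithSource and built with clBuildProgram. Since f_expr is arbitrary text, it is worth checking the build log (clGetProgramBuildInfo with CL_PROGRAM_BUILD_LOG) when the build fails.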

Another option is to split your program into a static part and a dynamically generated part, compile them separately at runtime with clCompileProgram (the static part just once), and then link both with clLinkProgram (both calls are available since OpenCL 1.2). This could be somewhat faster.

Maybe this kind of problem is easier to solve not using OpenCL but using SYCL

It might actually be harder to solve with SYCL; I'm not sure whether SYCL supports dynamic (runtime) compilation at all.

mogu
  • It is incorrect that SYCL would make this harder. While SYCL (note SYCL, not SyCL -- the CL does not come from OpenCL) does not have runtime compilation unless using OpenCL interoperability, it is a C++ single-source programming model, so you can have templated kernels. For example, you can instantiate your kernels with lambda functions, which might be what is needed here. So I would encourage using SYCL for use cases just like this. – illuhad Aug 04 '20 at 12:05
  • @illuhad Thanks for the info. Templates are nice, but they are not a replacement for runtime compilation - they are instantiated at compile time. So indeed, as you say, it depends on what exactly is needed here. – mogu Aug 05 '20 at 09:22
  • Sure, they cover different use cases to some extent. But OP was not asking for runtime compilation per se, but more generally for a way to "pass a function as argument", which to me sounds like something that might be feasible at compile time. IMO, in the vast majority of cases, customizing algorithms using templates and lambda functions is the preferred way to pass a function as an argument compared to runtime compilation, because of simplicity, type safety and robustness. – illuhad Aug 05 '20 at 16:40
  • I edited my question to use SYCL and not SyCL :). If at least the type of the function is known (i.e. polynomial or whatever), then I see how it makes sense to do it with compile-time mechanisms. But if I don't know which shape the function will have, it seems that there is no alternative to runtime compilation, is there? I will nevertheless look into SYCL (hipSYCL? ;) because it seems to be comfortable. – Malte Aug 05 '20 at 20:15
  • @Malte I'm not sure what you mean by the "shape" of the function. You need a fixed signature for your functions. With lambda functions you could also capture additional parameters for your function. At the moment it's not clear to me why this couldn't work for arbitrary polynomials. Maybe you can describe in more detail what you need? If you are set on runtime compilation with OpenCL, you might want to look into boost.compute. AFAIK they have some convenience wrappers to inject code at runtime. Of course I would also be happy to help you with SYCL (hipSYCL or not ;) ) questions. – illuhad Aug 05 '20 at 21:29
  • Yes, that is what I meant: the fixed signature requirement. What I thought is: what if I don't know in advance whether the function is going to fit one of my signatures? Maybe approximate it? I don't know. I'll look into it again when it really is a problem. – Malte Aug 06 '20 at 07:04

You may create an OpenCL library of functions 'f' by using clCreateProgramWithSource + clCompileProgram and then calling clLinkProgram with the "-create-library" option.

With this approach, you pass your kernel an additional integer parameter f_idx that encodes which instance of 'f' to call, and in the kernel body you replace the direct call to 'f' with f_dispatch(f_idx, f_params). Here f_dispatch is a function defined next to the kernel that works like a lookup table: it maps the value of f_idx to a call of the particular 'f(f_params)' that f_idx encodes.

That's the classic C approach, and since OpenCL C is a dialect of C99 that does not allow function pointers, it seems a reasonable way to handle your task.
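A minimal sketch of such a dispatch function, assuming every 'f' maps one float to one float. The three integrands, the enum names and the self-test helper are hypothetical and only for illustration. Apart from the #include line and the host-side self-test, the code should also be valid OpenCL C.

```c
#include <assert.h> /* host-side check only; not available in OpenCL C */

/* Hypothetical indices; the host passes one of them as the f_idx argument. */
enum f_id { F_SQUARE = 0, F_CUBE = 1, F_RECIP = 2 };

/* Hypothetical integrands sharing the signature float -> float. */
static float f_square(float x) { return x * x; }
static float f_cube(float x)   { return x * x * x; }
static float f_recip(float x)  { return 1.0f / (1.0f + x); }

/* Maps f_idx to a call of the matching integrand. */
float f_dispatch(int f_idx, float x)
{
    switch (f_idx) {
    case F_SQUARE: return f_square(x);
    case F_CUBE:   return f_cube(x);
    case F_RECIP:  return f_recip(x);
    default:       return 0.0f; /* unknown index: contribute nothing */
    }
}

/* Host-side sanity check of the table; not part of the kernel code. */
void f_dispatch_selftest(void)
{
    assert(f_dispatch(F_SQUARE, 3.0f) == 9.0f);
    assert(f_dispatch(F_RECIP, 1.0f) == 0.5f);
}
```

In the kernel from the question, the call f(...) would become f_dispatch(f_idx, ...), with f_idx added as an extra int kernel argument. Since every work-item receives the same f_idx, all of them take the same branch of the switch.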

Another, more involved way is to generate one kernel per 'f' function and move the dispatch logic to the host side, where you choose which kernel to enqueue in order to call a particular 'f'.
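The host-side selection can be sketched as a simple name table, assuming one kernel per 'f' has been built into the same program. The kernel names and the helper kernel_name_for are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel names, one per integrand, in f_idx order. */
static const char *const KERNEL_NAMES[] = {
    "reduce_square",
    "reduce_cube",
    "reduce_recip",
};

#define KERNEL_COUNT (sizeof KERNEL_NAMES / sizeof KERNEL_NAMES[0])

/* Returns the name of the kernel that evaluates integrand f_idx. */
const char *kernel_name_for(size_t f_idx)
{
    assert(f_idx < KERNEL_COUNT); /* an unknown index is a host-side bug */
    return KERNEL_NAMES[f_idx];
}
```

The returned name would be passed to clCreateKernel to obtain the kernel object to enqueue, so the device code needs no branching on the choice of 'f' at all.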

Mykyta Kozlov