
I was learning how to accelerate Python computations on the GPU from this notebook, and one line confuses me:

mandel_kernel[griddim, blockdim](-2.0, 1.0, -1.0, 1.0, d_image, 20)

Here, mandel_kernel is a decorated (by cuda.jit) function, griddim and blockdim are tuples of length 2: griddim=(32,16), blockdim=(32,8).

Is this square brackets in between function name and argument list part of python syntax, or something specific to the cuda.jit decoration?

asked by Jason

2 Answers


This is valid Python syntax; I'll try to break it down for you:

mandel_kernel acts like a dict, whose keys are 2-tuples (griddim, blockdim) and whose values are methods (which is valid, since methods are objects in Python)

mandel_kernel[griddim, blockdim] therefore 'returns' (or evaluates to) a method

mandel_kernel[griddim, blockdim](-2.0, 1.0, -1.0, 1.0, d_image, 20) therefore calls that method with whatever arguments are inside the parentheses.

This one line could be rewritten in three lines like so:

key = (griddim, blockdim)
method = mandel_kernel[key]
method(-2.0, 1.0, -1.0, 1.0, d_image, 20)
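To see that this is ordinary Python and nothing CUDA-specific, here is a minimal sketch (the dict, function, and values are invented for illustration, not taken from the notebook) showing a plain dict keyed by a pair of tuples:

```python
# A plain function stored as a dict value; functions are
# first-class objects in Python.
def fast_path(x):
    return x * 2

# Keys are 2-tuples of tuples, just like (griddim, blockdim).
table = {((32, 16), (32, 8)): fast_path}

griddim = (32, 16)
blockdim = (32, 8)

# d[a, b] is the same as d[(a, b)]: Python packs a comma-separated
# subscript into a single tuple before calling __getitem__.
method = table[griddim, blockdim]
result = method(21)  # calls fast_path(21), giving 42
```

The key point is that `table[griddim, blockdim]` and `table[(griddim, blockdim)]` are identical; the square brackets always pass one object (here, a tuple) to `__getitem__`.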
– iCart
  • Thanks for the reply. So it is the decorator that turns the `mandel_kernel` function into a dict, isn't it? – Jason Sep 14 '16 at 11:33
  • No, it's just a normal `dict` with `function` values – Craig Burgler Sep 14 '16 at 11:35
  • Yes and no, the docs for [@cuda.jit](http://numba.pydata.org/numba-doc/dev/cuda-reference/kernel.html?highlight=cuda.jit#numba.cuda.jit) state that it returns an `AutoJitCUDAKernel` object which has a [__getitem__](https://github.com/numba/numba/blob/a3421beba7b3abc23357f78a5c416cb4ab3a49d4/numba/cuda/compiler.py#L292) method, and therefore acts as a `dict` – iCart Sep 14 '16 at 12:37

As iCart stated, @cuda.jit implements the `__getitem__` method, which changes the behavior of the brackets: instead of acting like a dict lookup, the brackets are used to pass the launch configuration, in a manner spiritually similar to the CUDA C/C++ runtime API syntax:

cudaKernel<<<numBlocks, threadsPerBlock>>>(parameter1, parameter2, ..., parameterN);

It's cute, but it's anything but clear at first glance, IMO. It's doubly confusing when the tuple is computed on the fly by a lambda, as it is in the Triton library examples. I gather they got this idea from Numba.
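The mechanism can be sketched in a few lines of plain Python. This is a hypothetical stand-in (the class and names are invented, not Numba's actual implementation): a wrapper whose `__getitem__` captures the launch configuration and returns a configured callable, which is what makes `kernel[griddim, blockdim](args...)` read like the CUDA `<<<...>>>` syntax:

```python
class FakeKernel:
    """Hypothetical sketch of the kernel[config](args) pattern."""

    def __init__(self, func):
        self.func = func

    def __getitem__(self, config):
        # The subscript kernel[a, b] arrives here as the single
        # tuple (a, b); unpack it into the launch configuration.
        griddim, blockdim = config

        def launch(*args):
            # A real kernel would launch griddim x blockdim threads;
            # this sketch just forwards everything to the function.
            return self.func(griddim, blockdim, *args)

        return launch

def body(griddim, blockdim, x):
    return (griddim, blockdim, x)

kernel = FakeKernel(body)
print(kernel[(32, 16), (32, 8)](99))  # ((32, 16), (32, 8), 99)
```

So `kernel[(32, 16), (32, 8)]` evaluates first, yielding `launch`, and the trailing `(99)` then calls it; the two bracket pairs are ordinary subscription followed by an ordinary call.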