
I have a function to which I added the @cuda.jit decorator.

@cuda.jit
def foo(x):
    bar(x[0])
    bar(x[1])
    bar(x[2])

def bar(x):
    # Some routine
    ...

I would rather not copy the body of bar into foo, as that makes the code clunky and ugly.

How does Numba's cuda.jit handle this? Is the function inlined during compilation? Does bar need to be jitted?

If so, is it going to launch other threads? That seems like overkill for a computation over only 3 elements...

I also believe a CUDA kernel cannot call other CUDA kernels.

I am new to numba/cuda, so pardon me if there's some fundamental misunderstanding here.

Roulbacha

1 Answer


How does Numba's cuda.jit handle this?

It doesn't. You would get an error if you tried.

Is the function inline during compilation?

No.

Does bar need to be jitted?

Yes. It needs to be decorated with @cuda.jit(device=True).

If so, is it going to launch other threads? That seems like overkill for a computation over only 3 elements...

No. A device function and a kernel are not the same thing. A device function is compiled and emitted as single-threaded code; calling it does not launch any new threads.

I also believe a CUDA kernel cannot call other CUDA kernels.

It can (CUDA calls this dynamic parallelism), but Numba doesn't presently support it.

talonmies

  • Thank you for the helpful answer. I assume CUDA device functions are the way to go if I want to factor routines inside a kernel into functions that I call multiple times? – Roulbacha May 06 '19 at 16:07
  • 1
    That is what they are designed for – talonmies May 06 '19 at 17:36
  • Does this answer also apply to normal `@jit` or `@njit` functions? As in, they do not handle calls to other functions in their bodies, right? I assume this is the case, but it would be good if someone can confirm. – Jethro Cao Mar 24 '20 at 20:11
  • @JethroCao You can call jit'd functions from each other, but you can't call plain Python functions from njit'd functions. – tjaqu787 Nov 28 '22 at 02:48