I have a function to which I added the @cuda.jit decorator.
from numba import cuda

@cuda.jit
def foo(x):
    bar(x[0])
    bar(x[1])
    bar(x[2])

def bar(x):
    # Some routine
    pass
I would rather not copy the body of bar into foo, as that would make the code clunky and ugly.
How does Numba's cuda.jit handle this? Is the function inlined during compilation? Does bar need to be jitted as well?
If so, will calling it launch additional threads? That seems like overkill for a computation over only 3 elements.
I also believe that a CUDA kernel cannot call another CUDA kernel.
I am new to Numba/CUDA, so pardon me if there is a fundamental misunderstanding here.