Can NPP functions, more concretely the npps functions (https://docs.nvidia.com/cuda/npp/group__npps.html), be called from device code?

If I write a __global__ function, can I call npps functions such as nppsMaxIndx_32f inside it (to find the maximum of a vector)?

Example: I have 100 vectors of 10000 floats each. If I do this in host code, I have to make 100 calls to the NPP function.

If I write a __global__ function launched with 100 threads and call the NPP function inside it, once per vector, so that all the calls run simultaneously, will this work? Can nppsMaxIndx_32f be called as a device function?

Martijn Pieters
ocerv
  • No, NPP functions cannot be used in device code. Neither can most other libraries (except Thrust) provided with the CUDA toolkit. – Robert Crovella Oct 25 '18 at 13:42
  • "but this can only be done if you do not need previous data for the computation." That may possibly be an incorrect or misleading statement. If you issue 2 npp calls in sequence, and the first call makes modification to data on the GPU, the second call should pick up those modifications. – Robert Crovella Oct 27 '18 at 01:05
  • Sure, that's what I was trying to say: that each call uses independent data. Thanks @RobertCrovella – ocerv Oct 28 '18 at 10:21
  • Each call doesn't have to use independent data. The results of the first call can be used by the 2nd call. – Robert Crovella Oct 28 '18 at 14:15

1 Answer

This is not possible: NPP functions are host-only functions. Attempting to call one from a kernel produces compile errors such as:

functions.cu(237): error: calling a __host__ function("nppsMaxIndx_32f") from a __global__ function("computeMax") is not allowed

functions.cu(237): error: identifier "nppsMaxIndx_32f" is undefined in device code

However, if you issue the calls from host code without synchronizing the GPU in between, the host does not wait for the previous call to finish, so the calls are issued almost simultaneously. This can only be done safely if there is no requirement on the ordering of the calls and the data for overlapping calls is fully independent.
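The host-side approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: maxPerVector is a hypothetical helper name, the vectors are assumed to be stored back to back in one device allocation, and the nppsMaxIndxGetBufferSize_32f / nppsMaxIndx_32f signatures are the int-based ones from the NPP documentation of that period (newer toolkits may use size_t or _Ctx variants instead). Each call gets its own scratch slice and its own output slot, so no two calls share data.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>
#include <npps.h>

// Hypothetical helper (not from the original post): for numVectors vectors of
// `length` floats stored contiguously in device memory, find each vector's
// maximum value and its index. Every NPP call is made from host code.
bool maxPerVector(const Npp32f* dData, int numVectors, int length,
                  std::vector<Npp32f>& hMax, std::vector<int>& hIdx)
{
    assert(dData != nullptr && numVectors > 0 && length > 0);

    // Assumption: int-based signature; adjust the type for newer toolkits.
    int scratchBytes = 0;
    if (nppsMaxIndxGetBufferSize_32f(length, &scratchBytes) != NPP_SUCCESS)
        return false;
    // Round up so every per-call scratch slice starts on a 256-byte boundary.
    const size_t stride = (static_cast<size_t>(scratchBytes) / 256 + 1) * 256;

    Npp8u*  dScratch = nullptr;
    Npp32f* dMax = nullptr;
    int*    dIdx = nullptr;
    bool ok =
        cudaMalloc((void**)&dScratch, stride * numVectors) == cudaSuccess &&
        cudaMalloc((void**)&dMax, sizeof(Npp32f) * numVectors) == cudaSuccess &&
        cudaMalloc((void**)&dIdx, sizeof(int) * numVectors) == cudaSuccess;

    // One host call per vector, with no explicit synchronization in between.
    for (int v = 0; ok && v < numVectors; ++v) {
        ok = nppsMaxIndx_32f(dData + static_cast<size_t>(v) * length, length,
                             dMax + v, dIdx + v,
                             dScratch + stride * v) == NPP_SUCCESS;
    }

    if (ok) {
        hMax.resize(numVectors);
        hIdx.resize(numVectors);
        // A blocking cudaMemcpy waits for the preceding GPU work to finish.
        ok = cudaMemcpy(hMax.data(), dMax, sizeof(Npp32f) * numVectors,
                        cudaMemcpyDeviceToHost) == cudaSuccess &&
             cudaMemcpy(hIdx.data(), dIdx, sizeof(int) * numVectors,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(dScratch);
    cudaFree(dMax);
    cudaFree(dIdx);
    return ok;
}
```

For the case in the question this would be called as maxPerVector(dData, 100, 10000, hMax, hIdx), where dData points to the 1,000,000 floats on the device. The file would need to be linked against the NPP signal libraries (for example -lnpps -lnppc, depending on the toolkit version).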

talonmies
ocerv