9

I use cuFFT library calls in host code and they work fine, but I want to call the cuFFT library from a kernel. Earlier versions of CUDA didn't support this, but is it possible now with dynamic parallelism?

It would be great if there were any examples of how to achieve this.

Robert Crovella
Sagar Masuti
  • possible duplicate of [Is there a method of FFT that will run inside CUDA Kernel?](http://stackoverflow.com/questions/11587160/is-there-a-method-of-fft-that-will-run-inside-cuda-kernel) – Pavan Yalamanchili Jun 24 '13 at 15:47
  • @PavanYalamanchili: I asked because dynamic parallelism is now supported in CUDA 5 (the link you're referring to is a year old). Anyway, thanks for the info. I registered with the NVIDIA developer zone today, and a new version, CUDA 5.5, is coming out. I went through the release notes and couldn't find anything related to device-callable library functions. :( – Sagar Masuti Jun 25 '13 at 03:34

3 Answers

6

Despite the introduction of dynamic parallelism on Kepler (cc 3.5) cards, cuFFT remains a host API and there is currently no way of creating or executing FFT operations in device code using cuFFT.

Robert Crovella
talonmies
1

I already answered this in the duplicate thread: [Is there a method of FFT that will run inside CUDA Kernel?](http://stackoverflow.com/questions/11587160/is-there-a-method-of-fft-that-will-run-inside-cuda-kernel). In short, since CUDA 11.0 there is cuFFTDx (Device Extensions), which allows you to do exactly that.

Link to my answer there: https://stackoverflow.com/a/72403181/6924585.

M. Steiner
0

There is no way to call the cuFFT APIs from a GPU kernel; you must call them from the host. If you want to run an FFT without going DEVICE -> HOST -> DEVICE before continuing your computation, the only solution is to write a kernel that performs the FFT in a device function. I'm actually doing this myself because I need to run several FFTs in parallel without passing the data back to the host. If you find or have another solution, let me know. There are many examples on the web of how to implement an FFT, for instance: https://hackage.haskell.org/package/pure-fft-0.2.0/docs/Numeric-FFT.html
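To make the "FFT in a device function" idea concrete, here is a minimal sketch, not the code I actually use. It assumes power-of-two lengths and single precision, and every name in it (`Cplx`, `cmul`, `fft_radix2`, `batched_fft`, the `HD` macro) is hypothetical and chosen only for illustration. It is a plain iterative radix-2 Cooley-Tukey transform written so that it compiles as ordinary C++ and, under nvcc, as a `__host__ __device__` function.

```cpp
#include <math.h>
#include <stddef.h>

// Under nvcc, make the functions callable from host and device code;
// with a plain C++ compiler the macro expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Minimal complex type (hypothetical), used so the same code works
// on the host and inside a kernel.
struct Cplx {
    float re;
    float im;
};

HD inline Cplx cmul(Cplx a, Cplx b) {
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// In-place forward FFT (iterative radix-2 Cooley-Tukey).
// Assumption: n is a power of two. One thread does all the work.
HD inline void fft_radix2(Cplx* x, unsigned n) {
    // Step 1: reorder the input by bit-reversed index.
    for (unsigned i = 1, j = 0; i < n; ++i) {
        unsigned bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) { Cplx t = x[i]; x[i] = x[j]; x[j] = t; }
    }
    // Step 2: log2(n) butterfly stages of growing length.
    const float pi = 3.14159265358979323846f;
    for (unsigned len = 2; len <= n; len <<= 1) {
        float ang = -2.0f * pi / (float)len;
        Cplx wlen = {cosf(ang), sinf(ang)};
        for (unsigned i = 0; i < n; i += len) {
            Cplx w = {1.0f, 0.0f};
            for (unsigned k = 0; k < len / 2; ++k) {
                Cplx u = x[i + k];
                Cplx v = cmul(x[i + k + len / 2], w);
                x[i + k] = {u.re + v.re, u.im + v.im};
                x[i + k + len / 2] = {u.re - v.re, u.im - v.im};
                w = cmul(w, wlen);
            }
        }
    }
}

#ifdef __CUDACC__
// Hypothetical batched kernel: each thread transforms its own
// length-n signal, so the data never leaves the device.
__global__ void batched_fft(Cplx* data, unsigned n, unsigned batch) {
    unsigned idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < batch) fft_radix2(data + (size_t)idx * n, n);
}
#endif
```

One thread per FFT keeps the sketch short, but it is only a starting point: a tuned implementation would normally spread each transform across the threads of a block and use shared memory.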

Leos313
  • I didn't find any other solution. The algorithm I was implementing had an alternative analytic solution, so we used that and avoided the problem of running an FFT from the device entirely. You could have left your answer as a comment on the question. – Sagar Masuti Oct 11 '15 at 03:29