Questions tagged [cufft]

cuFFT is an FFT library for CUDA-enabled GPUs. Its capabilities are similar to those of the FFTW library.

cuFFT provides functions for various kinds of forward and inverse Fast Fourier Transforms, including multidimensional and batched transforms.

146 questions
10
votes
2 answers

Why does cuFFT performance suffer with overlapping inputs?

I'm experimenting with using cuFFT's callback feature to perform input format conversion on the fly (for instance, calculating FFTs of 8-bit integer input data without first doing an explicit conversion of the input buffer to float). In many of my…
Jason R
  • 11,159
  • 6
  • 50
  • 81
9
votes
3 answers

Is it possible to call cufft library calls in device function?

I use cuFFT library calls in host code and they work fine, but I want to call the cuFFT library from a kernel. Earlier versions of CUDA didn't have this kind of support, but is this possible with dynamic parallelism? It would be great if…
Sagar Masuti
  • 1,271
  • 2
  • 11
  • 30
6
votes
3 answers

CUDA cufft 2D example

I am currently working on a program that has to implement a 2D FFT (for cross-correlation). I did a 1D FFT with CUDA, which gave me the correct results; I am now trying to implement a 2D version. With few examples and little documentation online, I find it…
EE_Guy
  • 113
  • 1
  • 1
  • 9
6
votes
2 answers

1D FFTs of columns and rows of a 3D matrix in CUDA

I'm trying to compute batch 1D FFTs using cufftPlanMany. The data set comes from a 3D field, stored in a 1D array, where I want to compute 1D FFTs in the x and y direction. The data is stored as shown in the figure below; continuous in x then y then…
Bart
  • 9,825
  • 5
  • 47
  • 73
5
votes
3 answers

Scaling in inverse FFT by cuFFT

Whenever I plot the values obtained by a programme using cuFFT and compare the results with those of Matlab, I get graphs of the same shape, with the maxima and minima at the same points. However, the values…
Ani
  • 45
  • 1
  • 6
4
votes
2 answers

Using "cuFFT Device Callbacks"

This is my first question, so I'll try to be as detailed as possible. I'm working on implementing a noise reduction algorithm in CUDA 6.5. My code is based on this Matlab implementation: http://pastebin.com/HLVq48C1. I'd love to use the new cuFFT Device…
Ghany
  • 43
  • 5
4
votes
2 answers

CUFFT error handling

I'm using the following macro for CUFFT error handling: #define cufftSafeCall(err) __cufftSafeCall(err, __FILE__, __LINE__) inline void __cufftSafeCall(cufftResult err, const char *file, const int line) { if( CUFFT_SUCCESS != err) { …
Vitality
  • 20,705
  • 4
  • 108
  • 146
3
votes
1 answer

Why does the input and output for cufft greatly differ from traditional fft?

From my understanding of FFT functions (e.g. from questions like this one), assuming a 1D FFT, given N points of real data, I'll get a double-sided FFT of length N (but complex) + 1 for a zeroth frequency. If I take that same FFT output and run an IFFT…
Krupip
  • 4,404
  • 2
  • 32
  • 54
3
votes
1 answer

Concurrency of cuFFT streams

So I am using cuFFT combined with the CUDA stream feature. The problem I have is that I can't seem to make the cuFFT kernels run in full concurrency. The following are the results I have from nvvp. Each of the streams is running a kernel of 2D batch…
Da Teng
  • 551
  • 4
  • 21
3
votes
1 answer

cuFFT R2C batch output size doesn't match input size

I'm experimenting with batching in cuFFT, but I don't think I'm getting the right output. int NX = 16; // size of the array int BATCH = 16; // # of batch I'm allocating two arrays on the GPU: float *src; cufftComplex…
widgg
  • 1,358
  • 2
  • 16
  • 35
3
votes
0 answers

FFTW / CUFFT over given axis of multidimensional array

Is there an efficient way to use FFTW / CUFFT (they have similar APIs) to perform an fft over a given axis of a multi-dimensional array? Let's say I have a 3D array of shape (2, 3, 4). The strides are (12, 4, 1), meaning that in order to move one…
John von N.
  • 195
  • 1
  • 1
  • 6
3
votes
1 answer

Asynchronous executions of CUDA memory copies and cuFFT

I have a CUDA program for calculating FFTs of, let's say, size 50000. Currently, I copy the whole array to the GPU and execute the cuFFT. Now, I am trying to optimize the program, and the NVIDIA Visual Profiler tells me to hide the memcopy by…
user3214281
  • 77
  • 1
  • 11
3
votes
1 answer

CUFFT with double precision

I am experiencing some problems with CUDA's FFT library. I declared the inputs as cuDoubleComplex, but the compiler returns the error that this type is incompatible with parameters of type cufftComplex. After some searching on the Internet, I found…
Pippo
  • 1,543
  • 3
  • 21
  • 40
3
votes
1 answer

CUFFT: How to calculate fft of pitched pointer?

I'm trying to calculate the fft of an image using CUFFT. It seems like CUFFT only offers fft of plain device pointers allocated with cudaMalloc. My input images are allocated using cudaMallocPitch but there is no option for handling pitch of the…
sgarizvi
  • 16,623
  • 9
  • 64
  • 98
2
votes
1 answer

Calculating performance of CUFFT

I am running CUFFT on chunks (N*N/p) divided among multiple GPUs, and I have a question regarding calculating the performance. First, a bit about how I am doing it: send N*N/p chunks to each GPU; run a batched 1-D FFT for each row in p GPUs; get N*N/p chunks…
Sayan
  • 2,662
  • 10
  • 41
  • 56