
I am an amateur working on parallelizing FFT execution across multiple files. I have, say, 1000 files, each containing real data of a different size: one file may have around 22000 values, another 15000, the next 19000, and so on.

Any idea how this can be achieved? If your answer involves batching, please explain how.

Developer by Blood

1 Answer


There are two standard solutions to your problem:

Streams: cuFFT supports CUDA streams via the cufftSetStream function. The pattern you would want to use is to associate each FFT with a separate stream. This may allow you to overlap processing of multiple FFTs. Furthermore, copies to and from the GPU can be overlapped with computation with minimal performance impact.
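The snippet below is a minimal sketch of that pattern (my illustration, not code from this answer): a single 1D real-to-complex transform bound to a stream with cufftSetStream, with its host-to-device and device-to-host copies queued on the same stream. It assumes pinned host buffers (allocated with cudaMallocHost) and preallocated device buffers, and omits error checking; the function name is made up.

```
#include <cufft.h>
#include <cuda_runtime.h>

/* Launch one R2C FFT of n real samples on 'stream'. h_in/h_out must be
 * pinned host memory, d_in/d_out device buffers of sufficient size.
 * Returns the plan so the caller can destroy it after syncing the stream. */
cufftHandle fft_one_file_async(const float *h_in, float *d_in,
                               cufftComplex *d_out, cufftComplex *h_out,
                               int n, cudaStream_t stream)
{
    cufftHandle plan;
    cufftPlan1d(&plan, n, CUFFT_R2C, 1);   // single 1D real-to-complex FFT
    cufftSetStream(plan, stream);          // execute the plan in 'stream'

    cudaMemcpyAsync(d_in, h_in, n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    cufftExecR2C(plan, d_in, d_out);       // queued behind the copy
    cudaMemcpyAsync(h_out, d_out, (n / 2 + 1) * sizeof(cufftComplex),
                    cudaMemcpyDeviceToHost, stream);
    return plan;
}
```

Calling this for different files on different streams lets the copies and transforms of one file overlap with those of another; the caller synchronizes each stream before reading its h_out and destroying its plan.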

Batched: As you mention, batching is another solution. If all your FFTs are of fairly similar size (as in your example), you should be able to pad the smaller ones with data that won't significantly alter the output, so as to make them all the same size, and then process them all with a single batched call.
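As a rough illustration of the batched route (again my sketch, not the answerer's code), the following zero-pads every signal to a common length nmax, packs them contiguously, and runs one batched plan via cufftPlanMany. Keep in mind that zero-padding changes the length and resolution of the output spectrum, so whether it counts as "not significantly altering the output" depends on what you do with the results. Error checking is omitted.

```
#include <cufft.h>
#include <cuda_runtime.h>
#include <string.h>

void fft_batched(float **h_in, const int *sizes, int nfiles, int nmax,
                 cufftComplex *h_out /* nfiles * (nmax/2 + 1) elements */)
{
    size_t in_elems  = (size_t)nfiles * nmax;
    size_t out_elems = (size_t)nfiles * (nmax / 2 + 1);

    /* Pack (and zero-pad) all signals into one pinned host buffer. */
    float *h_packed;
    cudaMallocHost(&h_packed, in_elems * sizeof(float));
    memset(h_packed, 0, in_elems * sizeof(float));
    for (int i = 0; i < nfiles; ++i)
        memcpy(h_packed + (size_t)i * nmax, h_in[i], sizes[i] * sizeof(float));

    float *d_in;
    cufftComplex *d_out;
    cudaMalloc(&d_in,  in_elems * sizeof(float));
    cudaMalloc(&d_out, out_elems * sizeof(cufftComplex));
    cudaMemcpy(d_in, h_packed, in_elems * sizeof(float),
               cudaMemcpyHostToDevice);

    /* One plan with batch = nfiles: cuFFT runs all transforms in one call. */
    cufftHandle plan;
    int n[1] = { nmax };
    cufftPlanMany(&plan, 1, n,
                  NULL, 1, nmax,          /* input: contiguous, nmax apart   */
                  NULL, 1, nmax / 2 + 1,  /* output: contiguous              */
                  CUFFT_R2C, nfiles);
    cufftExecR2C(plan, d_in, d_out);

    cudaMemcpy(h_out, d_out, out_elems * sizeof(cufftComplex),
               cudaMemcpyDeviceToHost);

    cufftDestroy(plan);
    cudaFree(d_in); cudaFree(d_out); cudaFreeHost(h_packed);
}
```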

I would have thought that in your case streams would be the better solution. This is because they allow you to transfer data to and from the device while performing computation, and because you won't suffer any inefficiency from doing additional work on null (padded) data.

Jez
  • NULL was what I thought of initially but it doesn't seem to be a good practice. Thanks for introducing me to cufftSetStream, will work it out and mark it as answer if it works out well. Thanks :) – Developer by Blood Aug 16 '14 at 12:13
  • it's doubtful that there would be any overlap of independent cufft transforms of size ~15000. However overlap of data copy with transforms should be possible as @JackOLantern demonstrates [here](http://stackoverflow.com/questions/25093958/asynhcronous-executions-of-cuda-memory-copies-and-cufft). And you may find other useful examples on the `cufft` tag. – Robert Crovella Aug 16 '14 at 18:56
  • @RobertCrovella Is it fine and practical/valid to copy all 1000 files, with that much data, onto the GPU at once, or should I be splitting the files into batches (don't confuse this with the cuFFT keyword)? – Developer by Blood Aug 17 '14 at 08:42
  • @Jez I studied streams (from Robert's link), but that seems a little inefficient: do I have to create 1000 streams for 1000 files? – Developer by Blood Aug 17 '14 at 11:26
  • You don't need 1000 streams. And you don't need all 1000 files on the GPU at once. You create a pipeline. You may need to study more about CUDA programming and CUDA programming with streams. – Robert Crovella Aug 18 '14 at 15:30
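For what it's worth, here is a rough sketch of the pipeline Robert Crovella describes in the last comment: a small fixed pool of streams (four here, an arbitrary choice) reused round-robin across all the files, so only a handful of files are resident on the GPU at any moment. It assumes pinned host buffers h_in[i]/h_out[i], an upper bound nmax on any file's length, and omits error checking; it is my illustration, not code from the thread.

```
#include <cufft.h>
#include <cuda_runtime.h>

#define NSTREAMS 4   /* arbitrary pool size; tune for your GPU */

void fft_pipeline(float **h_in, cufftComplex **h_out, const int *sizes,
                  int nfiles, int nmax)
{
    cudaStream_t streams[NSTREAMS];
    cufftHandle  plans[NSTREAMS];
    float        *d_in[NSTREAMS];
    cufftComplex *d_out[NSTREAMS];

    for (int s = 0; s < NSTREAMS; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&d_in[s],  nmax * sizeof(float));
        cudaMalloc(&d_out[s], (nmax / 2 + 1) * sizeof(cufftComplex));
    }

    for (int i = 0; i < nfiles; ++i) {
        int s = i % NSTREAMS;   /* round-robin slot selection */

        /* After the first round, wait for the previous file that used this
         * slot and release its plan before reusing the buffers. */
        if (i >= NSTREAMS) {
            cudaStreamSynchronize(streams[s]);
            cufftDestroy(plans[s]);
        }

        cufftPlan1d(&plans[s], sizes[i], CUFFT_R2C, 1);
        cufftSetStream(plans[s], streams[s]);

        cudaMemcpyAsync(d_in[s], h_in[i], sizes[i] * sizeof(float),
                        cudaMemcpyHostToDevice, streams[s]);
        cufftExecR2C(plans[s], d_in[s], d_out[s]);
        cudaMemcpyAsync(h_out[i], d_out[s],
                        (sizes[i] / 2 + 1) * sizeof(cufftComplex),
                        cudaMemcpyDeviceToHost, streams[s]);
    }

    /* Drain the pipeline and clean up. */
    for (int s = 0; s < NSTREAMS; ++s) {
        cudaStreamSynchronize(streams[s]);
        if (s < nfiles) cufftDestroy(plans[s]);
        cudaFree(d_in[s]);
        cudaFree(d_out[s]);
        cudaStreamDestroy(streams[s]);
    }
}
```

Note that creating a cuFFT plan allocates device memory and can itself limit overlap, so in practice you might create plans up front, or reuse a plan whenever consecutive files happen to have the same size.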