
I currently have C++ code which I am porting to CUDA. The C++ code uses `std::vector` for data storage. I am fairly new to CUDA and I understand that `std::vector` cannot be used directly in device code.

The number of elements to store depends on the result of some computation (basically a threshold check: samples greater than a threshold are stored). I understand that dynamic memory allocation using `malloc` in the kernel is very slow. So one option is to fix the maximum number of elements, allocate memory for them, and rewrite the code to use arrays in place of vectors. The disadvantages here are wasted memory, since I store anywhere between 0 and 100 elements, and of course that I'll have to do a lot of rewriting.
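For reference, the fixed-capacity option described above might look like the following sketch (plain C++ here for clarity; in the kernel each thread, i.e. each DM, would own one such buffer, and `MAX_HITS` and the names are hypothetical, chosen to match the 0–100 element range mentioned):

```cpp
#include <cstddef>

// Hypothetical per-DM capacity; the question mentions 0-100 stored elements.
constexpr std::size_t MAX_HITS = 100;

// Fixed-capacity replacement for std::vector: a plain array plus a count.
struct HitBuffer {
    float data[MAX_HITS];
    std::size_t count = 0;
};

// Store samples above the threshold, dropping any overflow past MAX_HITS.
void threshold_filter(const float* samples, std::size_t n,
                      float threshold, HitBuffer& out) {
    for (std::size_t i = 0; i < n; ++i) {
        if (samples[i] > threshold && out.count < MAX_HITS) {
            out.data[out.count++] = samples[i];
        }
    }
}
```

The cost of this pattern is the unused tail of `data` when few samples pass the threshold; the benefit is that no allocation happens inside the kernel at all.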

The Thrust library offers vectors on the device, but from what I have read (on this site), people seem to shy away from Thrust. Would it be a reasonable solution to include `thrust/device_vector.h` and `thrust/host_vector.h` and keep the vectors as they are? What are the disadvantages of using Thrust?

Some background info: this code is part of a pipeline whose previous stages execute on the GPU. The reason for porting this code to the GPU is to have the pipeline operate in real time (hopefully). Parallelization is done at a higher level: I will have this entire C++ code as one kernel, which will run with some 800 threads (each of which represents a dispersion measure, or DM). As of now, each DM is processed sequentially by calling the C++ code each time.

    Thrust vectors are a host-side abstraction. You can't use them in device code. They are not intended as a substitute for std::vector inside a kernel, if that is what you are thinking. – talonmies Feb 02 '13 at 10:26
  • So what exactly are thrust's device vectors then? – Rmj Feb 02 '13 at 10:44
  • 2
    Like I said, they are a host side abstraction for dealing with memory and algorithms which reside on the GPU. You don't use them in device code, you use them in host code. – talonmies Feb 02 '13 at 11:16
  • See it like this: if you are perfectly happy with the functionality provided by Thrust (sorting, reduction and what not) then you never even need to know (within reason) about the gritty GPU side of things. (Or the OpenMP side for that matter) They appear like your regular host-side containers. That is the abstraction they provide. But they are not to be used in a similar manner on the device side of the equation. – Bart Feb 02 '13 at 11:32
  • Yes, I'm sorry I didn't come across that question before posting this one. Thank you both for your help! – Rmj Feb 04 '13 at 11:21
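To illustrate the distinction drawn in the comments above, here is a minimal sketch of how `thrust::device_vector` is used: the vector objects and algorithm calls live in host code, while the data and the computation reside on the GPU. The threshold filter maps naturally onto `thrust::copy_if` (the predicate and names below are assumptions for illustration, not the poster's actual pipeline):

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>

// Predicate mirroring the question's threshold check.
struct above_threshold {
    float t;
    __host__ __device__ bool operator()(float x) const { return x > t; }
};

int main() {
    thrust::host_vector<float> h_samples(1024, 0.0f);
    // ... fill h_samples with real data ...

    thrust::device_vector<float> d_samples = h_samples;  // host-to-device copy
    thrust::device_vector<float> d_hits(d_samples.size());

    // copy_if executes on the GPU, but is invoked from host code;
    // the device_vector objects themselves never appear inside a kernel.
    auto end = thrust::copy_if(d_samples.begin(), d_samples.end(),
                               d_hits.begin(), above_threshold{3.0f});
    d_hits.resize(end - d_hits.begin());  // shrink to the actual hit count
    return 0;
}
```

Note this replaces the in-kernel vector entirely rather than embedding one: if the filtering can be expressed as a Thrust algorithm like this, no per-thread dynamic storage is needed at all.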

0 Answers