
I'm writing my first CUDA program and am running into a lot of issues, as my main programming language is not C++.

In my console app I have a vector of int that holds a constant list of numbers. My code should create new vectors and check them for matches against the original constant vector.

I don't know how to pass / copy a pointer to a vector's data onto the GPU device. I get this error message after trying to convert my code from C# to C++ and work with the kernel:

"Error calling a host function("std::vector<int, ::std::allocator<int>>::vector()") from a global function("MagicSeedCUDA::bigCUDAJob") is not allowed"

This is part of my code:

std::vector<int> selectedList;
FillA1(A1, "0152793281263155465283127699107744880041");
selectedList = A1;
bigCUDAJob<<<640, 640, 640>>>(i, j, selectedList);

__global__ void bigCUDAJob(int i, int j, std::vector<int> selectedList)
{
    std::vector<int> tempList;
    // here comes code that adds numbers to tempList
    // code to find matches between tempList and the
    // parameter selectedList
}

How can I modify my code so that I won't get compiler errors? I can work with an array of int as well.

Tal Angel
  • You can neither use `std::vector` on the GPU, nor just use normally allocated memory. If you want to use a C++ container, take a look at the Thrust library. They provide `device_vector` and `universal_vector` (and `host_vector`, but that one is less important). Even then you don't want to pass the whole vector, but just the raw CUDA pointer into your kernel (i.e. `thrust::raw_pointer_cast(my_vector.data())`). A more elegant solution is to wrap the raw pointer in the `span` implementation from [gsl-lite](https://github.com/gsl-lite/gsl-lite) which is usable in device code (kernels). – paleonix Feb 18 '22 at 11:25
  • For `tempList` even Thrust is no solution (See [here](https://stackoverflow.com/a/38529755/10107454)). The Thrust vectors provide you with device memory, but its member functions are still only usable on the host. – paleonix Feb 18 '22 at 11:35

1 Answer


I don't know how to pass / copy pointers of a vector into the GPU device

First, remind yourself of how to pass memory that's not in an std::vector to a CUDA kernel. (Re)read the vectorAdd example program, part of NVIDIA's CUDA samples.

cudaError_t status;
std::vector<int> selectedList;

// ... etc. ...

int *selectedListOnDevice = NULL;
std::size_t selectedListSizeInBytes = sizeof(int) * selectedList.size();
status = cudaMalloc((void **)&selectedListOnDevice, selectedListSizeInBytes);
if (status != cudaSuccess) { /* handle error */ }
status = cudaMemcpy(selectedListOnDevice, selectedList.data(), selectedListSizeInBytes, cudaMemcpyHostToDevice);
if (status != cudaSuccess) { /* handle error */ }

// ... etc. ...

// eventually:
cudaFree(selectedListOnDevice);

That's using the official CUDA runtime API. If, however, you use my CUDA API wrappers (which you absolutely don't have to), the above becomes:

auto selectedListOnDevice = cuda::memory::make_unique<int[]>(selectedList.size());
cuda::memory::copy(selectedListOnDevice.get(), selectedList.data());

and you don't need to handle the errors yourself - on error, an exception will be thrown.

Another alternative is to use NVIDIA's thrust library, which offers an std::vector-like class called a "device vector". This allows you to write:

thrust::device_vector<int> selectedListOnDevice = selectedList;

and it should "just work".

I get this error message:

Error calling a host function("std::vector<int, ::std::allocator<int>>::vector()") from a global function("MagicSeedCUDA::bigCUDAJob") is not allowed

That issue is covered in Using std::vector in CUDA device code, as @paleonix mentioned. In a nutshell: you just cannot have std::vector appear in your __device__ or __global__ functions at all, no matter how you try to write it.

I'm writing my first CUDA program and am running into a lot of issues, as my main programming language is not C++.

Then, regardless of your specific issue with std::vector, you should take some time to study C++ programming. Alternatively, you could brush up on C, since you can write CUDA kernels that are C'ish rather than C++'ish; that said, C++ features are actually quite useful when writing kernels, not just on the host side.

einpoklum