tl;dr: You can't do this, and it wouldn't help you if you could.
The code of most standard-library containers is CPU-specific, and none of them have their non-CPU-specific parts marked __host__ __device__ and compiled to be usable in kernels (and that is also the case for the <algorithm> code). So, technically, no. (Caveat: things get a bit more complicated in C++20, with its ubiquitous constexpr'ing.)
Also, most of these containers are not designed with parallel or concurrent execution in mind: adding elements to, or removing them from, an std::vector
or an std::map
from two non-serialized CPU or GPU threads will most likely corrupt data, and possibly worse. So you don't want to do that even on the CPU.
Another point to remember is memory allocation, which works differently on a GPU than on a CPU; and mostly, you want to avoid dynamic memory allocation within GPU kernels altogether.
But, you asked, what about using the raw data of a map-of-vectors rather than its code?
Well, if your map-of-vectors data structure lives in main system memory, you will not get any speedup by using a GPU to search it. More generally, it is unlikely that a discrete GPU will speed up searches of main-memory structures: on common hardware platforms, the CPU has higher bandwidth and lower latency to main memory than the GPU does, and search consists mostly of sporadic, non-consecutive memory accesses, so your hopes will be frustrated.