Recently I have been using Thrust a lot. I have noticed that, in order to use Thrust, one must always copy the data from CPU memory to GPU memory.
Consider the following example:
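#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

void foo(int *foo)
{
    thrust::host_vector<int> m(foo, foo + 100000);  // first copy: raw array -> host_vector
    thrust::device_vector<int> s = m;               // second copy: host_vector -> device memory
}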
I'm not quite sure how the host_vector constructor works, but it seems like I'm copying the initial data, coming from *foo, twice: once into the host_vector when it is initialized, and again when the device_vector is initialized. Is there a better way of copying from CPU to GPU without making intermediate copies of the data? I know I can use device_ptr as a wrapper, but that still doesn't fix my problem.
Thanks!