I have a Thrust code which loads a big array of data (2.4G) into memory, perform calculations which results are stored in host (~1.5G), then frees the initital data, load the results into device, perform other calculations on it, and finally reloads the initial data. The thrust code looks like this:
thrust::host_device<float> hostData;
// here is a code which loads ~2.4G of data into hostData
thrust::device_vector<float> deviceData = hostData;
thrust::host_vector<float> hostResult;
// here is a code which perform calculations on deviceData and copies the result to hostResult (~1.5G)
free<thrust::device_vector<float> >(deviceData);
thrust::device_vector<float> deviceResult = hostResult;
// here is code which performs calculations on deviceResult and store some results also on the device
free<thrust::device_vector<float> >(deviceResult);
deviceData = hostData;
With my defined function free:
template<class T> void free(T &V) {
V.clear();
V.shrink_to_fit();
size_t mem_tot;
size_t mem_free;
cudaMemGetInfo(&mem_free, &mem_tot);
std::cout << "Free memory : " << mem_free << std::endl;
}
template void free<thrust::device_vector<int> >(thrust::device_vector<int>& V);
template void free<thrust::device_vector<float> >(
thrust::device_vector<float>& V);
However, I get a "thrust::system::detail::bad_alloc' what(): std::bad_alloc: out of memory" error when trying to copy hostData back to deviceData even though cudaMemGetInfo returns that at this point I have ~6G of free memory of my device. Here is the complete output from the free method:
Free memory : 6295650304
Free memory : 6063775744
terminate called after throwing an instance of 'thrust::system::detail::bad_alloc'
what(): std::bad_alloc: out of memory
It seems to indicate that the device is out of memory although there is plenty free. Is it the right way to free memory for Thrust vectors? I should also note that the code works well for a smaller size of data (up to 1.5G)