I tried the following code with CUDA 7.0.
If I set n_repeat to 1 and remove the last cudaDeviceReset, the code runs fine.
If I set n_repeat to 1 and keep the cudaDeviceReset, the code segment runs to the end, but my memory leak detector reports a leak after the program finishes.
If I set n_repeat to 2 and keep the cudaDeviceReset, I get an error the second time I reach cublasCreate. The error code is CUBLAS_STATUS_NOT_INITIALIZED.
Can someone tell me what the problem is here? Is cudaDeviceReset meant for cleaning up between separate runs on the GPU, as I'm trying to do here?
int device_id_ = 0;
cublasHandle_t blas_;
curandGenerator_t rand_gen_;
long alloc_size = 1000;
char* raw_;
int n_repeat = 2;
for (int i = 0; i < n_repeat; ++i) {
    CHECK_CUDA(cudaSetDevice(device_id_));
    CHECK_CUDA(cublasCreate(&blas_));
    CHECK_CUDA(curandCreateGenerator(&rand_gen_, CURAND_RNG_PSEUDO_DEFAULT));
    CHECK_CUDA(cudaMalloc((void **)&raw_, alloc_size));
    CHECK_CUDA(curandDestroyGenerator(rand_gen_));
    CHECK_CUDA(cublasDestroy(blas_));
    CHECK_CUDA(cudaFree(raw_));
    CHECK_CUDA(cudaDeviceReset());
}
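For anyone who wants to reproduce this, here is a self-contained sketch of the same loop. The snippet above does not show how CHECK_CUDA is defined, so this version assumes three hypothetical macros (CHECK_RT, CHECK_BLAS, CHECK_RAND), one per library, because the runtime, cuBLAS, and cuRAND each return their own status type. The macro names and the error messages are illustrative only. The sequence of library calls matches the loop above, except that cudaFree is moved ahead of the two destroy calls.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <curand.h>

// Hypothetical helpers (not from the original snippet): one check per
// library, since each library reports errors through its own status type.
#define CHECK_RT(call)                                                   \
    do {                                                                 \
        cudaError_t e_ = (call);                                         \
        if (e_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "%s:%d CUDA runtime error: %s\n",       \
                         __FILE__, __LINE__, cudaGetErrorString(e_));    \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

#define CHECK_BLAS(call)                                                 \
    do {                                                                 \
        cublasStatus_t s_ = (call);                                      \
        if (s_ != CUBLAS_STATUS_SUCCESS) {                               \
            std::fprintf(stderr, "%s:%d cuBLAS error: status %d\n",      \
                         __FILE__, __LINE__, (int)s_);                   \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

#define CHECK_RAND(call)                                                 \
    do {                                                                 \
        curandStatus_t s_ = (call);                                      \
        if (s_ != CURAND_STATUS_SUCCESS) {                               \
            std::fprintf(stderr, "%s:%d cuRAND error: status %d\n",      \
                         __FILE__, __LINE__, (int)s_);                   \
            std::exit(EXIT_FAILURE);                                     \
        }                                                                \
    } while (0)

int main() {
    const int device_id = 0;
    const size_t alloc_size = 1000;
    const int n_repeat = 2;  // set to 1 to compare against the single-run case

    for (int i = 0; i < n_repeat; ++i) {
        cublasHandle_t blas = NULL;
        curandGenerator_t rand_gen = NULL;
        char* raw = NULL;

        // Create the library handles and allocate device memory.
        CHECK_RT(cudaSetDevice(device_id));
        CHECK_BLAS(cublasCreate(&blas));
        CHECK_RAND(curandCreateGenerator(&rand_gen, CURAND_RNG_PSEUDO_DEFAULT));
        CHECK_RT(cudaMalloc((void**)&raw, alloc_size));

        // Release everything before resetting the device.
        CHECK_RT(cudaFree(raw));
        CHECK_RAND(curandDestroyGenerator(rand_gen));
        CHECK_BLAS(cublasDestroy(blas));
        CHECK_RT(cudaDeviceReset());

        std::printf("iteration %d completed\n", i);
    }
    return 0;
}
```

Assuming the file is saved as repro.cu (a name chosen here for illustration), it should build with something like `nvcc repro.cu -lcublas -lcurand`. Whether the second iteration fails will depend on the toolkit version in use.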