I tried the following code with CUDA 7.0.

If I set `n_repeat` to 1 and remove the final `cudaDeviceReset`, the code runs fine.

If I set `n_repeat` to 1 and keep the `cudaDeviceReset`, the code segment runs to the end, but my memory leak detector reports a leak after the program finishes.

If I set `n_repeat` to 2 and keep the `cudaDeviceReset`, the second call to `cublasCreate` fails with the error code `CUBLAS_STATUS_NOT_INITIALIZED`.

Can someone tell me what the problem is here? Is `cudaDeviceReset` meant for cleaning up between separate runs that use the GPU, as I'm trying to do here?

int device_id_ = 0;
cublasHandle_t blas_;
curandGenerator_t rand_gen_;
long alloc_size = 1000;
char* raw_;
int n_repeat = 2;

for (int i = 0; i < n_repeat; ++i) {
  CHECK_CUDA(cudaSetDevice(device_id_));
  CHECK_CUDA(cublasCreate(&blas_));
  CHECK_CUDA(curandCreateGenerator(&rand_gen_, CURAND_RNG_PSEUDO_DEFAULT));
  CHECK_CUDA(cudaMalloc((void **)&raw_, alloc_size));
  CHECK_CUDA(curandDestroyGenerator(rand_gen_));
  CHECK_CUDA(cublasDestroy(blas_));
  CHECK_CUDA(cudaFree(raw_));

  CHECK_CUDA(cudaDeviceReset());
}
shaoyl85
  • I'm not sure whether this is the case here. If you create an object (a buffer, for example) whose destructor frees its resources when it goes out of scope, calling `cudaDeviceReset()` inside the same scope can cause problems. See the comment on [this post](http://stackoverflow.com/q/11608350/2386951). – Farzad Jul 12 '15 at 05:58
  • What's the actual problem here? Presumably your question is actually about `cublasCreate` causing a segfault if called twice? The memory leak is probably irrelevant in this case. There are lots of perfectly safe pieces of code that generate false positives in checkers like valgrind. – talonmies Jul 12 '15 at 07:45
  • When I run your program on CUDA 7.0 or CUDA 7.5RC on Linux, every single API status return value is zero. What CUDA version are you using? My example is [here](http://pastebin.com/WBp67RVx). – Robert Crovella Jul 12 '15 at 14:49
  • @talonmies: is calling `cublasCreate` twice allowed? Or will it cause a problem? – shaoyl85 Jul 12 '15 at 23:25
  • @RobertCrovella: did you try the `n_repeat = 2` case? – shaoyl85 Jul 12 '15 at 23:26
  • Did you even look at the example? I posted the entire code I used plus the output. – Robert Crovella Jul 12 '15 at 23:43
  • @RobertCrovella: Sorry Robert, and thank you for the help. I must have messed up some other configuration. – shaoyl85 Jul 13 '15 at 00:23

1 Answer

I had the same problem, even with the example from Robert Crovella, on CUDA 7 with Ubuntu 14.04 and a K40c.

Adding `cudaDeviceSynchronize()` after `cudaSetDevice` and before `cublasCreate()` made it work for me.

AHiggins
charlie t