
In CUDA, we can detect errors by checking the return value of functions such as cudaMemcpy() and cudaMalloc(): they return a cudaError_t, which can be compared against cudaSuccess. Is there a way in JCuda to check for errors from functions such as cuMemcpyHtoD(), cuMemAlloc(), and cuLaunchKernel()?

krishna

1 Answer


First of all, the methods of JCuda (should) behave exactly like the corresponding CUDA functions: They return an error code in form of an int. These error codes are also defined in...

and are the same error codes as in the respective CUDA library.

All these classes additionally have a static method called stringFor(int) - for example, cudaError#stringFor(int) and CUresult#stringFor(int). These methods return a human-readable String representation of the error code.

So you could do manual error checks, for example, like this:

int error = someCudaFunction();
// 0 is the success code (cudaSuccess / CUDA_SUCCESS)
if (error != 0) {
    System.out.println("Error code "+error+": "+cudaError.stringFor(error));
}

which might print something like

Error code 10: cudaErrorInvalidDevice
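To avoid repeating that `if` block after every call, the manual check can be wrapped in a small helper. The following is only a minimal sketch: `check` and `ManualCheck` are hypothetical names, not part of JCuda, and the `nameOf` lambda is a stand-in for a lookup such as `CUresult::stringFor` so that the example runs without JCuda or a GPU.

```java
import java.util.function.IntFunction;

public class ManualCheck {

    /**
     * Hypothetical helper (not part of JCuda): throws if the status code
     * is not 0, which is the success code in both CUDA APIs.
     */
    public static void check(int status, String call, IntFunction<String> nameOf) {
        if (status != 0) {
            throw new IllegalStateException(
                call + " failed with error code " + status + ": " + nameOf.apply(status));
        }
    }

    public static void main(String[] args) {
        // Stand-in for a lookup such as CUresult::stringFor (assumption: used
        // here only so that the sketch is runnable without JCuda).
        IntFunction<String> nameOf =
            code -> code == 2 ? "CUDA_ERROR_OUT_OF_MEMORY" : "UNKNOWN(" + code + ")";

        // Simulated successful call: returns normally.
        check(0, "cuInit", nameOf);

        // Simulated failing call: the helper turns the status code into an exception.
        try {
            check(2, "cuMemAlloc", nameOf);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With JCuda on the classpath, a call would presumably look like `check(cuMemAlloc(ptr, size), "cuMemAlloc", CUresult::stringFor)`, where `ptr` and `size` are placeholders for your own variables.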

But...

...the error checks may be a hassle. You might have noticed in the CUDA samples that NVIDIA introduced some macros that simplify the error checks. And similarly, I added optional exception checks for JCuda: All the libraries offer a static method called setExceptionsEnabled(boolean). When calling

JCudaDriver.setExceptionsEnabled(true);

then all subsequent method calls for the Driver API will automatically check their return values and throw a CudaException if an error occurred.

(Note that this method exists separately for each library. For example, the call would be JCublas.setExceptionsEnabled(true) when using JCublas.)
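As a rough mental model of what such a switch does, here is a toy sketch in plain Java. It is not JCuda's source code: `FakeDriver`, `checkResult`, and `failingCall` are hypothetical names, and a plain `RuntimeException` stands in for `CudaException`. The idea is that every wrapped call passes its status code through one central check, which either returns it unchanged or throws, depending on the flag.

```java
public class ToggleSketch {

    /** Hypothetical stand-in for a JCuda library class; not the real implementation. */
    public static final class FakeDriver {
        private static boolean exceptionsEnabled = false;

        public static void setExceptionsEnabled(boolean enabled) {
            exceptionsEnabled = enabled;
        }

        // Every wrapped call funnels its status code through this method.
        private static int checkResult(int result) {
            if (exceptionsEnabled && result != 0) {
                throw new RuntimeException("error code " + result);
            }
            return result;
        }

        // Pretends to be a native call that always fails with status 1.
        public static int failingCall() {
            return checkResult(1);
        }
    }

    public static void main(String[] args) {
        // With exceptions disabled, the failure is only visible in the return value.
        int status = FakeDriver.failingCall();
        System.out.println("Exceptions off, returned status " + status);

        // With exceptions enabled, the same failure cannot be ignored silently.
        FakeDriver.setExceptionsEnabled(true);
        try {
            FakeDriver.failingCall();
        } catch (RuntimeException e) {
            System.out.println("Exceptions on, caught: " + e.getMessage());
        }
    }
}
```

The flag is static, which is why the setting applies to a whole library at once and has to be enabled separately for each one.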

The samples usually enable exception checks right at the beginning of the main method, and I'd recommend doing the same, at least during development. Once it is clear that the program does not contain any errors, one could disable the exceptions, but there's hardly a reason to do so: they give clear information about which error occurred, whereas otherwise the calls may fail silently.

Marco13