How can I check for a malloc() failure within a CUDA kernel?

Question

This is a fairly self-explanatory question. Some background info is appended.

How can I check for a malloc() failure within a CUDA kernel? I googled this and found nothing on what malloc() returns in a CUDA implementation.

In addition, I have no idea how to signal back to the host that there was an error within a CUDA kernel. How can I do this?

I thought one way would be to pass in an array of chars, one element for each kernel thread, and have the kernel write 0x01 to signal an error and 0x00 for no error. The host could then copy this memory back and check for any nonzero bytes.

But this seems like a waste of memory. Is there a better way? Something like cudaThrowError()? ... maybe? ...

Appended:

I am running into trouble with a CUDA error: `GPUassert: the launch timed out and was terminated main.cu`

If you google this, you will find info for Linux users (who have hybrid graphics solutions) - the fix is sometimes to run with optirun --no-xorg.

However in my case this isn't working.

If I run my program on a small enough dataset, I get no errors. For a larger, but not too large, dataset, I have to prevent timeout errors by passing the --no-xorg flag. For an even larger dataset, I get timeout errors regardless of the --no-xorg flag.

This hints to me that perhaps something else is going wrong?

Perhaps a malloc() failure within my kernel if I run out of memory?

I have checked my code and estimated memory usage - I don't think this is the problem, but I would like to check anyway.

how about posting a [MCVE] as [expected](http://stackoverflow.com/help/how-to-ask)? — m.s., Oct 25 '15 at 15:42
This is another question easily answered by reading some documentation. [Programming Guide, Section B18](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations): "The CUDA in-kernel malloc() function allocates at least size bytes from the device heap and returns a pointer to the allocated memory or NULL if insufficient memory exists to fulfill the request. " — talonmies, Oct 25 '15 at 16:16
@talonmies Good luck to most of us - how are we supposed to find anything in that awful, web based piece of garbage — FreelanceConsultant, Oct 25 '15 at 16:26
Sorry, I don't buy that as an excuse. It took me 60 seconds to find it from the time I got to the part in your question about not being able to find anything online. The programming guide is complete, has an index, and it is searchable. Failure to find is generally a direct consequence of failure to look. — talonmies, Oct 25 '15 at 16:30
@talonmies Last time I checked the search wasn't working and it wouldn't display properly anyway. Why they won't give us a PDF I will never know. — FreelanceConsultant, Oct 25 '15 at 16:35
@user3728501 But there is [a PDF](https://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf). It is linked directly from the top of the web version of the Programming Guide. This PDF is also in the `/doc/pdf` directory of your local CUDA installation. — njuffa, Oct 25 '15 at 17:21

score 3 · Accepted Answer · edited May 23 '17 at 12:23

How can I check for a malloc() failure within a CUDA kernel?

The behavior is the same as malloc on the host. If a malloc failure occurs, the returned pointer will be NULL.

So check for NULL after a malloc, and do something to address it:

#include <assert.h>

...
int *data;
data = (int *)malloc(dsize*sizeof(int));
assert(data != NULL);
...rest of your code...
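
For reference, here is a minimal, self-contained sketch of how the fragment above might look in a complete program. The kernel name `scratch_kernel`, the launch configuration, and the 64 MB heap size are illustrative assumptions, not taken from the question. It assumes the code is built without `-DNDEBUG`, since that would compile the assert away.

```cuda
#include <assert.h>
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread allocates a small scratch buffer
// from the device heap, uses it, and frees it again.
__global__ void scratch_kernel(int dsize)
{
    int *data = (int *)malloc(dsize * sizeof(int));
    assert(data != NULL);   // halts the kernel if the device heap is exhausted

    for (int i = 0; i < dsize; ++i)
        data[i] = i;

    free(data);             // memory from in-kernel malloc is released with in-kernel free
}

int main(void)
{
    // Optional: enlarge the device heap (8 MB by default). This has to be
    // done before the first launch of a kernel that calls malloc.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 64 * 1024 * 1024);

    scratch_kernel<<<4, 64>>>(16);

    cudaError_t err = cudaGetLastError();   // catches launch-configuration errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();      // catches errors raised while the kernel ran
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("kernel completed without error\n");
    return 0;
}
```

If the assert fires, the error should surface on the host at the `cudaDeviceSynchronize()` call rather than at the launch itself, because kernel launches are asynchronous.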

Notes:

It's legal to use assert in-kernel this way. If the assert is hit, your kernel will halt and return an error to the host, which you can observe with proper CUDA error checking or cuda-memcheck. This isn't the only possible way to handle a malloc failure; it's just a suggestion.
This may or may not be the problem with your actual code, but checking the return value is good practice either way.
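
As one alternative to assert, and as a cheaper variant of the per-thread char array proposed in the question, a single integer flag in global memory can record that some thread hit a failure. The following is only a sketch: the names `worker`, `error_flag`, and `ERR_MALLOC_FAILED` are hypothetical, and the early return assumes the kernel has no later `__syncthreads()` that the remaining threads of the block would wait on.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical error code; 0 means "no error".
#define ERR_MALLOC_FAILED 1

__global__ void worker(int dsize, int *error_flag)
{
    int *data = (int *)malloc(dsize * sizeof(int));
    if (data == NULL) {
        // Any failing thread sets the shared flag; the host reads it later.
        atomicExch(error_flag, ERR_MALLOC_FAILED);
        return;
    }

    for (int i = 0; i < dsize; ++i)
        data[i] = i;

    free(data);
}

int main(void)
{
    int *d_flag = NULL;
    int h_flag = 0;

    if (cudaMalloc(&d_flag, sizeof(int)) != cudaSuccess ||
        cudaMemset(d_flag, 0, sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "could not set up the error flag\n");
        return 1;
    }

    worker<<<4, 64>>>(16, d_flag);

    // Wait for the kernel to finish and pick up any runtime error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(&h_flag, d_flag, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_flag);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    if (h_flag != 0) {
        fprintf(stderr, "kernel reported error code %d\n", h_flag);
        return 1;
    }

    printf("kernel completed without error\n");
    return 0;
}
```

Unlike assert, this leaves the CUDA context usable after a failure, at the cost of four bytes of device memory and one extra copy back to the host.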

Thanks a lot, this is the sort of thing I was hoping for. — FreelanceConsultant, Oct 25 '15 at 15:47
