This is a fairly self-explanatory question. Some background info is appended.
How can I check for a malloc() failure within a CUDA kernel? I googled this and found nothing on what malloc() returns in a CUDA implementation.
In addition, I have no idea how to signal back to the host that there was an error within a CUDA kernel. How can I do this?
I thought one way would be to send an array of chars, one element for each kernel thread, and have the kernel place a 0x01 to signal an error and 0x00 for no error. Then the host could copy this memory back and check for any non-zero bytes.
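To make the idea concrete, here is a rough sketch of what I have in mind. It assumes that in-kernel malloc() returns NULL on failure, the way host malloc() does. The kernel name worker, the array name errFlags, and the sizes are all made up for illustration and are not from my real code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread attempts a device-side malloc() and
// records the outcome in its own slot (0x00 = ok, 0x01 = failed).
__global__ void worker(unsigned char *errFlags, size_t bytesPerThread, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    // Assumption: NULL signals that the device heap could not satisfy the request.
    void *buf = malloc(bytesPerThread);
    if (buf == NULL) {
        errFlags[tid] = 0x01;
        return;
    }
    errFlags[tid] = 0x00;
    // ... the real work on buf would go here ...
    free(buf);
}

int main()
{
    const int n = 256;                 // illustrative thread count
    unsigned char *dFlags = NULL;
    unsigned char hFlags[n];

    if (cudaMalloc(&dFlags, n) != cudaSuccess) return 2;
    cudaMemset(dFlags, 0, n);

    worker<<<(n + 127) / 128, 128>>>(dFlags, 64, n);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 2;
    }

    // Copy the per-thread flags back and count the non-zero bytes.
    cudaMemcpy(hFlags, dFlags, n, cudaMemcpyDeviceToHost);
    int failures = 0;
    for (int i = 0; i < n; ++i) failures += (hFlags[i] != 0);
    printf("%d of %d threads reported a malloc() failure\n", failures, n);

    cudaFree(dFlags);
    return failures != 0;
}
```

This costs one byte per thread plus a device-to-host copy after every launch.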
But this seems like a waste of memory. Is there a better way? Something like cudaThrowError()? ... maybe? ...
Appended:
I am running into trouble with a CUDA error: GPUassert: the launch timed out and was terminated main.cu
If you google this, you will find info for Linux users (who have hybrid graphics solutions) - the fix is sometimes to run with optirun --no-xorg.
However in my case this isn't working.
If I run my program for a small enough data set, I get no errors. For a larger data set, but not too large, I have to prevent timeout errors by passing the --no-xorg flag. For an even larger data set I get timeout errors regardless of the --no-xorg flag.
This hints to me that perhaps something else is going wrong?
Perhaps a malloc() failure within my kernel if I run out of memory?
I have checked my code and estimated memory usage - I don't think this is the problem, but I would like to check anyway.