I obtain a segmentation fault after the second cudaMalloc.

#include <cuda.h>
#include <cuda_runtime.h>

int main(){

  int n=16;

  float2* a;
  cudaMalloc((void **) a, n*sizeof(float2));
  float2* b;
  cudaMalloc((void **) b, n*sizeof(float2));

  return 0;
}

However, if I comment out any of the 2 cudaMallocs, the code runs fine.

Thanks!

  • look at [Use of cudamalloc(). Why the double pointer?](https://stackoverflow.com/questions/7989039/use-of-cudamalloc-why-the-double-pointer) if the answer is not enough for you – bruno Jan 16 '19 at 12:30
  • @bruno Yep, one of the answers is the correct one: "This is simply a horrible, horrible API design". – Lundin Jan 16 '19 at 12:38

2 Answers

6

You have to pass a pointer to the pointer like this:

float2* a;
cudaMalloc(&a, n*sizeof(float2));
float2* b;
cudaMalloc(&b, n*sizeof(float2));

Otherwise, you just cast the value of an uninitialized pointer to a "pointer to pointer", and the library dereferences a garbage address, leading to a segfault.
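
A minimal sketch of why the address of the pointer is needed. It assumes a hypothetical stand-in, `fake_alloc`, that mimics the calling shape of `cudaMalloc` (result stored through a `void**` out-parameter, status code returned) but is backed by host `malloc`, so it runs without a GPU. The names `fake_alloc` and `demo_out_param` are illustrative only and are not part of CUDA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cudaMalloc: same calling shape, host memory only. */
int fake_alloc(void **out, size_t bytes)
{
    void *p = malloc(bytes);
    if (p == NULL)
        return 1;   /* nonzero plays the role of a CUDA error code */
    *out = p;       /* the write lands at the address the caller passed in */
    return 0;       /* 0 plays the role of cudaSuccess */
}

/* Returns 0 if the call succeeded and the caller's pointer was filled in. */
int demo_out_param(void)
{
    void *a = NULL; /* initialized, so a failed call leaves a known value */

    /* Correct: pass the ADDRESS of the pointer variable. Passing the
       pointer's own uninitialized value (cast to void**) would make
       fake_alloc write to an arbitrary location instead. */
    if (fake_alloc(&a, 16 * sizeof(float)) != 0)
        return 1;

    int ok = (a != NULL);
    free(a);
    return ok ? 0 : 1;
}
```

With the real API the same idea applies: the allocator writes the device address into the variable whose address it receives, and the returned status should be checked before that variable is used.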

Ctx
  • Oh! It's true. I had not thought about it because it was already a pointer. I'm a beginner with CUDA. I have already done it the right way in the past but, this time, it slipped my mind. Thanks! –  Jan 16 '19 at 12:34
  • This is undefined behavior... and unfortunately the modestly talented person who designed the CUDA API forces you to write code that relies on UB. `void**` is not a generic pointer type and is not compatible with `float**`. Never was. This is a known bug in CUDA. For standard C applications, you should never do casts like these. – Lundin Jan 16 '19 at 12:36
  • @Lundin you could use `void *a; cudaMalloc(&a, 100); float *fa = a;` of course... – Ctx Jan 16 '19 at 12:41
  • @Ctx Not only could but must. – Lundin Jan 16 '19 at 12:41
  • @Lundin no, if it works, I don't have to. That depends on how portable the code needs to be. – Ctx Jan 16 '19 at 12:42
  • @Ctx Undefined behavior isn't a portability issue... unless you _know_ that your compiler has well-defined non-standard extensions for this specific case. – Lundin Jan 16 '19 at 12:44
  • @Lundin I know what assembler code it produces; that is enough – Ctx Jan 16 '19 at 12:45
  • Actually this is not just UB but a C language violation. I have posted a correct answer, since this code here won't compile in C. – Lundin Jan 16 '19 at 12:52
  • @Lundin compiles here without problems – Ctx Jan 16 '19 at 12:57
  • Doesn't compile in gcc, clang or icc. It does compile in VS but that compiler isn't conforming to the C standard so no surprise there. – Lundin Jan 16 '19 at 13:02
  • @Ctx Passing argument from incompatible pointer type. https://godbolt.org/z/mHN_ID. This even without any warnings, `-pedantic` or `-std=c17` enabled. – Lundin Jan 16 '19 at 13:41
  • @Lundin This is a warning, "won't compile" means there is an error. Where is the error? – Ctx Jan 16 '19 at 13:46
  • @Ctx No, it is a _diagnostic message_. The C standard does not speak of errors and warnings, but diagnostic messages. If a standard violation is found, it is sufficient if the compiler gives some manner of message. The state of the generated binary, if any, is then completely undefined. If you wish gcc to give you an error for standard C violations, you must compile with `gcc -std=c17 -pedantic-errors`. – Lundin Jan 16 '19 at 14:18
  • @Lundin If you feed gcc with this code, it will compile and emit a diagnostic message. You claimed it _won't_ compile with gcc. This was obviously wrong. Why are you not admitting that, but trying to argue this fact away with very flimsy arguments? – Ctx Jan 16 '19 at 14:23
  • It won't compile _cleanly_, as there are standard C constraint violations. Which is the only thing that should matter. If you manage to get a compiler to produce a binary despite violating the C language, then there is no telling what said binary will do. – Lundin Jan 16 '19 at 14:27
  • @Lundin People who know their development environment usually _can_ tell what said binary will do. Especially for the above example, I can perfectly say that it will work as I expect in all build environments I ever got in touch with. – Ctx Jan 16 '19 at 14:37
  • @Ctx And what exactly will gcc do when you convert between incompatible pointer types? I can't find any documented behavior here: https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html. – Lundin Jan 16 '19 at 14:43
  • @Lundin Please, save your trick questions for other people; you know exactly what happens (or at least you should, if you're worth your salt). It produces the same assembler code, with or without an intermediate void* pointer. – Ctx Jan 16 '19 at 14:54
  • @Ctx No, I don't know that and there appears to be no guarantees about non-standard extensions. There is no trick question, you are simply assuming that your code relying on undefined behavior somehow has well-defined behavior and can provide no other argument than "it works". On your specific compiler and system, today. – Lundin Jan 16 '19 at 14:58
  • @Lundin Even if a non-standard extension for this was documented, you couldn't rely on it being present in the next version, so what? If one wants to play it safe, of course, use the intermediate void pointer. But anyone with a bit of experience knows that this construct will never be a problem in practice (you are free to show a real-world example to counter that claim). – Ctx Jan 16 '19 at 15:02
-1

Because of the broken CUDA API, the correct answer is to write a wrapper around their trash:

void* saneMalloc (size_t n)
{
  void* tmp;
  if (cudaMalloc(&tmp, n) == cudaSuccess)
    return tmp;
  return NULL;
}

...

float* a = saneMalloc(n * sizeof *a);  /* size is given in bytes */

You have to do this because the only generic pointer type in C is void*. You can convert from any pointer-to-object type to void*, but that does not apply to void**. So if you have a float*, you cannot pass its address, a float**, to a function expecting a void**. This is an incompatible pointer type.

Specifically, arguments passed to a function are converted as if by simple assignment (C17 6.5.16.1). Passing a float** to a function expecting a void** violates a constraint of simple assignment, so a conforming compiler must issue a diagnostic. The code is not allowed to compile cleanly, as it is a C standard violation.
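
The wrapper idea can be exercised without a GPU. The sketch below assumes a hypothetical stand-in, `stub_alloc`, with the same calling shape as `cudaMalloc`, plus an arbitrary size cap (`STUB_LIMIT`) that simulates an out-of-memory failure. `stub_alloc`, `STUB_LIMIT`, `checked_alloc`, and `demo_wrapper` are illustrative names, not CUDA API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STUB_LIMIT ((size_t)1 << 20) /* arbitrary cap to simulate failure */

/* Hypothetical stand-in for cudaMalloc, backed by host malloc. */
int stub_alloc(void **out, size_t bytes)
{
    if (bytes > STUB_LIMIT)
        return 1;   /* failure: *out is deliberately left untouched */
    void *p = malloc(bytes);
    if (p == NULL)
        return 1;
    *out = p;
    return 0;       /* 0 plays the role of cudaSuccess */
}

/* Wrapper in the style of saneMalloc: the only pointer whose address is
   passed is a genuine void*, so no incompatible pointer types are involved. */
void *checked_alloc(size_t bytes)
{
    void *tmp = NULL;
    if (stub_alloc(&tmp, bytes) != 0)
        return NULL; /* report failure explicitly instead of trusting tmp */
    return tmp;
}

/* Returns 0 if a normal request succeeds and an oversized one yields NULL. */
int demo_wrapper(void)
{
    size_t n = 16;
    float *a = checked_alloc(n * sizeof *a);        /* void* converts implicitly in C */
    float *too_big = checked_alloc(STUB_LIMIT + 1); /* takes the failure path */

    int ok = (a != NULL) && (too_big == NULL);
    free(a);
    return ok ? 0 : 1;
}
```

The implicit `void*` to `float*` conversion is valid in C only. Code compiled as C++ would need an explicit cast on the wrapper's return value.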

Lundin
  • Note that CUDA does not specify that a `NULL` pointer is stored into the pointer on failure. So you have to check the return value of cudaMalloc and return a NULL pointer on error "manually" – Ctx Jan 16 '19 at 12:54
  • @Ctx Added a fix for that. – Lundin Jan 16 '19 at 12:55