The cudaMalloc() function is declared as:

cudaError_t cudaMalloc(void** devPtr, size_t size);

The responses here and here give good explanations for why the function should be defined to accept a pointer to a pointer.

I am less clear, however, on why we need to cast the argument we pass to void** when calling the function. For example, in this call:

catch_status = cudaMalloc((void**)&device_array, num_bytes);

presented here.

As I understand it, declaring a function to accept void types gives it more flexibility. Looking at the declaration of cudaMalloc(), I interpret it to mean that the function can accept a pointer to a pointer to any type of object. So why should it be necessary to cast &device_array (in the example above) when calling the function? (This kind of cast seems very prevalent in the cudaMalloc() examples I see throughout the web.) As long as &device_array is a "pointer to a pointer to any type of data", isn't that enough to (a) match the parameter types cudaMalloc() accepts and (b) accomplish whatever programming objectives we have?

What am I missing here?

Michael Ohlrogge
  • 6
    It's not necessary. You don't need to cast to `void **` anymore using modern CUDA versions of `cudaMalloc` -- try it and see. – Robert Crovella Jun 10 '16 at 05:24
  • 1
    It can accept a pointer to a pointer to any type of object, but `device_array` isn't a pointer to any type of object, it's a pointer to one particular type of object. – user253751 Jun 10 '16 at 08:23
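One plausible reason the cast is no longer needed from C++ code is a typed overload layered on top of the void** entry point. To my knowledge the CUDA runtime's C++ header provides something along these lines, but treat that as an assumption and check your toolkit's headers. The sketch below only imitates the idea: `raw_device_malloc` and `typed_device_malloc` are hypothetical names, and the allocation is backed by host malloc so it runs without a GPU.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a C-style allocator taking void**.
// Returns 0 on success and 1 on failure (not real cudaError_t values).
int raw_device_malloc(void** dev_ptr, std::size_t size) {
    *dev_ptr = std::malloc(size);
    return *dev_ptr != nullptr ? 0 : 1;
}

// Hypothetical typed overload: accepts T** for any object type T and
// handles the void* round trip internally, so callers write no cast.
template <class T>
int typed_device_malloc(T** dev_ptr, std::size_t size) {
    void* mem = nullptr;
    int status = raw_device_malloc(&mem, size);
    *dev_ptr = static_cast<T*>(mem);  // nullptr if the allocation failed
    return status;
}
```

With such an overload, a call like `typed_device_malloc(&device_array, num_bytes)` compiles for an `int*`, `double*`, or any other object pointer, because the template deduces T for each particular type.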

1 Answer

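Casting to void** is always wrong, because void** is not a generic pointer type. Only void* is generic: any object pointer converts to it implicitly, but an int** does not convert to void**.

Thus, when a function has a parameter of type void**, the only argument that can correctly be passed to it is one of type void**, which makes any cast either wrong or unnecessary.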
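The conversion rule can be checked at compile time. This is a minimal sketch in C++ rather than C, chosen because the standard type traits make the rule testable; `set_slot` is a hypothetical helper whose first parameter has the same type as cudaMalloc's.

```cpp
#include <type_traits>

// Any object pointer converts implicitly to void*, the generic pointer type.
static_assert(std::is_convertible<int*, void*>::value,
              "int* converts to void* without a cast");

// One level up the rule does not apply: int** is not convertible to void**.
static_assert(!std::is_convertible<int**, void**>::value,
              "int** does not convert to void** without a cast");

// Hypothetical helper with a void** parameter, like cudaMalloc's devPtr.
// It writes `value` into the caller's pointer variable.
void set_slot(void** slot, void* value) {
    *slot = value;
}
```

Passing the address of a `void*` variable to `set_slot` compiles with no cast, while passing the address of an `int*` variable is rejected by the compiler.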

The correct way (ignoring error checking) of getting memory from cudaMalloc is:

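void* mem;                             /* generic pointer, so &mem already has type void** */
cudaMalloc( &mem , num_int_bytes );
int* array = mem;                      /* implicit in C; C++ needs static_cast<int*>(mem) */

cudaMalloc( &mem , num_double_bytes ); /* mem is reused; array still holds the first block */
double* floating = mem;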
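The same pattern with the error checking put back might look like the sketch below. It is written in C++, so the conversions from void* are explicit. `stub_malloc` is a hypothetical stand-in for cudaMalloc backed by host malloc, and its 0/1 return values are placeholders rather than real cudaError_t codes; `allocate_both` is likewise an illustrative name.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for cudaMalloc: same parameter list, host memory.
// Returns 0 on success and 1 on failure.
int stub_malloc(void** dev_ptr, std::size_t size) {
    *dev_ptr = std::malloc(size);
    return *dev_ptr != nullptr ? 0 : 1;
}

// Allocates an int array and a double array of n elements each,
// checking the status of every call. Returns 0 only if both succeed.
int allocate_both(std::size_t n, int** ints, double** doubles) {
    void* mem = nullptr;
    *ints = nullptr;
    *doubles = nullptr;

    int status = stub_malloc(&mem, n * sizeof(int));
    if (status != 0) return status;
    *ints = static_cast<int*>(mem);

    status = stub_malloc(&mem, n * sizeof(double));
    if (status != 0) {
        std::free(*ints);  // release the first block on partial failure
        *ints = nullptr;
        return status;
    }
    *doubles = static_cast<double*>(mem);
    return 0;
}
```

With the real API, the status would be compared against cudaSuccess and the memory released with cudaFree instead of free.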