Why use double pointer in cuda malloc?

Question

First of all, if we have to use a double pointer, why create a normal pointer and then cast its address to void**? Why not simply declare a double pointer in the first place?

Secondly, why do we have to pass a pointer to receive the pointer that cudaMalloc allocates? Why can't cudaMalloc simply return the pointer so that we can use it directly?

I completely understand how malloc works. I also understand that, unlike malloc, cudaMalloc returns an error code, so the pointer is passed by reference. But I don't understand anything beyond that.

Could you please explain cudaMalloc from scratch?

#include <iostream>
#include "book.h"

__global__ void add( int a, int b, int *c )
{
    *c = a + b;
}

int main( void )
{
    int c;
    int *dev_c;

    cudaMalloc( (void**)&dev_c, sizeof(int) );

    add<<<1,1>>>( 2, 7, dev_c );

    cudaMemcpy( &c, dev_c, sizeof(int),
                cudaMemcpyDeviceToHost );

    printf( "2 + 7 = %d\n", c );

    cudaFree( dev_c );

    return 0;
}

Because *pass by reference*. That's how pass-by-reference is *emulated* in C, by passing a pointer to the pointer using the address-of operator. – Some programmer dude, Mar 18 '17 at 06:42
In current versions of CUDA you don't need to cast it using `(void **)`. You still do need to take the address of the base pointer, of course, since the function expects a pointer to pointer. – Robert Crovella, Mar 18 '17 at 11:11

Some programmer dude · Answer 1 · 2017-03-18T07:11:52.177


Example of pass by reference for pointers:

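#include <stdlib.h>  // malloc, free

void my_allocate_function(void **ptr_to_ptr, size_t size)
{
    *ptr_to_ptr = malloc(size);  // Store the new address in the caller's pointer
}

int main(void)
{
    int *ptr;
    // &ptr is an int **; C does not convert that to void ** implicitly, hence the cast
    my_allocate_function((void **)&ptr, sizeof *ptr);  // Allocate space for a single int
    free(ptr);
    return 0;
}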

If you instead declared ptr in the main function as a "double pointer" (i.e. int **ptr) and passed it without the address-of operator, my_allocate_function would dereference an uninitialized pointer, which is undefined behavior.

If my_allocate_function didn't take the pointer "by reference", it would only modify its own local copy of the pointer. That copy goes out of scope when the function returns, so the change would be lost and the caller's pointer would be left unchanged.
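
To make that difference concrete, here is a minimal sketch in plain C. The names allocate_by_value, allocate_by_pointer and demo_pass_by_pointer are made up for this illustration and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: receives a COPY of the caller's pointer. */
void allocate_by_value(int *p, size_t size)
{
    p = malloc(size);   /* only the local copy changes */
    free(p);            /* release it here; the caller can never see it */
}

/* Hypothetical helper: receives the ADDRESS of the caller's pointer. */
void allocate_by_pointer(int **pp, size_t size)
{
    *pp = malloc(size); /* writes into the caller's variable */
}

/* Returns nonzero if the caller's pointer was set by the second call. */
int demo_pass_by_pointer(void)
{
    int *ptr = NULL;

    allocate_by_value(ptr, sizeof *ptr);
    assert(ptr == NULL);            /* the caller's ptr was never touched */

    allocate_by_pointer(&ptr, sizeof *ptr);
    int was_set = (ptr != NULL);    /* nonzero if malloc succeeded */

    free(ptr);
    return was_set;
}
```

Calling demo_pass_by_pointer() from a main function should show that only the version taking the address of ptr is able to change it.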

A little bit "graphically" look at it this way:

+------------+     +--------------------------+
| ptr_to_ptr | --> | ptr in the main function | --> ...
+------------+     +--------------------------+

By dereferencing ptr_to_ptr we get access to the location where ptr_to_ptr is pointing (which is the variable ptr in the main function), and modify what is stored in that location.
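
The same pattern explains the shape of the cudaMalloc call in the question. Below is a host-only sketch in plain C that imitates that calling convention: the return value carries a status code, so the new address has to come back through a void ** out-parameter. The names fake_error_t, FAKE_SUCCESS, FAKE_ERROR_ALLOCATION, fake_device_malloc and demo_fake_malloc are invented for this example. It uses ordinary malloc and is not a description of how the CUDA runtime is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes, standing in for an error enum. */
typedef enum { FAKE_SUCCESS = 0, FAKE_ERROR_ALLOCATION = 1 } fake_error_t;

/*
 * Hypothetical allocator with a cudaMalloc-like shape.
 * The return value is used for the status, so the address is
 * delivered by writing through the out-parameter.
 */
fake_error_t fake_device_malloc(void **out_ptr, size_t size)
{
    void *mem = malloc(size);   /* plain host memory in this sketch */
    if (mem == NULL) {
        *out_ptr = NULL;
        return FAKE_ERROR_ALLOCATION;
    }
    *out_ptr = mem;             /* store the address in the caller's pointer */
    return FAKE_SUCCESS;
}

/* Caller side, mirroring the question: declare an int *, pass its address. */
int demo_fake_malloc(void)
{
    int *dev_c = NULL;

    /* &dev_c has type int **; C needs the explicit cast to void **. */
    fake_error_t status = fake_device_malloc((void **)&dev_c, sizeof(int));
    if (status != FAKE_SUCCESS)
        return -1;

    assert(dev_c != NULL);      /* the out-parameter filled in our variable */
    *dev_c = 2 + 7;             /* valid here only because this is host memory */
    int result = *dev_c;

    free(dev_c);
    return result;
}
```

Declaring int **dev_c instead would not help: the function still needs the address of a real pointer variable to write into, and the kernel in the question expects a plain int *. Note also that with real device memory the direct write through dev_c would not be valid on the host, which is why the question's code uses cudaMemcpy.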

edited Mar 18 '17 at 07:11

answered Mar 18 '17 at 07:00

Some programmer dude


#include <iostream> #include "book.h" __global__ void add( int a, int b, int *c ) { *c = a + b; } int main( void ) { int c; int *dev_c; HANDLE_ERROR( cudaMalloc( (void**)&dev_c, sizeof(int) ) ); add<<<1,1>>>( 2, 7, dev_c ); HANDLE_ERROR( cudaMemcpy( &c, dev_c, sizeof(int), cudaMemcpyDeviceToHost ) ); printf( "2 + 7 = %d\n", c ); cudaFree( dev_c ); return 0; } – chetan Mar 18 '17 at 07:23
@chetanraina If you have actual code that you wonder about, then it's probably important information that should be in the body of your question, properly formatted. So please edit your question. – Some programmer dude Mar 18 '17 at 07:26
@chetanraina Then please be patient and wait until you have access to a computer. – Some programmer dude Mar 18 '17 at 07:30
There it is. Now can you please explain to me, step by step, what exactly happens when the cudaMalloc function is executed? I am not worried about the other parts, just the cudaMalloc part. – chetan Mar 18 '17 at 07:36
