Consider the following host code:
uint64_t * SomeDevPtr = ...
/* SomeDevPtr points to device memory allocated by cudaMalloc(). */
uint32_t * SomeDevIntPtr = reinterpret_cast<uint32_t *>(SomeDevPtr);
Because cudaMalloc automatically fulfills some alignment requirements (I think it aligns to a 128-byte memory boundary), I think both SomeDevIntPtr and SomeDevPtr should start at exactly the same physical memory address in the GPU's global memory. Am I correct on this?
I just want to make sure, since some of the functions I wrote depend on it.
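To make the assumption concrete, here is a minimal host-only sketch of the check I have in mind. It does not call cudaMalloc: the buffer `fake_device_buffer` is a hypothetical stand-in with an assumed 256-byte alignment, and the helpers `same_address`, `is_aligned`, and `check_cast_preserves_address` are names I made up for illustration. It only shows that the cast itself leaves the address value unchanged on the host; whether the same holds for a real device pointer is exactly what I am asking.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for memory returned by cudaMalloc:
// a host buffer with an assumed 256-byte alignment.
alignas(256) static unsigned char fake_device_buffer[1024];

// Returns true when both pointers hold the same address value.
inline bool same_address(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) ==
           reinterpret_cast<std::uintptr_t>(b);
}

// Returns true when the address is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::uintptr_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Mirrors the casts from the question, applied to the host stand-in buffer.
inline bool check_cast_preserves_address() {
    std::uint64_t* SomeDevPtr =
        reinterpret_cast<std::uint64_t*>(fake_device_buffer);
    std::uint32_t* SomeDevIntPtr =
        reinterpret_cast<std::uint32_t*>(SomeDevPtr);
    // The cast changes only the pointee type, not the address value,
    // and the address must still satisfy the alignment of uint32_t.
    return same_address(SomeDevPtr, SomeDevIntPtr) &&
           is_aligned(SomeDevIntPtr, alignof(std::uint32_t));
}
```

With a real allocation, the same comparison could be done on the host by passing the two device pointers to `same_address`, since comparing their values does not dereference them.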