
I have the following simple program to test cudaMemset:

#include <iostream>
#include <cuda.h>
using namespace std;
__global__ void kernel(int *input){
    input[threadIdx.x] += threadIdx.x;
}
int main() {
    size_t size = 5; 
    int *h_ptr, *d_ptr;
    h_ptr = new int[size];

    cudaMalloc((void **)&d_ptr, sizeof(int) * size);
    cudaMemset(d_ptr, 10, sizeof(int) * size);

    kernel<<<1, size>>>(d_ptr);
    cudaDeviceSynchronize();

    cudaMemcpy(h_ptr, d_ptr, sizeof(int)*size, cudaMemcpyDeviceToHost);

    for (size_t i = 0; i < size; i++)
        cout << h_ptr[i] << " ";
    cout << endl;

    cudaFree(d_ptr);
    delete[] h_ptr;
    return 0;
}

I expected the result to be [10 11 12 13 14], but instead I'm getting garbage values.

What is it that I am missing?

Thanks!

1 Answer


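cudaMemset works just like the standard memset function, except that it operates on device memory: it sets every byte of the specified memory region to the given value, which is interpreted as a single byte. You are trying to set each integer as a whole to 10, which memset cannot do. A byte-wise fill only yields the intended integer when all of its bytes are identical, as with 0 or -1.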

In the provided example, cudaMemset sets every byte to 10 (0x0A), so the memory is initialized like this:

0A0A0A0A0A0A0A...... (in hex notation).

So when you read it as a 32-bit integer, you will get:

168430090 in decimal
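
The arithmetic can be checked on the host, because the standard memset fills memory byte by byte in the same way. The following is a minimal sketch; the helper name int32_from_byte_fill is made up for illustration and is not part of CUDA or the standard library.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a CUDA or standard library function):
// fill a 32-bit integer one byte at a time, the way memset and
// cudaMemset fill memory, and return the resulting integer.
std::int32_t int32_from_byte_fill(int byte_value) {
    std::int32_t word = 0;
    // memset converts byte_value to a single byte and copies it
    // into each of the 4 bytes of word.
    std::memset(&word, byte_value, sizeof(word));
    return word;
}
```

Under these assumptions, int32_from_byte_fill(10) gives 0x0A0A0A0A, which is 168430090, while 0 and -1 survive unchanged because all of their bytes are identical.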

The values are not garbage; they are the expected results once the kernel adds threadIdx.x to each element:

[168430090 168430091 168430092 168430093 168430094]
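
To get the values the question expected, each integer has to be written as a whole, for example with a small initialization kernel. The following is only a sketch of that idea: the kernel name fillInt and its parameters are hypothetical, error checking is omitted for brevity, and the launch configuration assumes the array fits in a single block as in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper kernel: writes `value` into each of the first n ints.
__global__ void fillInt(int *data, int value, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = value;
}

// Same kernel as in the question.
__global__ void kernel(int *input) {
    input[threadIdx.x] += threadIdx.x;
}

int main() {
    const int n = 5;
    int h_data[n];
    int *d_data = nullptr;

    cudaMalloc((void **)&d_data, n * sizeof(int));

    // Set every int (not every byte) to 10, then run the original kernel.
    // Both launches use the default stream, so they execute in order.
    fillInt<<<1, n>>>(d_data, 10, n);
    kernel<<<1, n>>>(d_data);

    // cudaMemcpy waits for the preceding kernels to finish.
    cudaMemcpy(h_data, d_data, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; i++)
        printf("%d ", h_data[i]);
    printf("\n");

    cudaFree(d_data);
    return 0;
}
```

With the fill done per integer, element i starts at 10 and the kernel adds i, which is the [10 11 12 13 14] sequence the question was looking for.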

sgarizvi
  • write a kernel that initializes your array :) – DRC Jul 07 '13 at 19:22
  • Yes, an initialization kernel is required in this case. – sgarizvi Jul 07 '13 at 19:24
  • See also [memset integer array?](http://stackoverflow.com/questions/7202411/memset-integer-array) and [cudaMemset() usage](http://stackoverflow.com/questions/13387101/cudamemset-usage). – Vitality Jul 07 '13 at 19:26