I am an absolute beginner when it comes to CUDA. I tried writing a simple vector summation program, using a sample program as a base, and it does not seem to work: cudaMalloc does not appear to allocate memory. I am using CUDA 5.0 on Ubuntu 13.04. To compile, I simply type

nvcc cuda1.cu -o cuda1

The code snippet is as follows:

#include <stdio.h>
#include <stdlib.h>   /* EXIT_SUCCESS */
#include <cuda.h>
#include <cuda_runtime_api.h>
#define N  5

__global__ void add(int *a, int *b, int *c)
{
        int tid = blockIdx.x;
        if (tid<N)
                c[tid] = a[tid] + b[tid];
}

int main(void)
{
        int a[N],b[N],c[N];
        int *dev_a, *dev_b, *dev_c;
        if (cudaMalloc((void**)&dev_a, N * sizeof(int))!= cudaSuccess)
                printf("Could not allocate memory");
        cudaMalloc((void**)&dev_b, N * sizeof(int));
        cudaMalloc((void**)&dev_c, N * sizeof(int));
        for (int i = 0; i<N; i++)
        {
                a[i] = i;
                b[i] = i;
        }
        cudaMemcpy(dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemcpy(dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice);
        add<<<N,1>>>(dev_a, dev_b, dev_c);
        cudaMemcpy(c, dev_c, N * sizeof(int), cudaMemcpyDeviceToHost);
        for(int i =0; i<N; i++)
                printf("%d + %d = %d\n",a[i],b[i],c[i]);
        cudaFree(dev_a);
        cudaFree(dev_b);
        cudaFree(dev_c);
        return EXIT_SUCCESS;
}
Ujjwal Aryan
    I assume you are getting the error message "Could not allocate memory". It's likely a problem with your machine. Please do [proper CUDA error checking](http://stackoverflow.com/questions/14038589) to decode the error and get a message with more information about what is wrong. Then post the *complete* output of your program, including all error messages. You should also try running `nvidia-smi -a` on your machine and report the results. Please edit the requested info into your question; don't try to paste it into comments. (I can run your program properly.) – Robert Crovella Feb 10 '14 at 15:52

1 Answer

Could you change your allocation logic to the following?

cudaError_t rc = cudaMalloc((void **) &dev_a, N*sizeof(int));

if (rc != cudaSuccess)
    printf("Could not allocate memory: %d\n", rc);

Maybe the return code gives some more insight.
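
To apply the same idea to every runtime call rather than only the first allocation, here is a minimal sketch of the question's program with each call checked. The CUDA_CHECK macro is a hypothetical helper introduced here for illustration; it is not part of the CUDA toolkit. It relies on cudaGetErrorString, which is part of the runtime API, to turn the numeric code into a readable message. The kernel launch is checked separately with cudaGetLastError and cudaDeviceSynchronize, since a launch does not return an error code directly.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

#define N 5

/* Hypothetical helper (not a CUDA API): print the failing call and the
   runtime's description of the error, then exit. */
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s failed: %s (code %d)\n",        \
                    __FILE__, __LINE__, #call,                         \
                    cudaGetErrorString(err_), (int)err_);              \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

/* One block per element, as in the question. */
__global__ void add(const int *a, const int *b, int *c)
{
    int tid = blockIdx.x;
    if (tid < N)
        c[tid] = a[tid] + b[tid];
}

int main(void)
{
    int a[N], b[N], c[N];
    int *dev_a, *dev_b, *dev_c;

    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = i;
    }

    CUDA_CHECK(cudaMalloc((void **)&dev_a, N * sizeof(int)));
    CUDA_CHECK(cudaMalloc((void **)&dev_b, N * sizeof(int)));
    CUDA_CHECK(cudaMalloc((void **)&dev_c, N * sizeof(int)));

    CUDA_CHECK(cudaMemcpy(dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice));

    add<<<N, 1>>>(dev_a, dev_b, dev_c);
    CUDA_CHECK(cudaGetLastError());       /* errors from the launch itself */
    CUDA_CHECK(cudaDeviceSynchronize());  /* errors raised while the kernel ran */

    CUDA_CHECK(cudaMemcpy(c, dev_c, N * sizeof(int), cudaMemcpyDeviceToHost));

    for (int i = 0; i < N; i++)
        printf("%d + %d = %d\n", a[i], b[i], c[i]);

    CUDA_CHECK(cudaFree(dev_a));
    CUDA_CHECK(cudaFree(dev_b));
    CUDA_CHECK(cudaFree(dev_c));
    return EXIT_SUCCESS;
}
```

If the first cudaMalloc fails, the message printed to stderr names the call and the runtime's description of the error. That text is usually more informative than the bare integer and is the kind of output worth editing into the question.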

Hans Hohenfeld