
I have written simple matrix multiplication code using CUDA. When I run it with an input size of A(10000*10000) * B(10000*10000), I receive this message:

cudaDeviceSynchronize returned error code 4 after launching

After adding these calls to measure the run time, I receive an "unspecified launch failure" error.

    cudaEventRecord(start);
    // here is my kernel call
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

This is my kernel call:

    mulKernel<<<1, dataSet.threadSize>>>(dev_c, dev_a, dev_b, dataSet.n, dataSet.m, dataSet.p, dataSet.threadSize);

And this is my kernel code:

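    // Signature inferred from the kernel call above; element type int assumed from sizeof(int).
    __global__ void mulKernel(int *C, const int *A, const int *B, int n, int m, int p, int threadSize)
    {
        int i = threadIdx.x;   // each thread starts at the row matching its index
        int j, k, sum;
        //if(n<=threadSize)
        for(; i < n; i+=threadSize){        // stride over rows of A
            for(j = 0; j < p; j++){         // columns of B
                sum = 0;
                for(k = 0; k < m; k++){     // dot product of row i and column j
                    sum += A[i * m + k] * B[k * p + j];
                }
                C[i * p + j] = sum;
            }
        }
    }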

How can I fix this error?

  • Have you tried calling `cudaGetLastError` and `cudaGetErrorString`? These should tell you what's going wrong. – jefflarkin Jan 26 '16 at 21:25

1 Answer


You are launching a single block of dataSet.threadSize threads. If that value is larger than the maximum number of threads per block (1024 for Kepler GPUs, I think), the launch fails. Read more here on how to choose your grid and block dimensions.
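As a rough illustration of spreading the rows over several blocks and checking the launch for errors, here is a minimal sketch. The names `mulKernelGrid`, `blocksFor` and `CUDA_CHECK` are made up for this example and are not from the question; the matrix sizes and the block size of 256 are arbitrary assumptions, and the kernel keeps the question's one-thread-per-row scheme rather than being a tuned implementation.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: print the readable CUDA error string and exit.
#define CUDA_CHECK(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Ceiling division: number of blocks needed to give each row one thread.
static int blocksFor(int rows, int threadsPerBlock) {
    return (rows + threadsPerBlock - 1) / threadsPerBlock;
}

// One thread per row of C, as in the question, but indexed across all blocks.
__global__ void mulKernelGrid(int *C, const int *A, const int *B,
                              int n, int m, int p) {
    int stride = gridDim.x * blockDim.x;  // total threads in the grid
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        for (int j = 0; j < p; j++) {
            int sum = 0;
            for (int k = 0; k < m; k++)
                sum += A[i * m + k] * B[k * p + j];
            C[i * p + j] = sum;
        }
    }
}

int main() {
    const int n = 1000, m = 32, p = 48;  // small illustrative sizes (assumed)
    std::vector<int> A(n * m, 1), B(m * p, 2), C(n * p, 0);

    int *dA = nullptr, *dB = nullptr, *dC = nullptr;
    CUDA_CHECK(cudaMalloc(&dA, A.size() * sizeof(int)));
    CUDA_CHECK(cudaMalloc(&dB, B.size() * sizeof(int)));
    CUDA_CHECK(cudaMalloc(&dC, C.size() * sizeof(int)));
    CUDA_CHECK(cudaMemcpy(dA, A.data(), A.size() * sizeof(int), cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, B.data(), B.size() * sizeof(int), cudaMemcpyHostToDevice));

    const int threads = 256;  // at or below the per-block limit
    mulKernelGrid<<<blocksFor(n, threads), threads>>>(dC, dA, dB, n, m, p);
    CUDA_CHECK(cudaGetLastError());       // catches invalid launch configurations
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(C.data(), dC, C.size() * sizeof(int), cudaMemcpyDeviceToHost));

    // With A filled with 1 and B filled with 2, every entry of C should equal 2 * m.
    int mismatches = 0;
    for (int v : C)
        if (v != 2 * m) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return mismatches != 0;
}
```

Checking `cudaGetLastError()` right after the launch and `cudaDeviceSynchronize()` afterwards separates a bad launch configuration from a failure inside the kernel, which should help narrow down where the "unspecified launch failure" comes from.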

  • This occurs even with a threadSize of 512 or 256, and with values smaller than these. – vahid Jan 27 '16 at 06:17
  • @vahid: Perhaps post more code showing how you allocate memory and launch the kernel. – user3813674 Jan 27 '16 at 06:47
  • Memory allocation: `cudaStatus = cudaMalloc((void**)&dev_a, (dataSet.m * dataSet.n) * sizeof(int)); if (cudaStatus != cudaSuccess) { fprintf(stderr, "cudaMalloc failed!"); goto Error; }` Memory copy: `cudaStatus = cudaMemcpy(dev_a, dataSet.A, (dataSet.m * dataSet.n) * sizeof(int), cudaMemcpyHostToDevice); if (cudaStatus != cudaSuccess) { fprintf(stderr, "cudaMemcpy failed!"); goto Error; }` – vahid Jan 27 '16 at 07:45