I am trying to statically allocate a __shared__ variable while compiling for the sm_86 device architecture. If the variable is larger than 48 KB, nvcc fails to compile with the error

Entry function uses too much shared data

I am running Visual Studio 2019 with CUDA 11.2 and have set

Configuration properties -> CUDA C/C++ -> Device -> Code Generation 

to compute_86,sm_86

According to the table here, shouldn't devices of compute capability 8.6 allow a maximum of 100 KB of shared memory per block?

Here is a simple code sample that fails for me:

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

#define SIZE 12288 /* x 4 byte = 48 KB <---- Works */
/* #define SIZE 12289 */ /* <---- Fails */

__global__ void kernel(int* d) {
    __shared__ int s[SIZE];
    s[threadIdx.x] = 1;
    d[threadIdx.x] = s[threadIdx.x];

}

int main() {
    int* d_a;
    cudaMalloc(&d_a, SIZE * sizeof(int));
    kernel <<<1, 1024 >>> (d_a);
    cudaDeviceSynchronize();
    cudaFree(d_a);
}
paleonix
If_You_Say_So
  • Note from the linked duplicate that accessing more than 48KB requires not only the opt-in mechanism via the API call, but it also requires that you use dynamically allocated shared memory. You are using statically allocated shared memory here, that is always limited to 48KB. – Robert Crovella Mar 18 '21 at 19:35
  • Understood. Thanks very much for your help. Appreciate it! – If_You_Say_So Mar 18 '21 at 19:37
  • @RobertCrovella: IIANM, your comment about static shared memory doesn't appear in the dupe - for which reason I believe the question should be reopened and the comment made an answer. – einpoklum Mar 18 '21 at 20:11
  • Take a look at the documentation quote in the linked duplicate. – Robert Crovella Mar 18 '21 at 20:14
  • @RobertCrovella: IMHO the text there is a bit obscure. I would rather we edited that question and answer to focus more on dynamic shared memory; reopened this one; and answered it to focus on static shared memory. – einpoklum Mar 19 '21 at 17:41
  • I've already updated the other answer to cover fairly explicitly both the static and dynamic case. – Robert Crovella Mar 19 '21 at 17:46
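
For readers landing here, the two requirements from Robert Crovella's first comment (dynamically allocated shared memory plus the opt-in API call) could look roughly like the sketch below. It is a rework of the sample from the question, not code taken from the linked duplicate. The helper names check and run_large_shared_kernel are made up for illustration, device 0 is assumed, and the output buffer is sized for the 1024 threads that actually write to it. The opt-in limit is queried at run time rather than hard-coded.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 12289 /* x 4 bytes = just over 48 KB */

__global__ void kernel(int* d) {
    /* Dynamic shared memory: the size comes from the launch configuration. */
    extern __shared__ int s[];
    s[threadIdx.x] = 1;
    d[threadIdx.x] = s[threadIdx.x];
}

/* Hypothetical helper: report a CUDA error and return 1 on failure. */
static int check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return 1;
    }
    return 0;
}

/* Hypothetical driver: returns 0 on success, 1 on any failure. */
int run_large_shared_kernel() {
    const int shmem_bytes = static_cast<int>(SIZE * sizeof(int));

    /* Ask device 0 (an assumption) how much shared memory a block may opt in to. */
    int max_optin = 0;
    if (check(cudaDeviceGetAttribute(&max_optin,
                                     cudaDevAttrMaxSharedMemoryPerBlockOptin, 0),
              "cudaDeviceGetAttribute")) return 1;
    if (shmem_bytes > max_optin) {
        fprintf(stderr, "requested %d bytes, device allows %d\n",
                shmem_bytes, max_optin);
        return 1;
    }

    /* The opt-in: raise this kernel's dynamic shared memory limit above 48 KB. */
    if (check(cudaFuncSetAttribute(kernel,
                                   cudaFuncAttributeMaxDynamicSharedMemorySize,
                                   shmem_bytes),
              "cudaFuncSetAttribute")) return 1;

    int* d_a = nullptr;
    if (check(cudaMalloc(&d_a, 1024 * sizeof(int)), "cudaMalloc")) return 1;

    /* The third launch parameter is the dynamic shared memory size in bytes. */
    kernel<<<1, 1024, shmem_bytes>>>(d_a);
    int failed = check(cudaGetLastError(), "kernel launch");
    failed |= check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    cudaFree(d_a);
    return failed;
}

int main() {
    return run_large_shared_kernel();
}
```

If the requested size exceeds what the device reports, the sketch exits with an error before launching, so the same file can be tried on older GPUs without changes.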
