Error when trying to use more than 48 KB of (static) shared memory

Question

I am trying to statically allocate a __shared__ variable while targeting device architecture sm_86. If the variable is larger than 48 KB, nvcc fails to compile with the error:

Entry function uses too much shared data

I am running Visual Studio 2019 with CUDA 11.2 and have set

Configuration properties -> CUDA C/C++ -> Device -> Code Generation

to compute_86,sm_86

According to the table here, shouldn't devices of cc 8.6 allow a maximum of 100 KB of shared memory per block?

Here is a simple code sample that fails for me:

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

#define SIZE 12288 /* x 4 byte = 48 KB <---- Works */
/* #define SIZE 12289 */ /* <---- Fails */

__global__ void kernel(int* d) {
    __shared__ int s[SIZE];
    s[threadIdx.x] = 1;
    d[threadIdx.x] = s[threadIdx.x];

}

int main() {
    int* d_a;
    cudaMalloc(&d_a, SIZE * sizeof(int));
    kernel<<<1, 1024>>>(d_a);
    cudaDeviceSynchronize();
    cudaFree(d_a);
}

Note from the linked duplicate that accessing more than 48 KB requires not only the opt-in mechanism via the API call, but also the use of dynamically allocated shared memory. You are using statically allocated shared memory here, which is always limited to 48 KB. — Robert Crovella, Mar 18 '21 at 19:35
@RobertCrovella: IIANM, your comment about static shared memory doesn't appear in the dupe - for which reason I believe the question should be reopened and the comment made an answer. — einpoklum, Mar 18 '21 at 20:11
Take a look at the documentation quote in the linked duplicate. — Robert Crovella, Mar 18 '21 at 20:14
@RobertCrovella: IMHO the text there is a bit obscure. I would rather we edited that question and answer to focus more on dynamic shared memory; reopened this one; and answered it to focus on static shared memory. — einpoklum, Mar 19 '21 at 17:41
I've already updated the other answer to cover fairly explicitly both the static and dynamic case. — Robert Crovella, Mar 19 '21 at 17:46
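
For anyone landing here, the dynamic-allocation route described in the comments could look roughly like the sketch below. It is the question's kernel rewritten under a few assumptions: the array becomes an unsized `extern __shared__` declaration, the opt-in is done with `cudaFuncSetAttribute` and `cudaFuncAttributeMaxDynamicSharedMemorySize`, and the size is passed as the third launch parameter. The helper names `ok` and `runDemo` are made up for illustration, device 0 is assumed, and the sketch queries `cudaDevAttrMaxSharedMemoryPerBlockOptin` first rather than trusting a fixed per-block figure.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 12289 /* x 4 bytes = 49156 bytes, just over 48 KB */

// Unsized extern array: its size is supplied at launch time,
// not at compile time, so nvcc has no 48 KB static limit to enforce.
__global__ void kernel(int* d) {
    extern __shared__ int s[];
    s[threadIdx.x] = 1;
    d[threadIdx.x] = s[threadIdx.x];
}

// Hypothetical helper: reports a failed runtime call and returns false.
static bool ok(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical driver: returns 0 if the kernel ran with > 48 KB of shared memory.
int runDemo() {
    const int shmemBytes = static_cast<int>(SIZE * sizeof(int));

    // Ask device 0 how much shared memory a block may opt in to.
    int maxOptin = 0;
    if (!ok(cudaDeviceGetAttribute(&maxOptin,
                                   cudaDevAttrMaxSharedMemoryPerBlockOptin, 0),
            "cudaDeviceGetAttribute")) return 1;
    printf("Opt-in shared memory limit per block: %d bytes\n", maxOptin);
    if (shmemBytes > maxOptin) {
        fprintf(stderr, "Requested %d bytes exceeds the device limit\n", shmemBytes);
        return 1;
    }

    int* d_a = nullptr;
    if (!ok(cudaMalloc(&d_a, SIZE * sizeof(int)), "cudaMalloc")) return 1;

    // Opt in: without this call, a launch requesting > 48 KB fails.
    if (!ok(cudaFuncSetAttribute(kernel,
                                 cudaFuncAttributeMaxDynamicSharedMemorySize,
                                 shmemBytes),
            "cudaFuncSetAttribute")) {
        cudaFree(d_a);
        return 1;
    }

    // Third launch parameter = dynamic shared memory size in bytes.
    kernel<<<1, 1024, shmemBytes>>>(d_a);
    bool ran = ok(cudaGetLastError(), "kernel launch") &&
               ok(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    // Copy one element back to confirm the kernel wrote through shared memory.
    int h0 = 0;
    bool copied = ran && ok(cudaMemcpy(&h0, d_a, sizeof(int),
                                       cudaMemcpyDeviceToHost), "cudaMemcpy");
    cudaFree(d_a);
    return (copied && h0 == 1) ? 0 : 1;
}

int main() {
    return runDemo();
}
```

The build settings from the question (compute_86,sm_86) should not need to change for this; only the declaration, the opt-in call, and the launch configuration differ from the original sample.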
