
I've just started CUDA programming and it's going quite nicely; my GPUs are recognized and everything. I've partially set up Intellisense in Visual Studio using this extremely helpful guide here: http://www.ademiller.com/blogs/tech/2010/10/visual-studio-2010-adding-intellisense-support-for-cuda-c/

and here: http://www.ademiller.com/blogs/tech/2011/05/visual-studio-2010-and-cuda-easier-with-rc2/

However, Intellisense still doesn't pick up on kernel calls like this:

// KernelCall.cu
#include <iostream>
#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

__global__ void kernel(void){}

int main()
{
    kernel<<<1,1>>>();

    system("pause");
    return 0;
}

The line kernel<<<1,1>>>() is underlined in red, specifically the '<' just to the left of the first 1, with the error reading "Error: expected an expression". However, if I hover over the function, its return type and parameters are displayed properly. It still compiles just fine; I'm just wondering how to get rid of this little annoyance.

– sj755 (edited by Robert Crovella)

5 Answers


Wow, lots of dust on this thread. I came up with a macro fix (well, more like workaround...) for this that I thought I would share:

// nvcc does not seem to like variadic macros, so we have to define
// one for each kernel parameter list:
#ifdef __CUDACC__
#define KERNEL_ARGS2(grid, block) <<< grid, block >>>
#define KERNEL_ARGS3(grid, block, sh_mem) <<< grid, block, sh_mem >>>
#define KERNEL_ARGS4(grid, block, sh_mem, stream) <<< grid, block, sh_mem, stream >>>
#else
#define KERNEL_ARGS2(grid, block)
#define KERNEL_ARGS3(grid, block, sh_mem)
#define KERNEL_ARGS4(grid, block, sh_mem, stream)
#endif

// Now launch your kernel using the appropriate macro:
kernel KERNEL_ARGS2(dim3(nBlockCount), dim3(nThreadCount)) (param1); 

I prefer this method because for some reason I always lose the '<<<' in my code, but the macro gets some help via syntax coloring :).

– Randy (edited by Pedro77)
  • Thanks for your great solution. One small note: it should be "#endif", not "#end". Sorry for the tiny nitpick. – onion mk2 Dec 08 '17 at 22:06
  • It seems that `__INTELLISENSE__` is a safer choice than `__CUDACC__`; see https://stackoverflow.com/a/6182137/8037585 and https://devtalk.nvidia.com/default/topic/513485/cuda-programming-and-performance/__syncthreads-is-undefined-need-a-help/ – BugKiller Aug 22 '18 at 19:06
  • This is a helpful solution but extremely ugly. Hopefully Microsoft modifies Intellisense so that solutions like this one are no longer necessary – Michael Apr 06 '20 at 21:42

Visual Studio provides IntelliSense for C++; the trick from the rocket scientist's blog basically relies on the similarity CUDA-C has to C++, nothing more.

In the C++ language, the proper parsing of angle brackets is troublesome: you've got < as less-than and for templates, and << as a shift operator. Remember, not long ago we had to put a space between nested template declarations.
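A quick illustration of that nested-template quirk:

#include <vector>

// Before C++11, a closing ">>" was lexed as the right-shift operator,
// so nested template arguments needed a space between the closing brackets:
std::vector<std::vector<int> > oldStyle;   // space required before C++11
std::vector<std::vector<int>>  newStyle;   // only parses cleanly since C++11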

So when the guy at NVIDIA who came up with this syntax (apparently not a language expert) happened to choose the worst possible delimiter and then tripled it, well, you're going to have trouble. It's amazing that IntelliSense works at all when it sees this.

The only way I know to get full IntelliSense in CUDA is to switch from the Runtime API to the Driver API. The C++ is just C++, and the CUDA is still (sort of) C++; there is no <<<>>> badness for the language parsing to have to work around.
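For reference, here is a rough sketch of what a Driver API launch of the question's no-argument kernel might look like (the module and function names are placeholders, error checking is omitted, and you link against cuda.lib rather than cudart.lib):

#include <cuda.h>

// Launching through the Driver API involves no <<< >>> at all,
// so IntelliSense parses this as ordinary C++.
void launchViaDriverApi()
{
    CUdevice   dev;
    CUcontext  ctx;
    CUmodule   mod;
    CUfunction fun;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    // "kernel.ptx" / "kernel" are placeholder names for a kernel
    // compiled separately (e.g. with nvcc --ptx)
    cuModuleLoad(&mod, "kernel.ptx");
    cuModuleGetFunction(&fun, mod, "kernel");

    cuLaunchKernel(fun,
                   1, 1, 1,     // grid dimensions
                   1, 1, 1,     // block dimensions
                   0,           // dynamic shared memory (bytes)
                   nullptr,     // stream
                   nullptr,     // kernel parameters (none here)
                   nullptr);    // extra launch options

    cuCtxDestroy(ctx);
}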

– Brian Kretzler (edited by kfb)
  • I must say, you make more sense than anyone on the NVIDIA forums... So how do I go about doing what you just said? – sj755 May 20 '11 at 04:05
  • Take a look at the difference between matrixMul and matrixMulDrv. The <<<>>> syntax is handled by the compiler essentially just spitting out code that calls the Driver API calls. You'll link to cuda.lib not cudart.lib, and may have to deal with a "mixed mode" program if you use CUDA-RT only libraries. – Brian Kretzler May 20 '11 at 14:41
  • See my reply to your question on my blog. Spider is correct. It's unlikely that this will ever work correctly. – Ade Miller May 20 '11 at 14:53
  • Thanks for the help, I mostly wanted to know if this was expected behavior and this essentially puts my mind at ease. Thanks again. – sj755 May 20 '11 at 15:28
  • Update: 2021. Visual Studio 2019 does fairly well if you #include "cuda_runtime.h" and add the CUDA includes to your include path. On my machine it comes out to be C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\include – Richard Keene Feb 03 '21 at 19:15
  • Update 2021: Also #include does even more. – Richard Keene Feb 03 '21 at 21:07
  • @Richard-Keene what do you mean by "does fairly well" in 2021? There is still the red underline for the kernel call and the "expected an expression" error message in Visual Studio 2019 – ray_ray_ray Apr 25 '21 at 08:16
  • Yes, still the red underline problem, and <<< and >>> get turned into << < >> > by formatting. – Richard Keene Apr 26 '21 at 16:07

From VS 2015 and CUDA 7 onwards you can add these two includes before any others, provided your files have the .cu extension:

#include "cuda_runtime.h"
#include "device_launch_parameters.h"

No need for MACROS or anything. Afterwards everything will work perfectly.
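A minimal sketch of that layout (the file name here is arbitrary):

// example.cu -- the two includes come before everything else
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

__global__ void kernel() {}

int main()
{
    // per this answer, IntelliSense should now accept the launch syntax
    kernel<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}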

– KeyC0de

I LOVED Randy's solution. I'll match and raise using C preprocessor variadic macros:

#ifdef __INTELLISENSE__
#define CUDA_KERNEL(...)
#else
#define CUDA_KERNEL(...) <<< __VA_ARGS__ >>>
#endif

Usage examples:

my_kernel1 CUDA_KERNEL(NUM_BLOCKS, BLOCK_WIDTH)();
my_kernel2 CUDA_KERNEL(NUM_BLOCKS, BLOCK_WIDTH, SHMEM, STREAM)(param1, param2);
– Nir

I've been learning CUDA and have encountered that exact issue. As others have said, it's just an IntelliSense problem and can be ignored, but I've found a clean solution which actually removes it.

It seems that <<< >>> is interpreted as correct code if it's inside a template function.

I discovered it accidentally when I wanted to create wrappers for kernels so I could call them from regular .cpp code. It's both a nice abstraction and it removes the syntax error.

kernel header file (e.g. kernel.cuh):

const size_t THREADS_IN_BLOCK = 1024;

typedef double numeric_t;

// sample kernel function headers
__global__ void sumKernel(numeric_t* out, numeric_t* f, numeric_t* blockSum, size_t N);
__global__ void expKernel(numeric_t* out, numeric_t* in, size_t N);
// ..

// strong-typed wrapper for a kernel with 4 arguments
template <typename T1, typename T2, typename T3, typename T4>
void runKernel(void (*fun)(T1, T2, T3, T4), int Blocks, T1 arg1, T2 arg2, T3 arg3, T4 arg4) { 
    fun <<<Blocks, THREADS_IN_BLOCK >>> (arg1, arg2, arg3, arg4);
}

// strong-typed wrapper for a kernel with 3 arguments
template <typename T1, typename T2, typename T3>
void runKernel(void (*fun)(T1, T2, T3), int Blocks, T1 arg1, T2 arg2, T3 arg3) { 
    fun <<<Blocks, THREADS_IN_BLOCK >>> (arg1, arg2, arg3);
}

// ...

// the wrapper for a parameter-less kernel cannot be implemented here
// (there is no template to hide the <<< >>> from IntelliSense)
void runKernel(void (*fun)(), int Blocks);

in a .cu file (you will get a syntax error here, but do you ever need a parameter-less kernel function? if not, this and the respective header declaration can be deleted):

void runKernel(void (*fun)(), int Blocks) { 
    fun <<<Blocks, THREADS_IN_BLOCK >>> ();
}

usage in a .cpp file:

runKernel(kernelFunctionName, Blocks, arg1, arg2, arg3);
// for example runKernel(expKernel, B, output, input, size);
– Lutosław