
I am trying to compile and run the following program called test.cu:

#include <iostream>
#include <math.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
// Kernel function to add the elements of two arrays
__global__
void add(int n, float* x, float* y)
{
    int index = threadIdx.x;
    int stride = blockDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}

int main(void)
{
    int N = 1 << 20;
    float* x, * y;

    // Allocate Unified Memory – accessible from CPU or GPU
    cudaMallocManaged(&x, N * sizeof(float));
    cudaMallocManaged(&y, N * sizeof(float));

    // initialize x and y arrays on the host
    for (int i = 0; i < N; i++) {
        x[i] = 2.0f;
        y[i] = 1.0f;
    }

    // Run kernel on 1M elements on the GPU
    add <<<1, 256>>> (N, x, y);

    // Wait for GPU to finish before accessing on host
    cudaDeviceSynchronize();

    // Check for errors (all values should be 3.0f)
    for (int i = 0; i < 10; i++)
        std::cout << y[i] << std::endl;

    // Free memory
    cudaFree(x);
    cudaFree(y);

    return 0;
}

I am using Visual Studio Community 2019, and it marks the line "add <<<1, 256>>> (N, x, y);" with an "expected an expression" error. I compiled it anyway and it builds without errors, but the resulting .exe prints a bunch of "1" instead of the expected "3".

I also tried compiling with "nvcc test.cu". Initially it said "nvcc fatal : Cannot find compiler 'cl.exe' in PATH", so I added "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\Hostx64\x64" to PATH. Now compiling with nvcc gives the same result as compiling with Visual Studio.

In both cases the program never enters the "add" function.

I am pretty sure the code is right and the problem has something to do with the installation, but I have already tried reinstalling the CUDA Toolkit and repairing Visual Studio, and neither helped.

The kernel.cu example that appears when starting a new CUDA project in Visual Studio didn't work either. When run, it printed "No kernel image available for execution on the device".

How can I solve this?

nvcc version if that helps:

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:35_Pacific_Daylight_Time_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.relgpu_drvr445TC445_37.28845127_0
Bruno Jambeiro
  • There's nothing wrong with your code. If you are having trouble with it I suggest using [proper CUDA error checking](https://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api). Regarding the "expected an expression" error, that is coming from VS IntelliSense, and is not an actual error in the code. IntelliSense doesn't understand CUDA syntax. If you would like more information on this common topic, you could google "CUDA red underline" for lots of information. – Robert Crovella Aug 19 '20 at 15:00
  • https://godbolt.org/z/r6xKcG -- the code compiles. It seems you are conflating an intellisense error with a runtime problem (maybe a broken CUDA installation). – talonmies Aug 19 '20 at 15:02
  • It's unfortunate that VS doesn't switch off IntelliSense for a language it doesn't understand, but keeps screaming "I DON'T UNDERSTAND!" as loud as it can instead. Perhaps the behaviour was put there to make it seem more human. – molbdnilo Aug 19 '20 at 15:09
  • As others have said, this is not an error. VS has a system called IntelliSense that detects errors preemptively, so they are caught before compilation. IntelliSense supports C++ but not CUDA, so all CUDA code looks weird to it. VS marking something as wrong therefore does not mean it won't compile. – Ander Biguri Aug 19 '20 at 15:51
  • This is 100% an error. The program is not outputting what it was supposed to, and the program never enters the "add" function. – Bruno Jambeiro Aug 20 '20 at 00:40

1 Answer


Visual Studio provides IntelliSense for C++. In C++, parsing angle brackets is already troublesome: < is both less-than and a template delimiter, and << is a shift operator. The <<<>>> delimiter that NVIDIA chose for kernel launches is therefore about the worst possible one, and it makes it hard for IntelliSense to work properly. One way to get full IntelliSense in CUDA host code is to switch from the Runtime API to the Driver API. The host code is then just C++, and the CUDA code is still (sort of) C++, so there is no <<<>>> syntax for the parser to work around.

You could take a look at the difference between the matrixMul and matrixMulDrv samples. The compiler handles the <<<>>> syntax by essentially just emitting ordinary function calls that end up in the Driver API. With the Driver API you link against cuda.lib instead of cudart.lib, and you may have to deal with a "mixed mode" program if you use libraries that only support the CUDA Runtime. You could refer to this link for more information.

Also, this link tells how to add Intellisense for CUDA in VS.
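Separately from the IntelliSense issue, the error checking suggested in the comments on the question can be sketched roughly as follows. This is only a minimal sketch: the CUDA_CHECK macro is a hypothetical helper that is not part of the original program, and device 0 is assumed to be the GPU in use. It wraps every runtime call, checks the kernel launch itself with cudaGetLastError, and prints the device's compute capability so it can be compared with the architecture the binary was built for.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): print a readable
// message and exit when a CUDA runtime call returns an error.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error at %s:%d: %s\n",             \
                         __FILE__, __LINE__, cudaGetErrorString(err_));   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Same kernel as in the question.
__global__ void add(int n, float* x, float* y)
{
    int index = threadIdx.x;
    int stride = blockDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}

int main()
{
    // Report which GPU is used and its compute capability
    // (assumption: device 0 is the one the program runs on).
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDeviceProperties(&prop, 0));
    std::printf("Device 0: %s, compute capability %d.%d\n",
                prop.name, prop.major, prop.minor);

    int N = 1 << 20;
    float* x = nullptr;
    float* y = nullptr;
    CUDA_CHECK(cudaMallocManaged(&x, N * sizeof(float)));
    CUDA_CHECK(cudaMallocManaged(&y, N * sizeof(float)));

    for (int i = 0; i < N; i++) {
        x[i] = 2.0f;
        y[i] = 1.0f;
    }

    add<<<1, 256>>>(N, x, y);
    CUDA_CHECK(cudaGetLastError());       // reports errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    std::printf("y[0] = %f\n", y[0]);

    CUDA_CHECK(cudaFree(x));
    CUDA_CHECK(cudaFree(y));
    return 0;
}
```

If the launch check reports that no kernel image is available for execution on the device, one common cause is that the binary was built for a different GPU architecture than the one installed. In that case it may help to pass an architecture matching the printed compute capability to nvcc, for example "nvcc -arch=sm_50 test.cu" for a compute capability 5.0 device, provided the installed toolkit still supports that architecture.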

Barrnet Chou