What is the proper way to compile CUDA with g++?

Question

The code files are as follows:

a.h

void warperFoo();

a.cu

//---------- a.cu ----------
#include <cuda.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include "a.h"


__global__ void foo (void) {
  printf("calling from kernel foo: %d\n", threadIdx.x);
  // bar();
}

void warperFoo() {
    printf("calling from warperFoo\n");
    dim3 gdim(1,1,1);
    dim3 bdim(4,4,4);
    foo<<<gdim, bdim>>>();
}

main.cpp

#include <iostream>
#include <cuda_runtime_api.h>
#include "a.h"

using namespace std;


int main() {
    warperFoo();   
    return 0;
}

makefile

.PHONY: clean
all: a.o
    g++ -m64 -Wall a.o main.cpp -lcudart -L/usr/local/cuda-11.2/lib64/ -I/usr/local/cuda-11.2/include -lcudadevrt -lcuda

a.o:
    nvcc --gpu-architecture=sm_70 -ccbin /usr/bin/gcc -c a.cu
    
clean:
    rm -rf *.o a.out

make output

nvcc --gpu-architecture=sm_70 -ccbin /usr/bin/gcc -c a.cu
g++ -m64 -Wall a.o main.cpp -lcudart -L/usr/local/cuda-11.2/lib64/ -I/usr/local/cuda-11.2/include -lcudadevrt -lcuda

a.out output

calling from warperFoo

I want to compile the .cu file with nvcc first and then compile the C++ host code with g++.

The program is supposed to print "calling from kernel foo...", but that line never appears.

Why does the kernel produce no output?

1. Add `cudaDeviceSynchronize();` after the kernel call. 2. Please use [proper CUDA error checking](https://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api). If the `cudaDeviceSynchronize();` doesn't fix it, there could be any number of reasons that the kernel did not output. The error message will help. You might not have a GPU at all. If you have a GPU, it may not match your `sm_70` specification. Or your CUDA install might be broken (no driver, driver not properly installed, driver version/CUDA version mismatch, etc.) - Robert Crovella, Apr 09 '22 at 17:14
Thank you very much for your reply. The problem was the missing `cudaDeviceSynchronize()`. I also learned the proper way to write CUDA code, such as checking for CUDA errors, which I always forget. - ijpq, Apr 09 '22 at 17:35

ijpq · Accepted Answer · 2022-04-14T08:32:25.590

-3

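The problem was a missing `cudaDeviceSynchronize()` call. Kernel launches are asynchronous, so `warperFoo()` returned right after queuing the kernel. As explained in similar questions, the kernel's printf output never appeared because the host process exited before the kernel had executed.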
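
The fix described above can be sketched as a revised a.cu. This is a minimal sketch, not a definitive implementation: the `CUDA_CHECK` macro is a hypothetical helper introduced here for illustration (it is not in the original post), and the declaration of `warperFoo` is repeated inline only so the file stands alone; in the original project it comes from a.h.

```cuda
//---------- a.cu (sketch: synchronization plus error checks) ----------
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Same declaration as in a.h, repeated so this sketch is self-contained.
void warperFoo();

// Hypothetical helper (an assumption, not part of the original code):
// print a readable message and exit if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                             \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",             \
                    cudaGetErrorString(err_), __FILE__, __LINE__);   \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

__global__ void foo(void) {
    printf("calling from kernel foo: %d\n", threadIdx.x);
}

void warperFoo() {
    printf("calling from warperFoo\n");
    dim3 gdim(1, 1, 1);
    dim3 bdim(4, 4, 4);
    foo<<<gdim, bdim>>>();
    // Reports launch failures, e.g. an invalid configuration or no
    // device code matching the GPU's architecture.
    CUDA_CHECK(cudaGetLastError());
    // Blocks until the kernel finishes; device-side printf output is
    // flushed to the host at this synchronization point.
    CUDA_CHECK(cudaDeviceSynchronize());
}
```

With the synchronization in place, the host waits for the kernel before `main` returns, so each of the 64 threads in the 4x4x4 block should print its line. If the output is still missing, the message printed by `CUDA_CHECK` should point to the cause, such as a mismatch between the GPU and the `sm_70` target.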

edited Apr 14 '22 at 08:32

answered Apr 09 '22 at 17:37
