
I've ported a CUDA project from Linux to Windows (basically just added a few defines and typedefs in the header file). I'm using Visual Studio 2008 and the CUDA runtime API custom build rules from the SDK. The code is C, not C++ (and I'm compiling with /TC, not /TP).

I'm having scope issues that I didn't have on Linux. Global variables defined in my header file aren't shared between the .c files and the .cu files.

I've created a simplified project, and here is all of the code:

main.h:

#ifndef MAIN_H
#define MAIN_H

#include <stdio.h>
#include <cuda.h>
#include <cuda_runtime.h>

cudaEvent_t cudaEventStart;

#if defined __cplusplus
extern "C" void func(void);
#else
extern void func(void);
#endif

#endif

main.c:

#include "main.h"

int main(void)
{
    int iDevice = 0;

    cudaSetDevice(iDevice);
    cudaFree(0);
    cudaGetDevice(&iDevice);
    printf("device: %d\n", iDevice);

    cudaEventCreate(&cudaEventStart);
    printf("create event: %d\n", (int) cudaEventStart);

    func();

    cudaEventDestroy(cudaEventStart);
    printf("destroy event: %d\n", (int) cudaEventStart);

    return cudaThreadExit();
}

kernel.cu:

#include "main.h"

void func()
{
    printf("event in cu: %d\n", (int) cudaEventStart);
}

output:

device: 0
create event: 44199920
event in cu: 0
destroy event: 44199920

Any ideas about what I am doing wrong here? How do I need to change my setup so that it works in Visual Studio? Ideally, I'd like a setup that works on both platforms.

CUDA 3.2, GTX 480, 64-bit Win7, 263.06 general

jmilloy

2 Answers


What you are trying to do

  1. Would not work even without CUDA -- try renaming kernel.cu to kernel.c and recompiling. You will get a linker error because cudaEventStart will be multiply defined, once in each compilation unit (.c file) that includes the header. You would need to make the variable static, and initialize it in only one compilation unit.
  2. Compiles in CUDA because CUDA does not have a linker, and therefore code in compilation units compiled by nvcc (.cu files) cannot reference symbols in other compilation units. CUDA doesn't currently support static global variables. In the future CUDA will have a linker, but currently it does not.

What is happening is each compilation unit is getting its own, non-conflicting instance of cudaEventStart.

What you can do is get rid of the global variable (make it a local variable in main()), add cudaEvent_t parameters to the functions that need to use the event, and then pass the event variable around.

BTW, in your second post, you have circular #includes...

harrism
  • @harrism thanks for this. you are implying that this setup will not work no matter what, yet it works in my linux version. secondly, don't the include guards prevent the circular include? in fact, in my real project the header has include guards as well. – jmilloy May 17 '11 at 03:02
  • @harrism is it possible that the include guards are working differently in linux and visual studio? Perhaps in visual studio, nvcc can't see the defines from cl.exe and vice versa, resulting in two instances of cudaEventStart. Whereas in linux, the guard works across both compilers, preventing the double instances of all my global variables? – jmilloy May 17 '11 at 03:07
  • I am not implying, I am stating. :) You can't share global variables between CUDA compilation units and C++ compilation units in this way currently. If you rename kernel.cu to kernel.c as I suggested, you *will* get a linker error like: "ld: duplicate symbol _foo in ..." from g++. As for the circular include, it may work, but it is not a good programming practice. Also, note that include guards don't work across compilation units in any compiler, they only prevent the same header being included twice into the same compilation unit. Each .c or .cu file is a separate compilation unit. – harrism May 17 '11 at 03:22
  • @harrism haha now we're being nitpicky, but i like that. When you say it `would not work even without CUDA` then to me you are implying that it does not work _with_ CUDA. Which it, in fact, does. But only in linux. (Also, I'm not using g++ because this is in c). – jmilloy May 17 '11 at 03:28
  • @harrism yeah, what you added about not working across compilation units is what I was trying to say, too, so I get what you mean. I am wondering very strongly why it works fine in linux. – jmilloy May 17 '11 at 03:31
  • The fact is it is incorrect code as written -- you have a multiply defined symbol. – harrism May 17 '11 at 03:53
  • @harrism Can we get past the "code as written" and back to "what I'm trying to do"? (I'm sorry I forgot to type the include guard in the header). What I am trying to do is define a global variable in a header and use it in multiple source files. The task is complicated by the fact that the source files require different compilers. However, this is something that DOES work in linux. This is something I have done in a previous CUDA project that I do not have access to anymore. It's possible that it *should* never work and I got (un)lucky, but so far I'm not convinced of that. – jmilloy May 17 '11 at 04:32
  • @jmilloy To do what you want requires that the global variables be static (as I said in my answer), and initialized in a single compilation unit (.c file). If the variables are not static, you will get multiply defined symbol errors when you #include the header in multiple .c files -- I just tested on g++ on Linux and verified this claim. – harrism May 17 '11 at 04:45
  • @harrism okay I can try making them static tomorrow morning. It's interesting that you get multiply defined symbol errors and I don't. – jmilloy May 17 '11 at 05:02
  • You might want to look at [this answer](http://stackoverflow.com/questions/5370413/multiple-defined-symbols-c-error) which explains why non-static global variables in headers are bad. It may also give you the idea that you can declare the variables `extern`, but unfortunately as I explained that won't work for CUDA device code since it doesn't have a linker. It *might* work for host code in a .cu file. – harrism May 17 '11 at 06:18
  • @harrism so declaring the global variables as static doesn't solve the problem, and you're right that externs don't seem to help. it's driving me crazy that it works fine with gcc and nvcc in linux, though - any thoughts on that yet? – jmilloy May 17 '11 at 15:38
  • @harrism aha i got the sample code working with proper externing, thanks for that link – jmilloy May 17 '11 at 15:43

I modified my simplified example (with success) by including the .cu file in the header and removing the forward declarations of the .cu file function.

main.h:

#include <stdio.h>
#include <cuda.h>
#include <cuda_runtime.h>

#include "kernel.cu"

cudaEvent_t cudaEventStart;

main.c:

#include "main.h"

int main(void)
{
    int iDevice = 0;

    cudaSetDevice(iDevice);
    cudaFree(0);
    cudaGetDevice(&iDevice);
    printf("device: %d\n", iDevice);

    cudaEventCreate(&cudaEventStart);
    printf("create event: %d\n", (int) cudaEventStart);

    func();

    cudaEventDestroy(cudaEventStart);
    printf("destroy event: %d\n", (int) cudaEventStart);

    return cudaThreadExit();
}

kernel.cu:

#ifndef KERNEL_CU
#define KERNEL_CU

#include "main.h"

void func(void);

void func()
{
    printf("event in cu: %d\n", (int) cudaEventStart);
}

#endif

output:

device: 0
create event: 42784024
event in cu: 42784024
destroy event: 42784024

About to see if it works in my real project, and whether the solution is portable back to linux.

jmilloy
  • And survey says, nope... after many variations, the .cu file in my real project doesn't compile. `blockDim undefined identifier`, `__syncthreads undefined`, etc. – jmilloy May 16 '11 at 22:48