
This question comes from a CUDA context, where multiple compilers are in use, so bear with me while I lay out the situation (otherwise you might ask why this problem would come up, or suggest things that would be a waste of time). Just about every C++03 or C++11 rule applies, depending on the compile flags. I believe there are a few exceptions, but let's pretend there are none, since this contrived setup could be replicated without nvcc.

dvector.h:

//parsable by both NVIDIA's CUDA compiler (nvcc) and good old gcc and kin.
//gcc-compiled code is expected to link against and call subroutines compiled with CUDA for GPU acceleration
#include <cstddef> // size_t

template<typename T>
struct dvector{// a vector for manipulating GPU memory
    size_t length;
    T *data;
    void fill(const T &val);

};

#ifdef __CUDACC__ //only when being parsed by NVIDIA's compiler
template<typename T>
void dvector<T>::fill(const T &val){
   //do some nvcc-only stuff, gcc wouldn't understand
}
#endif

Suppose we are compiling two object files, and both include the header containing this declaration and its conditionally compiled definition.

  • srcfile.cpp is compiled with gcc
  • srcfile.cu is compiled with nvcc
  • srcfile.h usable with both nvcc and gcc

srcfile.h:

#include "dvector.h"
void call_work_kernel(dvector<int> &v);

srcfile.cu:

#include "srcfile.h"

template<typename T>
__global__ void work_kernel(dvector<T> v){//purposely by value... this is the way to pass it
    // global thread index: offset within the block plus the block's starting offset
    const size_t tid = threadIdx.x + blockDim.x * blockIdx.x;
    if(tid < v.length){
        v.data[tid] += 1;
    }
}

void call_work_kernel(dvector<int> &v){
     const size_t blocksize = 32;
     const size_t n_blocks = (v.length + blocksize - 1) / blocksize;
     dim3 blockDims(n_blocks);
     dim3 blockSize(blocksize);
     work_kernel<<<blockDims, blockSize>>>(v);

}

srcfile.cpp:

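#include "srcfile.h"

int main(){
    dvector<int> v; // allocation of v.data and setting of v.length omitted
    v.fill(42);     // the definition is only visible to nvcc
    call_work_kernel(v);
}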

So, playing by the standard, will the implicit instantiation of dvector<int> also instantiate the member function dvector<int>::fill? Does the answer change when a routine takes dvector<int>& rather than dvector<int> by value?
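Since the setup could be replicated without nvcc, here is a single-file sketch of the same question in plain C++. The names `dvector_sketch` and `bump_all`, and the deliberately bogus member call inside `fill`, are made up for illustration and are not part of the real code. The idea is that a compiler which instantiated the definition of `fill` merely because `dvector_sketch<int>` is used should reject this file, while one that only instantiates the member's declaration should accept it.

```cpp
#include <cstddef>

// Hypothetical stand-in for dvector from dvector.h.
template <typename T>
struct dvector_sketch {
    std::size_t length;
    T *data;
    void fill(const T &val);
};

// The body is deliberately invalid for T = int. The expression depends on T,
// so it can only be diagnosed if this definition is actually instantiated.
template <typename T>
void dvector_sketch<T>::fill(const T &val) {
    val.member_that_does_not_exist();
}

// Touches the data members (so dvector_sketch<int> must be a complete type)
// but never calls or otherwise refers to fill().
std::size_t bump_all(dvector_sketch<int> &v) {
    for (std::size_t i = 0; i < v.length; ++i) {
        v.data[i] += 1;
    }
    return v.length;
}
```

Adding a call such as `v.fill(42)` inside `bump_all` would force the definition to be instantiated and should turn the bogus call into a hard error, which is the contrast I am asking about.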

The answer here quotes what looks to be a draft standard, although I don't have a copy so I can't tell for sure.

Jason Newton
  • FYI [Where do I find the current C or C++ standard documents?](http://stackoverflow.com/q/81656) – Shafik Yaghmour Oct 08 '15 at 01:49
  • Obviously, the instantiation can't be done by `gcc`, because it never has the definition of `dvector::fill()` visible. To be clear, the implicit instantiation you're referring to is `dvector` as named in the prototype of `call_work_kernel`, which is seen by `nvcc` in a context in which the definition of `dvector::fill()` is visible. And you want to know if the standard says nvcc *must* instantiate this member, *may* instantiate it, or *must not*? – Phil Miller Oct 08 '15 at 02:07
  • @Novelocrat not as named in the prototype, but through trivial usage (accessing the data member, maybe calling other non-fill functions, had they existed) in the call_work_kernel definition. I want to know whether a standards-compliant compiler would be required to instantiate this member given such minimal usage. – Jason Newton Oct 08 '15 at 02:14
  • Could you please edit to show examples of the sort of accesses/calls in the definition of `call_work_kernel` you have in mind? – Phil Miller Oct 08 '15 at 02:24
  • @novelocrat OK, I've introduced more CUDA into it to give you a typical example. Basically, assume fill is not going to be explicitly instantiated or called in the .cu file. – Jason Newton Oct 08 '15 at 03:56
