
If the separate compilation units that are fed as input to nvlink contain CUDA kernels and device functions that call device functions marked as __forceinline__, will those functions be inlined? Assume they would be inlined if all the source code were put into a single file.

user1823664

1 Answer


If the separate compilation units that are fed as input to nvlink contain cuda kernels and device functions that invoke device functions marked as __forceinline__, will these functions be inlined?

To the best of my knowledge, the CUDA device code linker can't do this. __forceinline__ is a compiler-level directive: inlining happens during compilation, and once the code has been compiled there is no way to mark a function as inlineable in either PTX or SASS. If you try this, the CUDA device code compiler should emit a warning that an external inline function was used but not defined.

If you want functions to be compiled inline, you have to (unsurprisingly) use a compiler, not a linker.
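
In practice that means the body of the __forceinline__ function has to be visible to the compiler in every translation unit that calls it, which usually means defining it in a shared header rather than in a separate .cu file. The following is only a minimal sketch of that arrangement, not a definitive recipe. The names scale, scale_kernel and scale_on_device and the header name scale.cuh are hypothetical and chosen purely for illustration, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>

// In practice this definition would live in a header (hypothetical name:
// scale.cuh) included by every .cu file that calls it, so that each
// translation unit sees the function body at compile time.
__forceinline__ __device__ float scale(float x, float factor)
{
    return x * factor;
}

// The kernel is compiled in a translation unit where the body of scale()
// is visible, so the compiler (not the linker) is in a position to inline it.
__global__ void scale_kernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = scale(data[i], factor);
    }
}

// Hypothetical host helper: scales a single value on the device and
// returns the result. CUDA error checking is omitted for brevity.
float scale_on_device(float value, float factor)
{
    float *d_value = nullptr;
    cudaMalloc(&d_value, sizeof(float));
    cudaMemcpy(d_value, &value, sizeof(float), cudaMemcpyHostToDevice);
    scale_kernel<<<1, 1>>>(d_value, factor, 1);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&value, d_value, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_value);
    return value;
}
```

If scale() were instead only declared here and defined in another .cu file built with relocatable device code (nvcc -rdc=true), the call would have to be resolved by the device linker, which is the case described above where inlining is not available.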

talonmies