If the separate compilation units that are fed as input to nvlink contain CUDA kernels and device functions that invoke device functions marked as __forceinline__, will these functions be inlined? Assume they would be inlined if all the source code were put into a single file.

user1823664
1 Answer
If the separate compilation units that are fed as input to nvlink contain cuda kernels and device functions that invoke device functions marked as __forceinline__, will these functions be inlined?
To the best of my knowledge, the CUDA device code linker can't do this. The __forceinline__ directive is a compiler-level operation, and after compilation there is no way of marking code as inlineable in either PTX or SASS. If you try this, the CUDA device code compiler should emit a warning that an external inline function was used but not defined.
If you want functions to be compiled inline, you have to (unsurprisingly) use a compiler, not a linker.
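In practice that usually means making the full definition of the function visible to the compiler in every compilation unit that calls it, typically by putting it in a shared header. The following is only a minimal single-file sketch of that idea. The names scale, scale_kernel, run_scale, scale.cuh and kernel.cu are hypothetical and chosen for illustration; the comments mark which part would live in the header in a real multi-file build.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- would live in a shared header, e.g. scale.cuh (hypothetical name) ----
// The full definition, not just a declaration, has to be visible in each
// compilation unit; otherwise the compiler has nothing to inline.
__forceinline__ __device__ float scale(float x)
{
    return 2.0f * x;
}
// ---------------------------------------------------------------------------

// ---- would live in one of the .cu files, e.g. kernel.cu (hypothetical) ----
__global__ void scale_kernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = scale(in[i]);  // definition is visible here, so it can be inlined
    }
}

// Hypothetical host helper: runs the kernel on a single value and returns
// the result. Error checking is omitted to keep the sketch short.
float run_scale(float x)
{
    float* d_in = nullptr;
    float* d_out = nullptr;
    float result = 0.0f;

    cudaMalloc(&d_in, sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, &x, sizeof(float), cudaMemcpyHostToDevice);

    scale_kernel<<<1, 1>>>(d_in, d_out, 1);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}

int main()
{
    printf("%f\n", run_scale(3.0f));
    return 0;
}
```

With the definition in a header, each .cu file that includes it gets its own inlined copy at compile time, so nothing is left for nvlink to resolve for that function.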

talonmies