AMD HIP executable with undefined symbols during runtime

Question

I want to use the AMD HIP framework for my self-written GPU kernels. I do that through a third-party library, which takes care of compiling the code with HIP (and additional backends if desired). The technical setup looks as follows:

The kernel code is compiled into a static helper library with AMD HIP linking and toolchain enabled (CMake: set_target_properties(${target_name} PROPERTIES LINKER_LANGUAGE HIP))
This helper library is then linked into the core part of our own library, which is a shared library
This core part is then linked into the final shared library which is shipped at the end
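
To make the layout concrete, here is a minimal CMake sketch of the three layers. The target names (kernels, core, final) and source file names are hypothetical placeholders, not the names in my real project. Where exactly the two extra flags are attached is also an assumption made for illustration, and the sketch assumes the link step is driven by a clang-based HIP toolchain:

```cmake
cmake_minimum_required(VERSION 3.21)  # CMake 3.21 introduced HIP as a language
project(hip_layers LANGUAGES CXX HIP)

# 1) Static helper library that holds the GPU kernels (hypothetical names).
add_library(kernels STATIC kernels.hip)
set_target_properties(kernels PROPERTIES
    LINKER_LANGUAGE HIP
    POSITION_INDEPENDENT_CODE ON)  # the archive ends up inside a shared library
# Assumption: relocatable device code is enabled on the kernel target.
target_compile_options(kernels PRIVATE -fgpu-rdc)

# 2) Shared core library that absorbs the static helper library.
add_library(core SHARED core.cpp)
target_link_libraries(core PRIVATE kernels)
# Assumption: the flags suggested on GitHub are passed at this link step.
target_link_options(core PRIVATE -fgpu-rdc --hip-link)

# 3) Final shared library that is shipped.
add_library(final SHARED api.cpp)
target_link_libraries(final PRIVATE core)
```

With a layout like this, the device code compiled in step 1 has to be resolved at one of the later link steps, which is where I suspect the __hip_fatbin symbol should be defined.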

Therefore, the build involves three libraries that are linked together. The build completes without any compile-time or link-time errors. However, when I try to use the resulting library, I get the following error at runtime: undefined symbol: __hip_fatbin.

Because the code originally did not even link, I added these two flags in CMake, as suggested by others on GitHub, which made the build succeed: -fgpu-rdc --hip-link. However, the library still fails at runtime with the undefined symbol error. Inspecting the generated libraries with nm -gD shows U (undefined) in front of __hip_fatbin, which makes me wonder why. Shouldn't that symbol be defined when linking with the HIP toolchain?

So my question is whether anybody has run into the same issue when using AMD HIP across multiple libraries that are linked against each other. Might this be an issue with mixing gcc and HIP's clang? Or is there any way to get further details that would help me understand what to do next? Thank you!
