
I am having a problem running custom LLVM passes on the ".ll" file generated from CUDA code. For example, I have a CUDA sample named sample1.cu, which I compiled with the following command:

./bin/clang++ -flegacy-pass-manager -g -Xclang -disable-O0-optnone -load -Xclang ./lib/custom_pass.so --cuda-gpu-arch=sm_35 -L/usr/local/cuda-11.7/targets/x86_64-linux/lib/ -L/usr/local/cuda --no-cuda-version-check -lcudart_static -ldl -lrt -pthread -o /home/soumik/Documents/Executables/

However, when I dumped the .ll files, I got two LLVM IR files instead of one. One was simple1.ll and the other was simple1-cuda-nvptx64-nvidia-cuda-sm_75.ll. If I run my passes on simple1.ll only, they will miss the information stored in the other .ll file. How can I overcome this problem?

I am running my pass using the following command:

./bin/opt --mem2reg --enable-new-pm=0 -load lib/sample_pass.so -sample_pass  /home/soumik/Documents/sample_analysis/Test_Programs/simple1.ll

I have tried to link the two files into a single .ll file using the following command:

./bin/llvm-link simple1-cuda-nvptx64-nvidia-cuda-sm_35.ll simple1.ll -o simple1_link.ll

But it generates the following warning:

target datalayout, target triple mismatched
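
The warning suggests that each of the two files records its own target. A minimal sketch for checking this, assuming each .ll file contains the usual `target triple = "..."` line; the helper names triple_of and ir_kind are hypothetical and not part of LLVM:

```shell
# Print the target triple recorded in an LLVM IR (.ll) file.
# Assumes a line of the form: target triple = "..."
triple_of() {
    sed -n 's/^target triple = "\(.*\)"$/\1/p' "$1" | head -n 1
}

# Classify a file as device (NVPTX) or host IR based on its triple.
# Anything that is not an nvptx triple is treated as host code.
ir_kind() {
    case "$(triple_of "$1")" in
        nvptx*) echo device ;;
        *)      echo host ;;
    esac
}
```

Calling ir_kind on each of the two files should show whether one is device IR and the other host IR, in which case they describe code for different architectures.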

Moreover, whenever I try to run the custom pass on the simple1_link.ll file, it fails with the following runtime error:

'sm_75' is not a recognized processor for this target (ignoring processor) LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!

Can anyone please help me understand where I am going wrong?

Tauro
Written by someone who never uses Clang for the GPU: You are probably getting two .ll files because the compiler splits host code and device code. At that point you have two different .ll files for two different architectures, so trying to link them or optimize both of them together doesn't make much sense. Presumably you want to run a custom pass on the GPU code. If that is the case, just run the pass on the GPU code. – talonmies Jul 23 '23 at 04:59
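
A rough sketch of what that suggestion could look like in practice: run the pass on each .ll file separately instead of on a linked module. The opt flags are copied from the command in the question; the OPT and PASS_LIB variables and the function names run_pass and run_pass_on_each are hypothetical, introduced only for illustration.

```shell
# Assumed defaults, taken from the question; override via the environment.
OPT="${OPT:-./bin/opt}"
PASS_LIB="${PASS_LIB:-lib/sample_pass.so}"

# Run the legacy-pass-manager pass on a single IR file.
run_pass() {
    "$OPT" --mem2reg --enable-new-pm=0 -load "$PASS_LIB" -sample_pass "$1"
}

# Process every file given on the command line as its own module,
# so host and device IR are never merged. Stops at the first failure.
run_pass_on_each() {
    for ll in "$@"; do
        echo "== $ll"
        run_pass "$ll" || return 1
    done
}
```

With this, something like run_pass_on_each simple1.ll simple1-cuda-nvptx64-nvidia-cuda-sm_75.ll would invoke opt once per file. If only the kernels matter, passing just the device file may be enough.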

0 Answers