My goal is to generate call graphs using CMake + Clang + GraphViz at build time.
Using these processes [1, 2] I can create simple graphs, but I'm not sure how to generalise the process to a CMake project.
I have an executable target.
add_executable(${TARGET} ${SOURCES})
From within a macro, I add the graph-relevant options to the target:
target_compile_options(${TARGET} PRIVATE -S -emit-llvm)
And I add an additional post-build command which generates the call graphs:
add_custom_command(
    TARGET ${TARGET}
    POST_BUILD
    COMMENT "Running clang OPT"
    COMMAND opt -analyze -dot-callgraph
)
But clang still attempts to create an executable for the target, which results in this error:
[build] lld-link: error: Container.test.cpp.obj: unknown file type
I also don't understand how any custom command (opt, for example) would access the produced LLVM representation. It doesn't look like my custom command has any knowledge of the relevant files (even if the above error was fixed).
What I understand so far:
1. CMake's add_executable adds the -o outfile.exe argument to clang, which prevents me from following the same steps shown in the linked processes [1, 2].
2. $<TARGET_FILE:${TARGET}> can be used to find the file clang produces, but I don't know whether this works for the LLVM representation.
3. I've tried using a custom target instead, but had issues getting all the TARGET sources, with all their settings, into the custom target.
4. The process outlined here [3] might be relevant, especially -Wl,-save-temps, but this seems to be a pretty roundabout way to get IR (using llvm-dis).
5. The unknown file type error is due to the object actually being LLVM representation; I suspect the linker expects a different format.
6. To get the linker to understand LLVM representation, add -flto to the linker options: target_link_options(${TARGET} PRIVATE -flto) (source [4]). This is awesome, because it means I've almost solved this... I just don't know how to get the paths to the produced bitcode output files in CMake. Once I do, I can pass them to opt (I hope...).
7. To get the target's objects, the generator expression $<TARGET_OBJECTS:${TARGET}> can be used. In the case of CMake this lists the .o LLVM bitcode files. (Is the .o because of a rename by CMake?)
8. The .o file in this case is bitcode; however, the opt tool appears to only accept the textual LLVM representation. To convert to it: llvm-dis bitcode.bc -o llvm_asm.ll. Due to cross compilation, I believe the mangled symbols are in a strange format. Passing them into llvm-cxxfilt does not succeed, for example: llvm-cxxfilt --no-strip-underscore --types ?streamReconstructedExpression@?$BinaryExpr@AEBV?$reverse_iterator@PEBD@std@@AEBV12@@Catch@@EEBAXAEAV?$basic_ostream@DU?$char_traits@D@std@@@std@@@Z
9. So, addressing 8: this is the MSVC name mangling format. This indicates that when compiling on Windows, clang uses MSVC-format name mangling. A surprise to me... (source [5]).
10. LLVM ships with llvm-undname, which is able to demangle the symbols. When I give this tool raw input it produces a lot of errors; it seems to only work with correct symbols. The tool demumble appears to be a cross-platform, multi-format wrapper of llvm-undname and llvm-cxxfilt.
11. My almost working CMake macro is as follows:
macro (add_clang_callgraph TARGET)
    if(CALLGRAPH)
        target_compile_options(${TARGET} PRIVATE -emit-llvm)
        target_link_options(${TARGET} PRIVATE -flto)
        foreach (FILE $<TARGET_OBJECTS:${TARGET}>)
            add_custom_command(
                TARGET ${TARGET}
                POST_BUILD
                COMMAND llvm-dis ${FILE}
                COMMAND opt -dot-callgraph ${FILE}.ll
                COMMAND demumble ${FILE}.ll.callgraph.dot > ${FILE}.dot
            )
        endforeach()
    endif()
endmacro()
However, this doesn't work... The contents of ${FILE} is always the entire list...
This is still the case here:
foreach (FILE IN LISTS $<TARGET_OBJECTS:${TARGET}>)
    add_custom_command(
        TARGET ${TARGET}
        POST_BUILD
        COMMAND echo ${FILE}
    )
endforeach()
The result looks like:
thinga.obj;thingb.obj
This is because CMake doesn't evaluate the generator expression until AFTER the foreach loop has been evaluated. So the loop runs only once, and its single item is the unresolved generator expression rather than the resolved list (source [6]). This means I cannot loop through the object files and create a custom command for each one.
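Since the generator expression only resolves at build time, one possible way around the single-iteration loop is to move the loop to build time as well: hand the whole object list to a small CMake script run with cmake -P and iterate there. What follows is an untested sketch, not a confirmed fix. The script name callgraph.cmake and the variable name OBJECTS are hypothetical, llvm-dis and opt are assumed to be on PATH, and the opt invocation simply mirrors the one in the macro above. Whether the semicolon-separated list survives shell quoting intact may depend on the generator, so treat that part as an assumption too. The demangling step is left out.

```cmake
# --- In CMakeLists.txt, replacing the foreach in the macro ---
# The quoted argument keeps the resolved object list as ONE argument,
# which the script receives as an ordinary CMake list.
add_custom_command(
    TARGET ${TARGET}
    POST_BUILD
    COMMAND ${CMAKE_COMMAND}
            "-DOBJECTS=$<TARGET_OBJECTS:${TARGET}>"
            -P ${CMAKE_CURRENT_SOURCE_DIR}/callgraph.cmake
    VERBATIM
)

# --- In callgraph.cmake (hypothetical helper script) ---
# Runs at build time, so OBJECTS already holds real file paths.
foreach(obj IN LISTS OBJECTS)
    # Disassemble the bitcode object to textual IR next to it.
    execute_process(COMMAND llvm-dis ${obj} -o ${obj}.ll)
    # Same opt call as in the macro; newer LLVM releases may need a
    # different spelling for this pass.
    execute_process(COMMAND opt -dot-callgraph ${obj}.ll)
endforeach()
```

The -D definition has to come before -P on the command line, otherwise the script will not see OBJECTS.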
I'll add to the above as I find things out. If I figure out the whole process, I'll post a solution.
Any help would be greatly appreciated, this has been a great pain in the arse.
What I'm hoping for is a way to make CMake accept building an executable to a single LLVM representation file, using that file with opt to get the call graph, and then finishing the compilation with llc.
I'm a little constrained though, as I'm cross compiling. Ultimately anything equivalent will do...
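One possible shape for the "single LLVM file" idea, offered as an untested sketch rather than a known-good answer: llvm-link can merge several bitcode files into one module, and COMMAND_EXPAND_LISTS lets the $<TARGET_OBJECTS:...> list expand into separate arguments at build time, so no foreach is needed. This assumes the objects really are bitcode (as with -flto above), that llvm-link and opt are on PATH, and that the installed opt still accepts the -dot-callgraph spelling used earlier (newer LLVM releases may need a different one). The output name ${TARGET}.bc is hypothetical, and the final llc step is not covered.

```cmake
# Merge every bitcode object of the target into one module, then run
# the call-graph printer on the merged module.
add_custom_command(
    TARGET ${TARGET}
    POST_BUILD
    COMMENT "Generating call graph for ${TARGET}"
    COMMAND llvm-link $<TARGET_OBJECTS:${TARGET}>
            -o $<TARGET_FILE_DIR:${TARGET}>/${TARGET}.bc
    # -disable-output stops opt from trying to write bitcode to stdout.
    COMMAND opt -dot-callgraph -disable-output
            $<TARGET_FILE_DIR:${TARGET}>/${TARGET}.bc
    COMMAND_EXPAND_LISTS
    VERBATIM
)
```

Because the executable is still linked normally, this leaves the regular build untouched and only adds the graph as a side product.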