Question
(Can I get clang or perhaps some other optimizing tool shipped with LLVM to identify unused virtual functions in a C++ program, to mark them for dead code elimination? I guess not.)
If there is no such functionality shipped with LLVM, how would one go about implementing a thing like this? What's the most appropriate layer to achieve this, and where can I find examples on which I could build this?
Thoughts
My first thought was an optimizer working on LLVM bitcode or IR. After all, a lot of optimizers are written for that representation. Simple dead code elimination is easy enough: any function which is neither called nor has its address taken and stored somewhere is dead code and can be omitted from the final binary. But a virtual function has its address taken and stored in the virtual function table of the corresponding class. In order to identify whether that function has a chance of getting called, an optimizer would not only have to identify all virtual function calls, but also identify the type hierarchy to map these virtual function calls to all possible implementations.
This makes things look quite hard to tackle at the bitcode level. It might be better to handle this somewhere closer to the front end, at a stage where more type information is available, and where calls to a virtual function might be more readily associated with implementations of these functions. Perhaps the VirtualCallChecker could serve as a starting point.
One problem is probably the fact that while it's possible to combine the bitcode of several objects into a single unit for link time optimization, one hardly ever compiles all the source code of a moderately sized project as a single translation unit. So the association between virtual function calls and implementations might have to be somehow maintained till that stage. I don't know if any kind of custom annotation is possible with LLVM; I have seen no indication of this in the language specification.
But I'm having a bit of a trouble with the language specification in any case. The only reference to virtual
in there are the virtuality
and virtualIndex
properties of MDSubprogram, but so far I have found no information at all about their semantics. No documentation, nor any useful places inside the LLVM source code. I might be looking at the wrong documentation for my use case.
Cross references
eliminate unused virtual functions asked about pretty much the same thing in the context of GCC, but I'm specifically looking for a LLVM solution here. There used to be a -fvtable-gc
switch to GCC, but apparently it was too buggy and got punted, and clang doesn't support it either.
Example:
struct foo {
virtual ~foo() { }
virtual int a() { return 12345001; }
virtual int b() { return 12345002; }
};
struct bar : public foo {
virtual ~bar() { }
virtual int a() { return 12345003; }
virtual int b() { return 12345004; }
};
int main(int argc, char** argv) {
foo* p = (argc & 1 ? new foo() : new bar());
int res = p->a();
delete p;
return res;
};
How can I write a tool to automatically get rid of foo::b()
and bar::b()
in the generated code?
clang++ -fuse-ld=gold -O3 -flto
with clang 3.5.1 wasn't enough, as an objdump -d -C
of the resulting executable showed.
Question focus changed
Originally I had been asking not only about how to use clang or LLVM to this effect, but possibly for third party tools to achieve the same if clang and LLVM were not up to the task. Questions asking for tools are frowned upon here, though, so by now the focus has shifted from finding a tool to writing one. I guess chances for finding one are slim in any case, since a web search revealed no hints in that direction.