Question

(Can I get clang or perhaps some other optimizing tool shipped with LLVM to identify unused virtual functions in a C++ program, to mark them for dead code elimination? I guess not.)

If there is no such functionality shipped with LLVM, how would one go about implementing a thing like this? What's the most appropriate layer to achieve this, and where can I find examples on which I could build this?

Thoughts

My first thought was an optimizer working on LLVM bitcode or IR. After all, a lot of optimizers are written for that representation. Simple dead code elimination is easy enough: any function which is neither called nor has its address taken and stored somewhere is dead code and can be omitted from the final binary. But a virtual function has its address taken and stored in the virtual function table of the corresponding class. In order to identify whether that function has a chance of getting called, an optimizer would not only have to identify all virtual function calls, but also identify the type hierarchy to map these virtual function calls to all possible implementations.
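To make the analysis described above concrete, here is a minimal sketch on a toy model. It assumes the whole program is visible and no code is loaded at run time. The `Program` structure and the functions `derivesFrom` and `findDeadVirtuals` are hypothetical names chosen for illustration; none of this is an LLVM API, and the sketch ignores complications such as virtual inheritance and implicit destructor calls.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy whole-program model; every name here is hypothetical, not an LLVM API.
using Slot = std::pair<std::string, std::string>;  // (class, method signature)

struct Program {
  std::map<std::string, std::vector<std::string>> bases;  // class -> direct bases
  std::set<Slot> impls;  // virtual functions that have a body in some vtable
  std::set<Slot> calls;  // virtual call sites: (static type, method)
};

// True if `cls` is `base` or derives from it, directly or indirectly.
bool derivesFrom(const Program& p, const std::string& cls,
                 const std::string& base) {
  if (cls == base) return true;
  auto it = p.bases.find(cls);
  if (it == p.bases.end()) return false;
  for (const auto& b : it->second)
    if (derivesFrom(p, b, base)) return true;
  return false;
}

// Conservative rule: C::m stays live if some call site invokes m on a static
// type that is a base of C (an override may be selected) or derived from C
// (an inherited body may be selected). Everything else is reported as dead.
std::set<Slot> findDeadVirtuals(const Program& p) {
  std::set<Slot> dead;
  for (const auto& impl : p.impls) {
    bool live = false;
    for (const auto& call : p.calls) {
      if (call.second != impl.second) continue;
      if (derivesFrom(p, impl.first, call.first) ||
          derivesFrom(p, call.first, impl.first)) {
        live = true;
        break;
      }
    }
    if (!live) dead.insert(impl);
  }
  return dead;
}
```

Fed with a hierarchy where `bar` derives from `foo`, both implement `a` and `b`, and the only call site is `a` on a `foo`, this reports `foo::b` and `bar::b` as removable. The hard part in practice is recovering `bases` and `calls` from bitcode, which is exactly the information that seems to be missing at that level.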

This makes things look quite hard to tackle at the bitcode level. It might be better to handle this somewhere closer to the front end, at a stage where more type information is available, and where calls to a virtual function might be more readily associated with implementations of these functions. Perhaps the VirtualCallChecker could serve as a starting point.

One problem is probably the fact that while it's possible to combine the bitcode of several objects into a single unit for link time optimization, one hardly ever compiles all the source code of a moderately sized project as a single translation unit. So the association between virtual function calls and implementations might have to be somehow maintained till that stage. I don't know if any kind of custom annotation is possible with LLVM; I have seen no indication of this in the language specification.
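One way to picture carrying that association across translation units is a small summary emitted per unit and merged at link time. The following is only a sketch under that assumption: `UnitSummary`, `merge` and `uncalled` are hypothetical names, and matching calls to implementations by method name alone is a deliberate over-approximation rather than a real dispatch analysis.

```cpp
#include <set>
#include <string>
#include <utility>

using Slot = std::pair<std::string, std::string>;  // (class, method)

// Hypothetical summary a front end could emit alongside each unit's bitcode.
struct UnitSummary {
  std::set<Slot> defined;  // virtual functions this unit implements
  std::set<Slot> called;   // virtual calls this unit makes, by static type
};

// Link-time step: take the union of two unit summaries.
UnitSummary merge(const UnitSummary& a, const UnitSummary& b) {
  UnitSummary out = a;
  out.defined.insert(b.defined.begin(), b.defined.end());
  out.called.insert(b.called.begin(), b.called.end());
  return out;
}

// Implementations whose method name is not called anywhere in the summary.
// Matching by name only keeps too much alive, but never removes a callee.
std::set<Slot> uncalled(const UnitSummary& s) {
  std::set<std::string> calledNames;
  for (const auto& c : s.called) calledNames.insert(c.second);
  std::set<Slot> out;
  for (const auto& d : s.defined)
    if (calledNames.count(d.second) == 0) out.insert(d);
  return out;
}
```

Running `uncalled` on a single unit that only defines the classes would flag every virtual function, including ones called from another unit. Only the merged summary gives a usable answer, which is why the decision has to be deferred to the link stage.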

But I'm having a bit of trouble with the language specification in any case. The only references to virtual in there are the virtuality and virtualIndex properties of MDSubprogram, but so far I have found no information at all about their semantics: no documentation, and no useful places inside the LLVM source code. I might be looking at the wrong documentation for my use case.

Cross references

eliminate unused virtual functions asked about pretty much the same thing in the context of GCC, but I'm specifically looking for an LLVM solution here. There used to be a -fvtable-gc switch to GCC, but apparently it was too buggy and got punted, and clang doesn't support it either.

Example:

struct foo {
  virtual ~foo() { }
  virtual int a() { return 12345001; }
  virtual int b() { return 12345002; }
};

struct bar : public foo {
  virtual ~bar() { }
  virtual int a() { return 12345003; }
  virtual int b() { return 12345004; }
};

int main(int argc, char** argv) {
  foo* p = (argc & 1 ? new foo() : new bar());
  int res = p->a();
  delete p;
  return res;
}

How can I write a tool to automatically get rid of foo::b() and bar::b() in the generated code? clang++ -fuse-ld=gold -O3 -flto with clang 3.5.1 wasn't enough, as an objdump -d -C of the resulting executable showed.

Question focus changed

Originally I had been asking not only about how to use clang or LLVM to this effect, but possibly for third-party tools to achieve the same if clang and LLVM were not up to the task. Questions asking for tools are frowned upon here, though, so by now the focus has shifted from finding a tool to writing one. I guess chances of finding one are slim in any case, since a web search revealed no hints in that direction.

MvG
  • You first need a way to tell the compiler you never load additional code at runtime (`dlopen`/`dlsym` or `LoadLibrary`/`GetProcAddress`) – Ben Voigt Apr 09 '15 at 20:38
  • @BenVoigt: Well, such a tool would obviously make such an assumption. `-fwhole-program` is something similar, and if I understand it correctly, `-flto` will break in a `dlopen` setup as well. I don't expect the compiler to make such an optimization automatically, but I'd like some tool, automatic or otherwise, which can make such an optimization when explicitly requested. – MvG Apr 09 '15 at 20:56
  • _"Since this question has already received 3 votes to close as off topic, please keep in mind that “software tools commonly used by programmers” are explicitly and officially on topic. So if you vote to close, I'd welcome an explanation in a comment."_ Yeah, questions about tools. Not questions asking for tools. I don't need to give you an explanation because the help centre is quite clear. – Lightness Races in Orbit Apr 09 '15 at 21:06
  • @LightningRacisinObrit: thanks, that does help. I somehow hope that someone answers by “well, simply add this switch to clang” or “simply call that optimizer from the llvm suite” in which case this feels like “how can I get the llvm suite to do what I want”. But I'm not sure if that hope is justified, which is why the question doesn't sound like a question about how to use standard llvm tools but instead asking for the existence of perhaps more exotic tools. I can understand your point, though. And I wonder whether I should rephrase things along the lines of working with llvm tools. – MvG Apr 09 '15 at 21:13
  • @MvG: Yeah that would be better :) – Lightness Races in Orbit Apr 09 '15 at 22:05
  • @MvG did you ever make any progress on this? – Stuart K Nov 28 '17 at 17:58
  • @StuartK: No progress; without good pointers I did not follow up on this. – MvG Nov 29 '17 at 12:26

0 Answers