30
#include <cstdio>
#include <cstdlib>
struct Interface {
    virtual void f() = 0;
};

struct Impl1: Interface {
    void f() override {
        std::puts("foo");
    }
};

// or __attribute__ ((visibility ("hidden")))/anonymous namespace
static Interface* const ptr = new Impl1 ;

int main() {
    ptr->f();
}

When compiled with g++-7 -O3 -flto -fdevirtualize-at-ltrans -fipa-pta -fuse-linker-plugin, the above ptr->f() call cannot be devirtualized.

It seems that no external library can modify ptr. Is this a deficiency of the GCC optimizer, or does something else make devirtualization unavailable in this case?

Godbolt link

UPDATE: It seems that clang-7 with -flto -O3 -fwhole-program-vtables -fvisibility=hidden is the only compiler+flags combination (as of 2018/03) that can devirtualize this program.

lz96
  • 4
    What is the *actual* problem you want to solve? Why are you doing this? Perhaps there's better or other solutions to the actual problem? And please take some time to [read about the XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). – Some programmer dude Feb 21 '18 at 12:50
  • 1
    What happens if you put `ptr` in an anonymous namespace? – Bathsheba Feb 21 '18 at 12:52
  • @Bathsheba Tried on Godbolt. Still virtual call. – lz96 Feb 21 '18 at 12:54
  • 1
    Weird. Would have thought gcc would have applied -fdevirtualize-at-ltrans in that case: it being simple standard C++. Nice question. – Bathsheba Feb 21 '18 at 12:55
  • @lz96: Permission to change your code so it's standard C++? – Bathsheba Feb 21 '18 at 13:00
  • Same behavior with `Interface* const ptr = new Impl1;`. – aschepler Feb 21 '18 at 13:13
  • 1
    Using an anonymous namespace like this seems to enable devirtualization: https://godbolt.org/g/exzQrC – Ville Krumlinde Feb 21 '18 at 14:30
  • @VilleKrumlinde For the non-`lto` case, [these slides](http://llvm.org/devmtg/2016-11/Slides/Padlewski-DevirtualizationInLLVM.pdf) explain such behavior well: the Itanium ABI, by default, exposes the vtable as PUBLIC, so for a public class its vtable may be modified during linking. However, this still does not explain the cases where `-fvisibility=hidden -flto` is specified – lz96 Feb 21 '18 at 16:06

2 Answers

11

If you move the ptr into the main function, the result is very telling and offers a strong hint as to why gcc doesn't want to de-virtualize the pointer.

The disassembly for this shows that if the 'has the static been initialized flag' is false, it initializes the static and then jumps right back to the virtual function call, even though nothing could possibly have happened to it in between.

This tells me that gcc is hard-wired to believe that any kind of globally persistent pointer must always be treated as a pointer to an unknown type.

In fact, it's even worse than this. If you add a local variable, it matters whether the call to f through the static pointer occurs between the creation of the local variable and the call to f on that variable. The assembly showing the interposed case is here: Another godbolt link; and it is simple to rearrange it yourself on the site to see how the assembly turns into an inlined f once the other call isn't interposed.

So, gcc must assume that the actual type a pointer refers to may change whenever control flow leaves the function for any reason. Whether or not it's declared const is irrelevant. Nor is it relevant whether its address is ever taken, or any number of other things.

clang does the same thing. This seems overly cautious to me, but I'm not a compiler writer.

Omnifarious
  • 2
    This should be a comment, not an answer. An answer should address the following questions in sufficient detail: Is the optimization legal? How hard would it be to implement the optimization in a compiler? Why does GCC not perform the optimization? Is it worth being implemented? – Hadi Brais Feb 28 '18 at 21:11
  • 4
    @HadiBrais - It's too big to be a comment. And the only people who could give you a real answer. I would suggest filing a bug report for missed optimization opportunity. I've filed those myself and they're taken seriously. – Omnifarious Mar 01 '18 at 06:04
  • 1
    Thanks for taking the time to file a bug report. I agree this is the only way to get an answer. You can update this answer based on that. – Hadi Brais Mar 01 '18 at 15:58
4

I just ran another experiment, as in Omnifarious's answer, but this time I made the pointer point to a static object:

Impl1 x;
static Interface* const ptr = &x ;

GCC does devirtualize the function call; -O2 is sufficient. I don't see any rule in the C++ standard that would make a pointer to static storage be treated differently from a pointer to dynamic storage.

It is allowed to change the object at the address pointed to by ptr, so the compiler must track what happens at that address to know the actual dynamic type of the object. My opinion is that the optimizer's implementers may have considered tracking what happens on the heap too difficult in real programs, so the compiler simply doesn't do it.
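A minimal sketch of what "changing the object at that address" can look like, using placement new. `Impl2` and `replace_and_call` are hypothetical names introduced for illustration, and `f()` returns a string so the effect is observable. This only illustrates the language rule; whether it is the actual reason GCC declines to devirtualize remains a guess.

```cpp
#include <new>
#include <string>

struct Interface {
    virtual const char* f() = 0;
    virtual ~Interface() = default;  // needed to destroy through Interface*
};
struct Impl1 : Interface { const char* f() override { return "foo"; } };
struct Impl2 : Interface { const char* f() override { return "bar"; } };  // hypothetical second type

static_assert(sizeof(Impl2) <= sizeof(Impl1), "Impl2 must fit in the storage");
static_assert(alignof(Impl2) <= alignof(Impl1), "Impl2 must not need stricter alignment");

std::string replace_and_call() {
    void* storage = ::operator new(sizeof(Impl1));  // raw heap storage
    Interface* first = new (storage) Impl1;         // an Impl1 lives here first
    std::string result = first->f();

    first->~Interface();                            // end the Impl1's lifetime
    Interface* second = new (storage) Impl2;        // same address, different dynamic type
    result += second->f();

    second->~Interface();
    ::operator delete(storage);
    return result;  // text from the Impl1 call followed by the Impl2 call
}
```

The sketch calls `f()` through the pointer returned by the second placement new rather than reusing `first`, because reusing a pointer to a replaced object of a different type is restricted by [basic.life].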

Oliv
  • This should be a comment, not an answer. An answer should address the following questions in sufficient detail: Is the optimization legal? How hard would it be to implement the optimization in a compiler? Why does GCC not perform the optimization? Is it worth being implemented? – Hadi Brais Feb 28 '18 at 21:12
  • 3
    @HadiBrais Honestly, I hesitated over posting it as a comment on Omnifarious's answer, since that answer is invalidated by this experiment. On the other hand, it would not have been an appropriate comment. I have gotten into the habit of posting a partial answer when I can demonstrate that a previous answer is lacking something, even though I cannot yet provide a perfect one. – Oliv Feb 28 '18 at 21:16
  • 1
    In general, you can post it as a comment under the question. If it's good enough, other people will tell you to promote it to an answer. – Hadi Brais Feb 28 '18 at 21:19
  • 3
    @HadiBrais I find comments unreadable. I would prefer something like the discussion pages of wikis, so that solutions could be found through a collaborative effort. (This is a message in a bottle.) – Oliv Feb 28 '18 at 21:28
  • 1
    There are two techniques here to reach a good/better answer collaboratively: 1- by discussing the question in a comments section 2- by suggesting edits to an existing answer. If it was a very interesting question, it can be posted on Reddit or Hacker News to motivate a lengthy discussion and involve a wider audience. – Hadi Brais Feb 28 '18 at 21:36
  • @HadiBrais When you say "there are two techniques", do you mean "there are only two"? What does "techniques" mean? ... What does "answer" mean? What is the border between a complete and an incomplete answer made of? Could I touch it? – Oliv Feb 28 '18 at 22:23
  • Partial or wrong answers are still answers. That's fine. This particular answer (and Omnifarious' answer for that matter) is neither a partial answer nor a wrong one, it's really a comment that might help others to reach an answer. Consider your answer. The first part shows a case where devirtualization occurs. This is just a general observation. The second part states a vague guess that critically lacks technical details to back it up, which makes it almost meaningless. If your answer was wrong, I would've downvoted it. But it's not an answer so I just flagged it as an NAA. – Hadi Brais Feb 28 '18 at 22:49
  • @HadiBrais The second part answers the question "Why could the optimizer have difficulty performing the optimization?" Answer: because it is allowed to replace the object at x with another one; check [basic.life] if you want to know how. You may not have noticed this because you did not know about it. The opinion part starts where I say "in my opinion". That part is the result of abductive reasoning. The answer to the question in my previous comment is: the border is in your head. Predicates all apply to your own representation of the "world". Read Charles Peirce. Good luck. – Oliv Mar 01 '18 at 07:27
  • The difference between your code and the OP's is that in your code, everything can be done at compile time and the `static` can actually be put into constant storage. The OP's code requires that code be run when the program starts up. – Omnifarious Mar 01 '18 at 15:06
  • @Omnifarious I have doubts about both of our answers; please look at [this assembly](https://godbolt.org/g/UJbp73). Why is `ptr0->f()` devirtualized and not `ptr3->f()`? – Oliv Mar 01 '18 at 16:03