Missed Optimization: std::vector::pop_back() not qualifying destructor call?

Question

In an std::vector<T> the vector owns the allocated storage and it constructs Ts and destructs Ts. Regardless of T's class hierarchy, std::vector<T> knows that it has only created a T and thus when .pop_back() is called it only has to destroy a T (not some derived class of T). Take the following code:

#include <vector>

struct Bar {
    virtual ~Bar() noexcept = default;
};

struct FooOpen : Bar {
    int a;
};

struct FooFinal final : Bar {
    int a;
};

void popEm(std::vector<FooOpen>& v) {
    v.pop_back();
}

void popEm(std::vector<FooFinal>& v) {
    v.pop_back();
}

https://godbolt.org/z/G5ceGe6rq

The PopEm for FooFinal simply just reduces the vector's size by 1 (element). This makes sense. But PopEm for FooOpen calls the virtual destructor that the class got by extending Bar. Given that FooOpen is not final, if a normal delete fooOpen was called on a FooOpen* pointer, it would need to do the virtual destructor, but in the case of std::vector it knows that it only made a FooOpen and no derived class of it was constructed. Therefore, couldn't std::vector<FooOpen> treat the class as final and omit the call to the virtual destructor on the pop_back()?

No the compiler doen'st know the vector only will contain FooOpen's. Maybe it will be linking with some other component later that does insert a class derived from FooOpen. So your assumption is only valid for this snippet of code. With FooFinal the optimization can be done. — Pepijn Kramer, Aug 06 '22 at 07:27
@PepijnKramer How can it insert a class derived from `FooOpen`? The only possibility I can see is that a user may placement-new a derived object into the storage of a `FooOpen` element, which would at the very least depend on a lot of unspecified behavior, but I feel like should be library undefined behavior in the first place. — user17732522, Aug 06 '22 at 07:35
@PepijnKramer I think the point is that even if we have a million classes that inherit fooopen, none of them can ever be stored in the vector. The vector always constructs and destructs fooopens, nothing else. To make a guess at the question: the vector knows that this optimization could be done, but that doesn't mean the compiler knows it. Some complex code analysis would have to be done for the compiler to figure this out. I don't have much knowledge about the optimization techniques in use, but I imagine you'd need some special treatment for vector to make this happen. — Wutz, Aug 06 '22 at 07:39
@Wutz No complex analysis would be required. The specification says that `std::allocator_traits::destroy` is used to destroy the element, which for `std::allocator` as allocator simply means a (unqualified) destructor call. The standard library can detect and special case the container if `std::allocator` is used (they already do that for optimization of trivially copyable types) and then always use a qualified destructor call instead of `allocator_traits::destroy`, which will enforce a static dispatch even if the class has a virtual destructor. — user17732522, Aug 06 '22 at 07:43
Are you sure this is anything to do with `std::vector`? What happens if you simply do `{ volatile FooOpen x; }`? — Paul Sanders, Aug 06 '22 at 08:53
I was wondering if other things would have the same problem, for example: ``` void rst(std::optional& o) { o.reset(); } void rst(std::optional& o) { o.reset(); } ``` But that generates the same machine code on compiler explorer. So `std::optional` knows that it is _just_ a `FooOpen` and doesn't bother calling the virtual dtor. Overall, I think in the vector pop code, it needs to do `item->FooOpen::~FooOpen()` instead of `item->~FooOpen()` and thus it can call the more optimized dtor. — Harrison Metzger, Aug 06 '22 at 11:02
There is no need to treat `FooOpen` as `final`, since a `std::vector` can never contain an instance of a class derived from `FooOpen` - any add/copy of an instance of a derived class into the vector (e.g. by `push_back()`) will slice the object. Destroying each element will therefore call destructors of `FooOpen` then its bases (in that order). It doesn't even matter if the base destructor is virtual (since no object is destroyed via a pointer to base). How the compiler optimises those destructor calls (when it has visibility of them) is then a quality of implementation concern — Peter, Aug 06 '22 at 11:12
@UriRaz No, if you try to insert a derived class into the vector, you will just store a sliced copy of it and consequently also only `FooOpen`'s destructor will be called. That is the whole point. It is impossible to store any other type but exactly `FooOpen` in the vector. The vector interface simply does not allow anything else. — user17732522, Aug 06 '22 at 13:45
@HarrisonMetzger What looks strange is msvc and clang show similar results. I checked libstdc++ code but couldn't find any clue. I wrote my own [vector imitation](https://godbolt.org/z/WTTvEYcsr) and it looks the same. So it's either some missed optimisation occurring in 3 main compilers or the standard prevents it for some reason. Anyway it's good to know final can make such a difference. — Bartosz Charuza, Aug 15 '22 at 21:47
Don't be shy about reporting this to gcc's bugzilla and opening an issue in llvm's github. — Marc Glisse, Aug 18 '22 at 19:54

Bartosz Charuza · Answer 1 · 2022-08-18T20:12:51.187

Long story short - compiler doesn't have enough context information to deduce it https://godbolt.org/z/roq7sYdvT

Boring part:

The results are similar for all 3: msvc, clang, and gcc, so I guess the problem is general. I analysed the libstdc++ code just to find pop_back() runs like this:

void pop_back() // a bit more convoluted but boils-down to this
{
    --back;
    back->~T();
}

Not any surprise. It's like in C++ textbooks. But it shows the problem - virtual call to a destructor from a pointer. What we're looking for is the 'devirtualisation' technique described here: Is final used for optimisation in C++ - it states devirtualisation is 'as-if' behaviour, so it looks like it is open for optimisation if the compiler has enough information to do it.

My opinion:

I meddled with the code a little and i think optimisation doesn't happen because the compiler cannot deduce the only objects pointed by "back" are FooOpen instances. We - humans - know it because we analyse the entire class, and see the overall concept of storing the elements in a vector. We know the pointer must point to FooOpen instance only, but compiler fails to see it - it only sees a pointer which can point anywhere (vector allocates uninitialized chunk of memory and its interpretation is a part of vector's logic, also the pointer is modified outside the scope of pop_back()). Without knowing the entire concept of vector<> i don't think of how it can be deduced (without analysing the entire class) that it won't point to any descendant of FooOpen which can be defined in other translation units.

FooFinal doesn't have this problem because it already guarantees no other class can inherit from it so devirtualisation is safe for objects pointed by FooFinal* or FooFinal&.

Update I made several findings which may be useful:

https://godbolt.org/z/3a1bvax4o), devirtualisation can occur for non-final classes as long as there is no pointer arithmetic involved.
https://godbolt.org/z/xTdshfK7v std::array performs devirtualisation on non-final classes. std::vector fails to do it even if it is constructed and destroyed in the same scope.
https://godbolt.org/z/GvoaKc9Kz devirtualisation can be enabled using wrapper.
https://godbolt.org/z/bTosvG658 destructor devirtualisation can be enabled with allocator. Bit hacky, but is transparent to the user. Briefly tested.

Good analysis. This seems a bit silly, because at this point `T` is a complete type, so in case of `FooFinal` optimisation is applied, but on the other hand, even though the whole implementation of `std::vector` is also visible to the compiler, it does not understand that only objects of type `T` (and not derived from `T`) are constructed. — Krzysiek Karbowiak, Aug 18 '22 at 10:05

score 2 · Answer 2 · answered Aug 15 '22 at 11:03

2

Yes, this is a missed optimisation.

Remember that a compiler is a software project, where features have to be written to exist. It may be that the relative overhead of virtual destruction in cases like this is low enough that adding this in hasn't been a priority for the gcc team so far.

It is an open-source project, so you could submit a patch that adds this in.

answered Aug 15 '22 at 11:03

Caleth

52,200
2
44
75

Additionally the low priority can also be that they don't see enough use-cases to justify it. Having an object with a virtual destructor directly in a vector seems like a code smell - I would normally expect some kind of pointer-wrapper in that case, or that the destructor shouldn't be virtual. Obviously there can be exceptions, but perhaps not sufficiently many. – Hans Olsson Aug 18 '22 at 09:04
@HansOlsson It's a smell to have the bottom of your inheritance hierarchy in a vector, but every class in that hierarchy has a virtual destructor. Unless you want to zealously enforce marking as much as possible `final` it seems fine for `vector` to exist – Caleth Aug 18 '22 at 09:10
@Caleth Would you care to explain why exactly it is a smell to have the bottom of hierarchy in a vector? – Krzysiek Karbowiak Aug 18 '22 at 10:08
@KrzysiekKarbowiak for the reason Hans suggests: you have something intended to be polymorphic, inserting into the vector will slice. – Caleth Aug 18 '22 at 10:22
@Caleth If it is **bottom** of hierarchy it cannot slice. And in case of slicing, I would call it a serious programmer error, not a code-smell. – Krzysiek Karbowiak Aug 18 '22 at 10:35
@KrzysiekKarbowiak by bottom I mean the base class, not the most-derived – Caleth Aug 18 '22 at 10:54
@Caleth Doh. In that case I agree. – Krzysiek Karbowiak Aug 18 '22 at 11:48

score 1 · Answer 3 · answered Aug 18 '22 at 15:20

It feels a lot like § 11.4.7 (14) gives some insight into this. As of latest working draft (N4910 Post-Winter 2022 C++ working draft, Mar. 2022):

After executing the body of the destructor and destroying any objects with automatic storage duration allocated within the body, a destructor for class X calls the destructors for X’s direct non-variant non-static data members, the destructors for X’s non-virtual direct base classes and, if X is the most derived class (11.9.3), its destructor calls the destructors for X’s virtual base classes. All destructors are called as if they were referenced with a qualified name, that is, ignoring any possible virtual overriding destructors in more derived classes. Bases and members are destroyed in the reverse order of the completion of their constructor (see 11.9.3). [Note 4 : A return statement (8.7.4) in a destructor might not directly return to the caller; before transferring control to the caller, the destructors for the members and bases are called. — end note] Destructors for elements of an array are called in reverse order of their construction (see 11.9).

Also interesting for this topic, § 11.4.6, (17):

In an explicit destructor call, the destructor is specified by a ~ followed by a type-name or decltype-specifier that denotes the destructor’s class type. The invocation of a destructor is subject to the usual rules for member functions (11.4.2); that is, if the object is not of the destructor’s class type and not of a class derived from the destructor’s class type (including when the destructor is invoked via a null pointer value), the program has undefined behavior.

So, as far as the standard cares, the invocation of a destructor is subject to the usual rules for member functions.

This, to me, sounds a lot like destructor calls do so much that compilers are likely unable to determine, at compile-time, that a destructor call does "nothing" - as it also calls destructors of members, and std::vector doesn't know this.

Missed Optimization: std::vector::pop_back() not qualifying destructor call?

3 Answers3

Linked