4

Take the following:

#include <vector>
#include <string>
#include <type_traits>

int main()
{
    std::vector<std::string> vec{"foo", "bar"};
    for (auto& el : vec)
       el.std::string::~string();

    auto& aliased = reinterpret_cast<
       std::vector<
          std::aligned_storage_t<sizeof(std::string), alignof(std::string)>
       >&>(vec);
    aliased.clear();
}

(whittled down from more complex code, of course — we wouldn't generally manage a simple std::string vector this way in such a simple testcase)

Does this program have undefined behaviour? I thought that we cannot alias vector<T1> as vector<T2>, even if T1 and T2 are compatible.

And if so, can this be expected to have practical ramifications at runtime?

Assume strict aliasing is not disabled in the compiler.

Interestingly, GCC 9.2.0 doesn't give me any warnings with -fstrict-aliasing -Wstrict-aliasing (live demo).

Asteroids With Wings
  • 17,071
  • 2
  • 21
  • 35
  • 7
    "Interestingly, GCC 9.2.0 doesn't give me any warnings" - that's not really strange. Whenever you use a cast (*especially* a `reinterpret_cast`) you are basically telling the compiler "I know what I'm doing. Just shut up and trust me" - it's then *entirely* on you to get it right - you explicitly burned all safety nets. – Jesper Juhl Feb 27 '20 at 16:47
  • @JesperJuhl Sure, and I realise a warning isn't guaranteed, but since `Wstrict-aliasing` exists and sometimes gives results, I found it potentially interesting that it didn't in this case. I couldn't be sure whether that was solely because I'd misinterpreted the rules and that this in fact _wasn't_ a violation. – Asteroids With Wings Feb 27 '20 at 16:48
  • @AsteroidsWithWings A `reinterpret_cast` is a very effective way of suppressing almost all of the compilers diagnostics (and also comes with *very few* guarantees and is *extremely hard* to use correctly. It's *dangerous* and should hardly ever be used). – Jesper Juhl Feb 27 '20 at 16:51
  • Yep, my bad. Just a strict aliasing viloation, no double destroy. – NathanOliver Feb 27 '20 at 16:55
  • Of course, in practice, we should indeed be prepared for such an eventuality, purely due to the UB, right? The compiler can and very well may choose to omit this operation entirely, purely due to the UB, right? (Though then we're sort of into a discussion about whether UB is ever "acceptable") – Asteroids With Wings Feb 27 '20 at 16:56
  • @AsteroidsWithWings I can't see a compiler just ignoring the code but it could. It's why I advise no UB, unless you really know what the compiler will do (like we all know all reasonable compilers allow type punning through a union even if the standard says it's UB) – NathanOliver Feb 27 '20 at 16:59

3 Answers3

9

Does this program have undefined behaviour?

Absolutely. You are accessing an object of type vector<string> through a reference to some unrelated type. That's UB, a violation of the strict aliasing rule.

And if so, can this be expected to have practical ramifications at runtime?

UB means that the behavior of the runtime is undefined. So yeah.

Nicol Bolas
  • 449,505
  • 63
  • 781
  • 982
  • Can you expand on whether you would, in practice, actually expect a compiler to "make use" of this aliasing violation and end up eliding the `clear()`? Is that something that has been observed? (Note that I'm not suggesting we'd want to rely on it) – Asteroids With Wings Feb 27 '20 at 16:57
  • 3
    @AsteroidsWithWings: You tagged your question as "language-lawyer". That means you are interested in the behavior of C++ in accord with the C++ standard, not how it might hypothetically behave on some unbounded and unknowable series of compilers. – Nicol Bolas Feb 27 '20 at 16:58
  • 2
    @AsteroidsWithWings also strongly consider that there are infamous cases of UB code that "just worked as expected" until a compiler update. It happened to the linux kernel, it will happen to you. Don't rely on Undefined Behavior, even if it "just happens to work now on your system" – bolov Feb 27 '20 at 17:26
  • @bolov: Well, there are times where you *have* to rely on UB just to do a low-level thing (or because some API is dumb). But those are nothing like the case outlined above, where there is zero advantage to doing this. Defensible cases of this sort of thing involve trivial types and bit-fiddling and such, not serious objects that represent allocated storage. – Nicol Bolas Feb 27 '20 at 17:30
  • 2
    @AsteroidsWithWings The point about strict aliasing rules is, that they allow memory accesses to be reordered more freely. Your compiler may decide to reorder such simple things as `vec[42] = "foo"; cout << aliased[42];`, and it is actually likely to do so. After all, the memory load `aliased[42]` takes time, so it should be scheduled early, and strict aliasing rules mean that `vec[42] = "foo"` does not influence the result of the load. Or does it? – cmaster - reinstate monica Feb 27 '20 at 17:31
  • @cmaster-reinstatemonica There's the practical reasoning I was looking for – Asteroids With Wings Feb 27 '20 at 17:38
1

I think walnut has roughly the correct answer.

Technically, accessing an object of class type is broken down in one or more accesses of scalar objects in the class. All the scalar accesses individually must obey the aliasing rules.

This is important for C compatibility. In C, struct Foo { int a; } and struct Bar { int b; } may alias eachother.

In this case, it's Undefined Behavior because the standard does not define the scalar members of std::string or std::vector. std::aligned_storage_t might be an array of scalars, but that's not guaranteed either.

MSalters
  • 173,980
  • 10
  • 155
  • 350
1

My conclusion from the answers and comments so far (some of which have been deleted ☹️) is that, although strict aliasing is defined in terms of accesses and therefore only applies to scalar types (which the current draft makes more clear), the program in question is still assuredly a case of undefined behaviour:

[class.mfct.non-static]/2: If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.

And that, therefore, even if the compiler decides not to remove the clear() statement (or otherwise do "weird" things), we could not guarantee that the layout and operation of the target vector specialisation matches that of the original type, and that this pattern should therefore be avoided in production code in general.

So, whichever way you spin it, this is problematic in principle, and potentially problematic in practice too.

(All references are to n4659, which is basically C++17.)

Asteroids With Wings
  • 17,071
  • 2
  • 21
  • 35