23

I ran into a rather subtle bug when using std::minmax with structured bindings. It appears that passed rvalues will not always be copied as one might expect. Originally I was using a T operator[]() const on a custom container, but it seems to be the same with a literal integer.

#include <algorithm>
#include <cstdio>
#include <tuple>

int main()
{
    auto [amin, amax] = std::minmax(3, 6);
    printf("%d,%d\n", amin, amax); // undefined,undefined

    int bmin, bmax;
    std::tie(bmin, bmax) = std::minmax(3, 6);
    printf("%d,%d\n", bmin, bmax); // 3,6
}

Using GCC 8.1.1 with -O1 -Wuninitialized will result in 0,0 being printed as first line and:

warning: ‘<anonymous>’ is used uninitialized in this function [-Wuninitialized]

Clang 6.0.1 at -O2 will also give a wrong first result with no warning.

At -O0 GCC gives a correct result and no warning. For clang the result appears to be correct at -O1 or -O0.

Should not the first and second line be equivalent in the sense that the rvalue is still valid for being copied?

Also, why does this depend on the optimization level? Particularly I was surprised that GCC issues no warning.

Zulan
  • 21,896
  • 6
  • 49
  • 109
  • 1
    `auto [amin, amax]`? Is this new syntax in C++17? – R Sahu Jul 24 '18 at 16:06
  • 2
    @RSahu yes it is called [structured-bindings] – Zulan Jul 24 '18 at 16:06
  • This is probably related to https://stackoverflow.com/questions/36555544/observing-weird-behavior-with-auto-and-stdminmax - but still different due to the combination with structured bindings. – Zulan Jul 24 '18 at 16:07
  • I am also wondering why for -O0 have no problem and -O2 is undefined, I came across a similar bug in our release version which is -O2 compiled, and my debug version with -O0 didn't find this issue, this is tricky:( – BruceSun Dec 12 '21 at 14:34

2 Answers2

13

What's important to note in auto [amin, amax] is that the auto, auto& and so forth are applied on the made up object e that is initialized with the return value of std::minmax, which is a pair. It's essentially this:

auto e = std::minmax(3, 6);

auto&& amin = std::get<0>(e);
auto&& amax = std::get<1>(e);

The actual types of amin and amax are references that refer to whatever std::get<0> and std::get<1> return for that pair object. And they themselves return references to objects long gone!

When you use std::tie, you are doing assignment to existing objects (passed by reference). The rvalues don't need to live longer than the assignment expressions in which they come into being.


As a work around, you can use something like this (not production quality) function:

template<typename T1, typename T2>
auto as_value(std::pair<T1, T2> in) {
    using U1 = std::decay_t<T1>;
    using U2 = std::decay_t<T2>;
    return std::pair<U1, U2>(in);
}

It ensures the pair holds value types. When used like this:

auto [amin, amax] = as_value(std::minmax(3, 6));

We now get a copy made, and the structured bindings refer to those copies.

StoryTeller - Unslander Monica
  • 165,132
  • 21
  • 377
  • 458
  • Is there a correct way to use `auto` in this instance or is `std::tie` the only option? – ptb Jul 24 '18 at 16:44
  • @ptb - You don't really have control about how the types of your structured bindings are inferred. There's a wall of standardese that describes how it should behave, depending on the functions exact return type. But nothing in the declaration of the bindings affects that much. The issue here, is that `std::minmax` essentially returns an object of a reference type (kinda like `string_view`). And that reference doesn't refer to valid things anymore. I edited in a workaround that turns it into a value type. – StoryTeller - Unslander Monica Jul 24 '18 at 16:57
  • 2
    possible solution `auto [amin,amax] = std::minmax( { 3, 6 } );` – Slava Jul 24 '18 at 17:03
  • @Slava - Also possible. Bu now the arguments are copied into the function as well, as opposed to only being copied once on the way out. Not really all that important for ints, but it's something to consider. – StoryTeller - Unslander Monica Jul 24 '18 at 17:05
  • 1
    Yea, anyway this is ugly, I am afraid pretty soon it will be very difficult to write a line code without unpredictable side effects in C++. – Slava Jul 24 '18 at 17:07
  • I did not even know `std::minmax` returns pairs of references. I mean const references. That can dangle. Who designed this again? And why is the `std::initializer_list` overload helpfully avoiding dangles but not this one? – Passer By Jul 24 '18 at 17:15
  • @PasserBy - The initializer list overload has to return copies, because (depending on which revision you ask for the specific reason) the storage for the list cannot be relied to exist. This is set out to mimic `std::min` and `std::max`. When Stepanov came up with them, he only had const lvalue references to support rvalues. It's not terrible C++98. I don't know how far into the standardization of C++11 it was when those were added. – StoryTeller - Unslander Monica Jul 24 '18 at 17:21
  • 1
    Still sounds like pretty poor decision making. I don't see how it's reasonable to for people to know `auto p = std::minmax(3, 6);` might kill their kitten. It's also easily fixable after the fact, just add a rvalue reference overload and return by value. – Passer By Jul 24 '18 at 17:28
  • @PasserBy - You got me there. I'm not sure if any existing code can be affected by this in an averse manner, but it's definitely possible to alter. – StoryTeller - Unslander Monica Jul 24 '18 at 17:30
  • Thanks @StoryTeller. This was very helpful. – ptb Jul 24 '18 at 17:33
  • Thanks, this is very helpful. Do you have any insight why the results in practice depend on the optimization level - particularly considering that GCC is generally capable of warning about this. – Zulan Jul 24 '18 at 20:26
  • @Zulan - I'm afraid I'm not too familiar with the inner workings of GCC's and Clang's optimizers to tell you why that happens. UB can become evident in the compiler's behavior as well. – StoryTeller - Unslander Monica Jul 24 '18 at 20:31
8

There are two fundamental issues going on here:

  1. min, max, and minmax for historic reasons return references. So if you pass in a temporary, you'd better take the result by value or immediately use it, otherwise you get a dangling reference. If minmax gave you a pair<int, int> here instead of a pair<int const&, int const&>, you wouldn't have any problems.
  2. auto decays top-level cv-qualifiers and strips references, but it doesn't remove all the way down. Here, you're deducing that pair<int const&, int const&>, but if we had deduced pair<int, int>, we would again not have any problems.

(1) is a much easier problem to solve than (2): write your own functions to take everything by value:

template <typename T>
std::pair<T, T> minmax(T a, T b) {
    return (b < a) ? std::pair(b, a) : std::pair(a, b);
}

auto [amin, amax] = minmax(3, 6); // no problems

The nice thing about taking everything by value is that you never have to worry about hidden dangling references, because there aren't any. And the vast majority of uses of these functions are using integral types anyway, so there's no benefit to references.

And when you do need references, for when you're comparing expensive-to-copy objects... well, it's easier to take a function that takes values and force it to use references than it is to take a function that uses references and try to fix it:

auto [lo, hi] = minmax(std::ref(big1), std::ref(big2)); 

Additionally, it's very visible here at the call site that we're using references, so it would be much more obvious if we messed up.


While the above works for lots of types due to reference_wrapper<T>'s implicit conversion to T&, it won't work for those types that have non-member, non-friend, operator templates (like std::string). So you'd additionally need to write a specialization for reference wrappers, unfortunately.

Barry
  • 286,269
  • 29
  • 621
  • 977
  • `minmax(std::ref(big1), std::ref(big2));` i guess the minmax function would deduce its arguments as `std::reference_wrapper` here. but i cant find an `operator <` for the reference_wrapper nor a specialization for minmax. how can it work? – phön Jul 25 '18 at 11:49
  • @phön `reference_wrapper` is implicitly convertible to `T&`, which covers just about all cases (except for non-member, non-friend, operator templates, which unfortunately `std::string` has). Would need a specialization for completeness. – Barry Jul 25 '18 at 12:09
  • It's implicitly convertible to `T&`, but `minmax` will not deduce `T`, it will deduce `std::reference_wrapper`. And there is no `operator<` for `std::reference_wrapper`, or am i wrong? (i tried it and it compiled. but how? if i put my own wrapper into `minmax`, this will not work because there is no `operator <` for my wrapper: `struct wrapper { int i; operator int&() { return i; } };` – phön Jul 25 '18 at 12:18
  • @phön There is no `operator<`, but there is a `operator T&`. – Barry Jul 25 '18 at 12:19
  • aah i forgot about the const in minmax. But nevertheless i didnt know that operator < (and i guess all other operators ;-P ) will "try out" all implicit conversions from both involved types. thank you for your insight! – phön Jul 25 '18 at 12:35