2

The example:

#include <optional>
#include <iostream>

using namespace std;

int main()
{
    optional<int> t{}; // nullopt (empty) by default

    cout << *t << endl;

    return 0;
}

In practice this program prints some int (an uninitialized value of type int). Also, libc++ uses an assertion to check accesses to a disengaged optional.

Why does the Standard not require throwing or a sigsegv here?

Shafik Yaghmour
vladon
  • The same reason as for `std::vector::operator[]` – Jonathan Wakely Nov 07 '18 at 18:42
  • @JonathanWakely `std::vector::operator[]` can sigsegv, but optional will never sigsegv. – vladon Nov 07 '18 at 18:43
  • "sigsegv" isn't part of the C++ standard. The consistent theme here is "undefined behaviour". – Kerrek SB Nov 07 '18 at 18:44
  • @vladon They are both undefined behavior, they both might sigsegv, they might both throw, they might both do anything at all. – François Andrieux Nov 07 '18 at 18:44
  • SIGSEGV is the signal raised when you access memory outside your address space. A disengaged `optional` object is obviously not outside your address space, it just doesn't contain an initialized value. Why should it cause a segfault? – Jonathan Wakely Nov 07 '18 at 18:47
  • @JonathanWakely "not outside your address space" is an implementation detail. I'm asking why the standard does not require anything. – vladon Nov 07 '18 at 18:48
  • Requiring a hard error is a trade-off between performance and usability. Requiring a well-defined error requires that the implementation perform a check. When faced with these dilemmas, C++ usually goes with the decision that avoids imposing overhead. – François Andrieux Nov 07 '18 at 18:51
  • Undefined Behaviour does not mandate a specific result. *Anything* can happen. "Just don't write bugs" is the only advice that can be given. – Jesper Juhl Nov 07 '18 at 18:59
  • If it is required to throw, `t.value()` should be used. https://en.cppreference.com/w/cpp/utility/optional/value – Robert Andrzejuk Nov 08 '18 at 17:38

3 Answers

10

Why does the Standard not require throwing or a sigsegv here?

Because requiring some particular behaviour implicitly imposes the requirement to add a branch to check whether that behaviour - be it throwing or something else - should occur.

By specifying that the behaviour is undefined, the standard allows the implementation to not check whether optional is empty upon every indirection. Branching the execution is potentially slower than not branching.

Rather than mandating safety, the committee let the standard library implementers choose performance (and simplicity). The implementation that you tested seems to have chosen not to throw an exception or otherwise inform you of the mistake.
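
To make the cost concrete, here is a minimal sketch of where the branch would live. `ToyOptionalInt` is a hypothetical type invented for this illustration; it is not how any real standard library implements `std::optional`, and it zero-initializes its storage so that the toy itself has no undefined behaviour.

```cpp
#include <stdexcept>

// Hypothetical toy type, for illustration only.
struct ToyOptionalInt {
    int  storage = 0;      // zero-initialized here to keep the toy well-defined
    bool engaged = false;  // true once a value has been stored

    // Unchecked access: no test of `engaged`, so no branch is needed.
    // The real std::optional::operator* leaves the empty case undefined.
    int& operator*() { return storage; }

    // Checked access: a defined failure mode costs a branch on every call.
    int& value() {
        if (!engaged) {
            throw std::runtime_error("ToyOptionalInt is empty");
        }
        return storage;
    }
};
```

If the standard required `operator*` to throw, every call would have to pay for the test that only `value()` pays for in this sketch.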

eerorika
  • _"The committee chose performance over safety."_ The committee chose to leave it up to the std::lib implementation to decide, and the implementation might choose performance in some configurations and safety in others (e.g. when assertions are enabled). – Jonathan Wakely Nov 07 '18 at 18:50
  • It's more that the committee chooses not to impose overhead if the same result can be achieved by the user with a manual test of their own if they desire it. It's not about letting the implementation choose what solution they want to go with; implementers can add whatever safety checks they want provided it doesn't violate any other part of the standard (like complexity requirements, though even then some do for debug builds). – François Andrieux Nov 07 '18 at 18:52
8

C++ embraces the idea of undefined behavior.

Not all C++ operations have behavior defined by the standard. This permits compilers to assume that operations with undefined behavior never happen, which can result in much faster code in many cases.

Here, by leaving the result of using an unengaged std::optional undefined, the cost of accessing data stored in a std::optional is the same as the cost of accessing data not stored in a std::optional. The only costs are the extra room required, and you as a programmer promising to keep track of whether it is engaged or not.

Now compilers are free to insert checks there, and some do in debug builds.

Note that usually C++ std library types include safe and unsafe methods for accessing data.

The fact that invalid pointers sometimes result in a sigsegv is because most OSes protect addresses around 0 and crash programs that access them. They do this because it is low cost, and it catches a bunch of bad behavior from many assembly, C and C++ programs.

If you want optional to throw when empty, use .value(). If you don't, use operator*. If you want a default value if one isn't there, use .value_or.
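
As a rough sketch of those three access styles (the helper names `value_throws` and `deref_or` are made up for this example, and C++17 is assumed):

```cpp
#include <optional>

// Hypothetical helper: reports whether .value() throws for this optional.
bool value_throws(const std::optional<int>& o) {
    try {
        (void)o.value();  // throws std::bad_optional_access when o is empty
        return false;
    } catch (const std::bad_optional_access&) {
        return true;
    }
}

// Hypothetical helper: operator* is only reached after an explicit test,
// so the unchecked access is never applied to an empty optional.
int deref_or(const std::optional<int>& o, int fallback) {
    return o ? *o : fallback;  // same result as o.value_or(fallback)
}
```

For an empty `std::optional<int> e`, `value_throws(e)` is true, while `e.value_or(-1)` and `deref_or(e, -1)` both yield -1.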

Yakk - Adam Nevraumont
  • Not just around address 0. Whenever you cross a page boundary you'll usually get a fault if the page is not mapped. Not fatal if the page is in your address space (the OS will just map it in) but otherwise fatal. – Jesper Juhl Nov 07 '18 at 19:05
  • @jesp sure, but barring garbage pointer data, pointers that once pointed somewhere don't usually give page faults. – Yakk - Adam Nevraumont Nov 07 '18 at 19:20
6

Because it is undefined behavior. Section [optional.observe]p5 says:

Requires: *this contains a value.

and violating a Requires clause is undefined behavior, from [res.on.required]p1, which is under Library-wide requirements:

Violation of any preconditions specified in a function's Requires: element results in undefined behavior unless the function's Throws: element specifies throwing an exception when the precondition is violated.

So you have no expectation as to the result. From the definition of undefined behavior:

behavior for which this document imposes no requirements

Requiring the implementation to check would be a cost, and not all users would want to pay that cost. So this becomes a quality-of-implementation issue. An implementation is free to perform checks in different modes of operation, for example when assertions are enabled.
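
The same idea can be sketched in user code. `checked_deref` below is a hypothetical wrapper, not part of any library: the check exists while assertions are enabled and disappears when `NDEBUG` is defined, at which point an empty optional is undefined behavior again.

```cpp
#include <cassert>
#include <optional>

// Hypothetical wrapper: checked when assertions are enabled,
// plain unchecked operator* when compiled with NDEBUG.
template <class T>
const T& checked_deref(const std::optional<T>& o) {
    assert(o.has_value() && "dereferencing an empty optional");
    return *o;
}
```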

The user has the option of taking the cost themselves via has_value or value_or. If the user wants an operation that can throw they can use value.

Note that sigsegv, segfaults, etc. are implementation-defined behavior.

Shafik Yaghmour