25

According to cppreference, std::get for std::variant throws std::bad_variant_access if the variant does not currently hold the requested type. This means that the standard library has to check on every access (see, for example, libc++).

What was the rationale for this decision? Why is it not undefined behavior, like everywhere else in C++? Can I work around it?

Praetorian
Denis Yaroshevskiy
  • 1
    @Justin I do not think it is a true duplicate. There is no answer 'why'. Second of all, there is actually no answer for 'can I work around it'. I am nominating the question for reopening. – SergeyA Feb 15 '18 at 22:34
  • 3
    Because that's what `std::variant` is *for*: 'a type-safe union'. If you don't want it type-safe, or want UB, don't use it: use a `union`. – user207421 Feb 16 '18 at 00:00
  • 1
    [In this thread](https://groups.google.com/a/isocpp.org/forum/?fromgroups#!topic/std-proposals/DxvEBfamvZ0), some people give some motivation behind why `std::variant` might not have a `std::unchecked_get`. I don't know if that's really what was discussed in the standards meetings, but there is logic behind the reasoning – Justin Feb 16 '18 at 03:51
  • @MarquisofLorne Then half of the STL should be removed too, because there is UB everywhere and you can always implement it yourself. – Acorn Jun 12 '20 at 12:30

4 Answers

15

The current API for std::variant has no unchecked version of std::get. I don't know why it was standardized that way; anything I say would just be guessing.

However, you can get close to the desired behavior by writing *std::get_if<T>(&variant). If variant doesn't hold T at that time, std::get_if<T> returns nullptr, so dereferencing it is undefined behavior. The compiler can thus assume that the variant holds T.


In practice, this isn't the easiest optimization for the compiler to do. Compared to a simple tagged union, the code it emits may not be as good. The following code:

int const& get_int(std::variant<int, std::string> const& variant)
{
    return *std::get_if<int>(&variant);
}

Emits this with clang 5.0.0:

get_int(std::variant<int, std::string> const&):
  xor eax, eax
  cmp dword ptr [rdi + 24], 0
  cmove rax, rdi
  ret

It is comparing the variant's index and conditionally moving the return value when the index is correct. Even though it would be UB for the index to be incorrect, clang is currently unable to optimize the comparison away.

Interestingly, returning an int instead of a reference optimizes the check away:

int get_int(std::variant<int, std::string> const& variant)
{
    return *std::get_if<int>(&variant);
}

Emits:

get_int(std::variant<int, std::string> const&):
  mov eax, dword ptr [rdi]
  ret

You could help the compiler by using __builtin_unreachable() or __assume, but gcc is currently the only compiler capable of removing the checks when you do so.
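
As a minimal sketch of that hint, assuming a GCC, Clang, or MSVC toolchain: the names unchecked_get and assume_unreachable below are hypothetical helpers of mine, not part of the standard library. Whether the index check actually disappears from the generated code depends on the compiler and its version.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical helper (not a standard facility): tells the optimizer that a
// code path cannot be reached. Reaching it anyway is undefined behavior.
[[noreturn]] inline void assume_unreachable() {
#if defined(_MSC_VER) && !defined(__clang__)
    __assume(false);          // MSVC spelling of the hint
#else
    __builtin_unreachable();  // GCC and Clang spelling of the hint
#endif
}

// Hypothetical unchecked accessor: the caller promises that v holds a T.
// Breaking that promise is undefined behavior in release builds.
template <class T, class... Ts>
T const& unchecked_get(std::variant<Ts...> const& v) {
    T const* p = std::get_if<T>(&v);
    // Still diagnosed in debug builds (compiled out when NDEBUG is defined).
    assert(p != nullptr && "unchecked_get: wrong alternative");
    if (p == nullptr) assume_unreachable();
    return *p;
}
```

Keeping the assert means a wrong access is still caught in debug builds, while release builds are free to drop the check if the optimizer manages to use the hint.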

Justin
  • Are you looking at the other decompiled overload of `get_int()` in [your own example](https://godbolt.org/g/A92D1L)? The variant-based one at disassembly line 1 does emit a tag check. The tagged union one does not. Also note Praetorian had very good points about non-trivially-constructible types in the comments to [the other answer](https://stackoverflow.com/a/48817600/1149924). – kkm inactive - support strike Feb 15 '18 at 23:37
  • @kkm Yes. I put the tagged union there for comparison. It's clear that the tagged union emits better code, but in theory, this `std::get_if` method should work. The compilers have all the information they need. – Justin Feb 15 '18 at 23:42
  • @kkm Are you referring to my second assembly block? That's not in my posted example, but you can easily get it by changing the `int const&` to `int`. I'll add a link – Justin Feb 15 '18 at 23:54
  • Indeed the compiler does, and I also agree that the compiler assuming some invariants that make the program's behavior not undefined would not be unhelpful (and, as a side note, the general problem of inferring such invariants is likely undecidable :) ). Yes, adding a link would improve the readability and clarity of your answer IMO! – kkm inactive - support strike Feb 15 '18 at 23:57
  • Yes, this works and I like the *get_if trick, thanks. Real shame that we have to do it though. – Denis Yaroshevskiy Feb 16 '18 at 22:46
11

Why is it not undefined behavior, like everywhere else in C++? Can I work around it?

Yes, there is a direct workaround. If you do not want type safety, use a plain union instead of a std::variant. As it says in the reference you cited:

The class template std::variant represents a type-safe union.

The purpose of a union is to have a single object that can hold a value of one of several different types. Only one member of the union is 'valid' at any given time, namely the one that was assigned most recently:

union example {
   int i;
   float f;
};

// code block later...
example e;
e.i = 10;
std::cout << e.f << std::endl; // compiles, but reading the inactive member f is undefined behavior in C++

std::variant generalized a union while adding type safety to help make sure you are only accessing the right data type. If you do not want this safety, you can always use a union instead.
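
To make the comparison concrete, here is a rough sketch of the bookkeeping a plain union needs once you want to know which member is active, next to the same logic written against std::variant. TaggedNumber and both as_float overloads are hypothetical names chosen for illustration.

```cpp
#include <variant>

// Hypothetical hand-rolled tagged union: the programmer has to keep the tag
// and the active union member in sync manually.
struct TaggedNumber {
    enum class Tag { Int, Float } tag;
    union {
        int i;
        float f;
    };
};

// Nothing stops a caller from setting the wrong tag; reading the inactive
// member would then be undefined behavior.
float as_float(TaggedNumber const& n) {
    return n.tag == TaggedNumber::Tag::Int ? static_cast<float>(n.i) : n.f;
}

// The variant keeps track of the active alternative itself.
float as_float(std::variant<int, float> const& v) {
    if (int const* i = std::get_if<int>(&v)) {
        return static_cast<float>(*i);
    }
    return std::get<float>(v);  // would throw if v were valueless
}
```

This only works so easily because both members are trivial types; with something like std::string in the union you would also have to manage construction and destruction by hand.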

What was the rationale for this decision?

I do not know personally what the rationale was for this decision, but you can always take a look at the papers from the C++ standardization committee to get some insight into the process.

Daniel
  • 5
    Are you serious? If it were that simple to replace a `variant` with a `union`, why would the former exist? – Praetorian Feb 15 '18 at 22:54
  • 7
    I thought that was clear -- to create a type-safe version of `union`. When writing `union`s in C, I remember always pairing a `union` object with an `int` or `enum` so that I could store information about which type in the union was set. Otherwise, I would risk UB if I read from the wrong data member. Here, `std::variant` provides the error handling and type recall for a union-type object, so you don't have to implement that yourself (other than checking that the type is correct). – Daniel Feb 15 '18 at 23:02
  • I think there were some other interesting use cases for `union` depending on the data types contained which `std::variant` does not still handle, but I am not sure those parts about unions in the C standard were kept in the C++ standard. – Daniel Feb 15 '18 at 23:03
  • 3
    I do not understand the downvotes. `variant` is a `union` plus type safety; naturally then, `variant` minus type safety is a union. I must upvote this answer, but @DanielDay please fix the language a bit. You mean to say "if you do not want type safety, use plain union instead of variant", but you actually say the reverse in the first sentence. – kkm inactive - support strike Feb 15 '18 at 23:15
  • 6
    @kkm If all you care about is storing `int`s, `float`s etc. then this is an easy replacement to make. But what if one of the alternatives is `std::string`, or any other non-trivial type? You end up with something like [this](https://stackoverflow.com/q/30492927/241631). You also lose other nice features like visitation. Yes, you could write all of it yourself, but then that can be the answer to any question that asks for an alternate way of doing something. – Praetorian Feb 15 '18 at 23:20
  • 3
    @kkm: It is not that black and white; by switching to `union`, you give up a _lot_ besides just that type checking. However I agree with Daniel that this is at least a valid workaround if you're really desperate for it. – Lightness Races in Orbit Feb 15 '18 at 23:23
  • @Praetorian: yes, I indeed agree about the non trivially constructible type part. Entirely missed this point, thanks! – kkm inactive - support strike Feb 15 '18 at 23:29
  • @Praetorian this is true, although you could get around this limitation with non-trivial types by using a pointer to the non-trivial type. But you would then have to dynamically allocate memory for that type to use it within a `union` where `std::variant` can handle non-trivial types without additional allocation. I agree that the functionality is not a one to one match, but a `union` would be a work around for the OP's original question as stated – Daniel Feb 15 '18 at 23:42
  • @DanielDay, managing pointers in a variant type is not same as using types or even smart pointers. Too much non-trivial cleanup would be involved; this is certainly not salvaging the proposed solution. Since the reasoning goes this far, I should be resetting the upvote. – kkm inactive - support strike Feb 15 '18 at 23:51
  • I'm sorry, I must not have been clear enough. It's really hard to do a variant with a union if you really need tags and so on (consider, for example, computing the size of the required index type, or visitation). What I meant was: I don't care much for exceptions; for me throw <=> terminate. I'm OK with UB instead of terminate if I messed up. I don't want an extra check when I'm out. I want get<...> with a wrong type to be a contract violation. Consider: vector is a type-safe dynamic array and optional is a type-safe union, but this doesn't mean that [] out of range or operator* cannot produce UB. – Denis Yaroshevskiy Feb 16 '18 at 22:51
2

What was the rationale for this decision?

This kind of question is always difficult to answer, but I'll give it a shot.

A lot of the inspiration for the behavior of std::variant came from the behavior of std::optional, as stated in the proposal for std::variant, P0088:

This proposal attempts to apply the lessons learned from optional...

And you can see the parallels between the two types:

  • You're not sure what's currently being held
    • in optional it's either a type or nothing (nullopt_t)
    • in variant it's either one of many types, or nothing (see valueless_by_exception)
  • All functions to operate on the type are marked constexpr
    • This may seem coincidental or just good design practices, but it was very clearly intended that variant follow optional's lead on this (see the linked proposal above)
  • They each provide a way to check for emptiness
    • std::optional has an implicit conversion to bool, or alternatively the has_value function
    • std::variant has valueless_by_exception which tells you if the variant is empty because constructing the active type threw an exception
  • They each provide a way for a throwing and non-throwing access
    • Potentially-throwing access for std::optional is value and it may throw bad_optional_access
    • Potentially-throwing access for std::variant is get and it may throw bad_variant_access
    • Non-throwing (I use the term a bit loosely) access for std::optional is value_or which may return you an alternative (that you pass in) if the optional is empty
    • Non-throwing access for std::variant is get_if which returns a nullptr if the index or type provided is bad.
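
A short sketch of the access paths listed above, using only standard library calls; the function names themselves are hypothetical and just wrap each case so it can be checked in isolation.

```cpp
#include <optional>
#include <string>
#include <variant>

// Potentially-throwing access: std::optional::value on an empty optional.
bool optional_value_throws() {
    std::optional<int> empty;
    try {
        (void)empty.value();
        return false;
    } catch (std::bad_optional_access const&) {
        return true;
    }
}

// Potentially-throwing access: std::get<T> when T is not the active alternative.
bool variant_get_throws() {
    std::variant<int, std::string> v{std::string("text")};
    try {
        (void)std::get<int>(v);
        return false;
    } catch (std::bad_variant_access const&) {
        return true;
    }
}

// Non-throwing access: value_or substitutes a fallback for an empty optional.
int optional_or_default(std::optional<int> const& o) {
    return o.value_or(-1);
}

// Non-throwing access: get_if reports a mismatch by returning nullptr.
bool variant_holds_int(std::variant<int, std::string> const& v) {
    return std::get_if<int>(&v) != nullptr;
}
```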

Indeed, the similarities were so intentional that an inconsistency in the base classes used for optional and variant was cause for complaint (see this Google Groups discussion).

So to answer your question, it throws because optional throws. Bear in mind that the throwing behavior should rarely be encountered; you should use a visitor pattern with a variant, and even if you do call get, it only throws if the index or type you request does not correspond to the active alternative. All other misuses, such as an out-of-range index or a type that is not in the type list, are ill-formed and should produce a compiler error.
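
For reference, a minimal sketch of the visitor approach; describe is a hypothetical function of mine. With std::visit there is no requested type that could be wrong, because the visitor is handed whichever alternative is active (std::visit itself still throws std::bad_variant_access for a valueless variant).

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical example function: describes whichever alternative is active.
std::string describe(std::variant<int, std::string> const& v) {
    return std::visit(
        [](auto const& value) -> std::string {
            using T = std::decay_t<decltype(value)>;
            if constexpr (std::is_same_v<T, int>) {
                return "int: " + std::to_string(value);
            } else {
                return "string: " + value;
            }
        },
        v);
}
```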


As for why std::optional throws: if you check its proposal, N3793, having a throwing accessor was advertised as an improvement over Boost.Optional, from which std::optional was born. I haven't yet found any discussion about why this is an improvement, so for now I'll speculate: it was easy to provide both throwing and non-throwing accessors that satisfy both error-handling camps (sentinel values vs. exceptions), and it additionally helps take some undefined behavior out of the language, so you don't needlessly shoot yourself in the foot if you choose to go the potentially-throwing route.

AndyG
  • 1
    `valueless_by_exception` is not at all comparable to an empty `optional`. An `optional` being empty is a legitimate state; the object is completely functional (within its contract). A `variant` being valueless is not in a legitimate state. It can only get into that state via throwing an exception, and you pretty much can't do anything with a valueless `variant`. No visitation, no `get`, nothing. – Nicol Bolas Feb 16 '18 at 04:39
  • 3
    Also, the whole argument fails, because while `optional::value` throws, `optional::operator*` *does not*. – Nicol Bolas Feb 16 '18 at 04:39
  • 1
    @NicolBolas point taken about `valueless_by_exception`, but isn't "whole argument fails" a little extreme? I feel that the proposal makes it clear that the design was inspired by `optional`. `variant` has no comparable ` operator*` so it's hard to say that the behavior diverges there – AndyG Feb 16 '18 at 10:43
  • 1
    @Nicol the non throwing operator* is rather an optimization similar to vector::at vs vector::operator[]. The fact that no such optimization exists for variant could come from the fact that variant is supposed to protect you from invalid access (by throwing) in the first place, no? – rubenvb Feb 16 '18 at 13:07
  • 1
    @rubenvb: Which is my point. The choice not to have a non-checking `get` has nothing to do with `optional`; it purely has to do with the design of `variant`. – Nicol Bolas Feb 16 '18 at 14:42
  • @andyg - thank you! You are almost there, but as other commenters mentioned - I'd really like to know why no "operator*" like method. – Denis Yaroshevskiy Feb 16 '18 at 22:54
1

I think I found it!

It seems the reason can be found under "Differences to revision 5" in the proposal:

The Kona compromise: f !v.valid(), make get<...>(v) and visit(v) throw.

Meaning that the variant has to throw in the "valueless_by_exception" state. Since get needs a check for that state anyway, the same if can be used to throw on every invalid access.
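
Here is a small sketch of how a variant can end up in that state and what get does afterwards. Thrower, Outcome and break_variant are hypothetical names. The standard only says the variant might not hold a value after a failed emplace, so the code records what happened instead of assuming it.

```cpp
#include <stdexcept>
#include <variant>

// Hypothetical alternative whose construction from int always fails. The
// user-provided copy constructor makes it non-trivially copyable, which in
// practice keeps implementations from building a temporary first.
struct Thrower {
    Thrower() = default;
    explicit Thrower(int) { throw std::runtime_error("construction failed"); }
    Thrower(Thrower const&) {}
};

struct Outcome {
    bool valueless;  // did the failed emplace leave the variant without a value?
    bool get_threw;  // did std::get<int> throw afterwards?
};

Outcome break_variant() {
    std::variant<int, Thrower> v{1};
    try {
        // The old int is typically destroyed before Thrower's constructor runs.
        v.emplace<Thrower>(42);
    } catch (std::runtime_error const&) {
        // The variant may now be valueless (index() == std::variant_npos).
    }

    Outcome out{v.valueless_by_exception(), false};
    try {
        (void)std::get<int>(v);
    } catch (std::bad_variant_access const&) {
        out.get_threw = true;
    }
    return out;
}
```

If the variant ended up valueless, get has no alternative to return and must throw, which is the check the Kona compromise makes unavoidable.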

Even knowing this rationale, I personally would like to avoid this check. The *get_if workaround from Justin's answer seems good enough for me (at least for library code).

Denis Yaroshevskiy