
It is often said that C++'s std::variant has worse performance than variants in other languages; see e.g. https://pbackus.github.io/blog/beating-stdvisit-without-really-trying.html

One reason is the valueless-by-exception state; the other is said to be the standard's requirement that the complexity of std::visit not depend on the number of types, which forces implementers to use a dispatch table of function pointers and thus inhibits optimizations such as inlining.

Questions:

  1. Would it be legal to implement std::visit as an if-elseif chain where each check looks like if (v.index() == I) return visitor(get<I>(v));? It seems like it would not be, because dispatching to higher indices takes more index checks.
  2. Is it legal to implement it with a switch, like switch (v.index()) { case I: return visitor(get<I>(v)); }? This looks "independent of the number of types", as control flow jumps directly to the matching case, and it is how e.g. Microsoft implemented it for fewer than 64 (or so) types.
  3. If the switch is legal, is it legal for the optimizer to convert small switches into if-elseif ladders? Again, MSVC does this: https://godbolt.org/z/D2Q5ED. But IMO this violates the "independent of the number of types" complexity mandated by the standard, although in practice it would be faster.
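To make questions 1 and 2 concrete, here is a hand-written sketch of both dispatch strategies, specialized to one fixed variant type. This is not how any standard library actually implements std::visit (a real implementation must generate the dispatch generically for arbitrary std::variant<Ts...>); the names are my own:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// A fixed example variant; real implementations generate the dispatch
// for arbitrary std::variant<Ts...> via metaprogramming.
using Var = std::variant<int, double, char, std::string>;

// Question 1: an if-elseif chain. Reaching index 3 takes four comparisons,
// reaching index 0 only one, so the work grows with the index.
template <class Visitor>
decltype(auto) visit_if_chain(Visitor&& vis, Var const& v)
{
    if (v.index() == 0) return vis(std::get<0>(v));
    else if (v.index() == 1) return vis(std::get<1>(v));
    else if (v.index() == 2) return vis(std::get<2>(v));
    else return vis(std::get<3>(v));
}

// Question 2: a switch over the index. With contiguous case labels the
// compiler can emit a single indexed jump, regardless of the case count.
template <class Visitor>
decltype(auto) visit_switch(Visitor&& vis, Var const& v)
{
    switch (v.index()) {
    case 0: return vis(std::get<0>(v));
    case 1: return vis(std::get<1>(v));
    case 2: return vis(std::get<2>(v));
    default: return vis(std::get<3>(v));
    }
}
```

Both behave identically; the question is only whether the O(n) worst case of the first, and a possible compiler transformation of the second into the first, are allowed by the standard's complexity wording.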

I'm asking based on a discussion on reddit where someone claimed that "constant time" again refers to the behavior in the limit, which I don't really agree with. IMO "constant time" means the time is constant no matter the input, which is not the case for an if-elseif chain.

But I'm not sure how this applies here. It seems that 1. is illegal, and 2. is legal, but the optimizer may transform it into 1., which would make 2. illegal as well, at least when following the standard to the letter.

Flamefire

  • I think I read *has no complexity requirements*, not *does not depend on the number of types*? – apple apple Oct 30 '19 at 12:40
  • What @appleapple said. *Has no complexity requirements* means it can be any of O(1), O(n), O(n^2), O(n^42), O(n!)... Also note: in terms of complexity, *"constant time"* is used as a synonym for O(1), which in turn means *"there is some constant C that, no matter what n is, serves as an upper bound (hence, staying below C is fine)"*. I agree it's a bit of a misnomer, but *constant time* certainly doesn't mean "always the same" in this context. – sebrockm Oct 30 '19 at 13:09
  • @sebrockm: whereas `std::visit(f, var1, var2)` has no requirements, `std::visit(f)` and `std::visit(f, var1)` have requirements (doesn't depend on the number of types). – Jarod42 Oct 30 '19 at 13:39
  • @Jarod42 you are right, I missed the first part for `visit` with zero or one arguments – sebrockm Oct 30 '19 at 13:52
  • It's never been clear to me what the normative impact is of statements like "`visit` is O(1)" or "`array::swap` is O(n)" when the "variable" is a compile-time constant. Each instantiation is a different *algorithm*, not the same algorithm with different input. – Davis Herring Oct 30 '19 at 15:52
  • The problem is that "complexity not dependent on sizeof...(Types)" is either impossible (every compiler ever made would need to implement things the same way under all optimization levels) or trivial, as sizeof...(Types) is a compile-time constant and so can't change at runtime. So it's really a meaningless requirement. – Chris Dodd Oct 30 '19 at 20:54
  • @DavisHerring: `std::visit` needs to invoke a "different function" (simplifying) based on the value of the `which` field of `std::variant`. Basically: choose 1 of N calls (N is fixed at compile time). There are different algorithms to do that: a jump table, an if-else chain checking each value once, an if-else doing a binary search... They all have different runtime characteristics. In general: a switch is O(1), if-elseif is O(n) and binary search is O(log n). So the stdlib chooses one of those algorithms at compile time, and the "different input" is the different (runtime) type the variant holds – Flamefire Nov 03 '19 at 11:39
  • @Flamefire: Of course. But note that "choose as fast as possible" **is** *O*(1) in this strange sense, because for large enough *n* the jump table wins. What the standard says is that the time "does not depend on the number of alternative types", which seems to be a stronger statement, but only in an unimplementable fashion. – Davis Herring Nov 03 '19 at 11:56
  • This is the very point of this question: from "does not depend on the number of alternative types" I conclude that a jump table is the only valid implementation and no other dispatch algorithm is allowed (even when it would have better running time). But it seems that if you are allowed to switch implementations, then it might be possible to use O(n) for small n and O(1) for larger n. See my comment on the answer of @sebrockm. I guess this was your point? That you can switch algorithms based on `N`, where `O(...)` becomes hard to define? – Flamefire Nov 03 '19 at 12:07
  • @Flamefire this is what I tried to explain in my answer. "Does not depend on the number of types" translates to "in O(*something*) there is no n in *something*". This in turn means it's O(1), as it is the only class without n. Also, if you "use an O(n) for small n" (where n is bounded by some constant C), then effectively it is no longer O(n) but O(C), which is O(1). – sebrockm Nov 03 '19 at 12:20

1 Answer


You are confusing time complexity and running time. See for example here.

This being said, it's important to realize what statements like "constant time" and "does not depend on" mean when talking about complexity: in this context, both are just synonyms for O(1). So the claim in the reddit comment is indeed correct.

In the case of visit, this means that there is some constant C such that visit never takes more than C CPU cycles, even for an insane number of types in the variant. In particular, it is of course allowed to stay below this number C.

So, to actually answer your question: using an if-elseif chain exclusively will usually have linear (i.e. O(n)) complexity, as you correctly state. So, unless the implementer knows for a fact that the compiler will optimize it into something constant, this is illegal.

A switch will very likely be compiled to a jump table, which works in O(1). This is especially likely when the cases form a contiguous range of integers, as they do here. So an implementation using a switch should be legal. Note, however, that for a difficult distribution of cases a switch may be compiled to a binary search, which is O(log n). See e.g. here.

And lastly, given what I said earlier, the compiler is of course allowed to apply any sort of optimization, as long as it doesn't move the algorithm into a worse complexity class. Even if that means the actual running time does depend on the number of types, the time complexity does not.

Update:

Based on your comments, let me give you an example of a function that is, complexity-wise, independent of n (a.k.a. "constant" or O(1)), while its actual runtime does depend on n:

#include <algorithm>
#include <limits>
#include <vector>

int get_max_of_first_50(std::vector<int> const& v)
{
    int max = std::numeric_limits<int>::min();
    // Look at no more than the first 50 elements (and never past the end).
    std::size_t const count = std::min<std::size_t>(50, v.size());
    for (std::size_t i = 0; i < count; ++i)
        if (v[i] > max)
            max = v[i];

    return max;
}

This function will always do 50 iterations or fewer, no matter how huge v.size() might be. Hence, complexity-wise, it is classified as a constant algorithm, independent of n, or O(1); all three statements are equivalent.

This may look like cheating, and to a certain degree I can understand this reasoning, but it is important to understand that this is the very reason why complexity was introduced this way in the first place:

  • Allowing certain simplifications in the analysis of algorithms, which can easily become very difficult otherwise.
  • Caring only about huge inputs. Usually, nobody cares if something is done in 5ms or 10ms. But 5 days vs. 10 days does matter.

BTW, this way of "cheating" is very common. Another famous example is the implementation of std::sort. Most library implementations that I've heard of use a variant of Quicksort (with some tricks, such as introsort's worst-case fallback, so it is O(n log(n)) even in the worst case). But for small sequences they usually switch to Insertion Sort, which is typically faster on very small ranges even though it is an O(n^2) algorithm by itself. As already explained, from the point of view of complexity this is fine, as long as its usage is bounded by a constant. And from the point of view of actual runtime it is of course also fine, as it is faster.
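The hybrid idea can be sketched as follows. This is a minimal illustration, not any library's actual std::sort: the names and the cutoff of 16 are my own illustrative choices, and real implementations use a tuned introsort rather than this plain merge-based recursion:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Below this size, fall back to insertion sort. Since the cutoff is a
// constant, the O(n^2) insertion sort contributes only O(1) work per call.
constexpr std::size_t kCutoff = 16;

// Classic insertion sort on the half-open range [lo, hi).
void insertion_sort(std::vector<int>& v, std::size_t lo, std::size_t hi)
{
    for (std::size_t i = lo + 1; i < hi; ++i)
        for (std::size_t j = i; j > lo && v[j - 1] > v[j]; --j)
            std::swap(v[j - 1], v[j]);
}

// Merge-sort-style recursion that hands small ranges to insertion sort.
// The overall complexity stays O(n log n) despite the O(n^2) helper,
// because the helper only ever runs on ranges of bounded size.
void hybrid_sort(std::vector<int>& v, std::size_t lo, std::size_t hi)
{
    if (hi - lo <= kCutoff) {
        insertion_sort(v, lo, hi);
        return;
    }
    std::size_t const mid = lo + (hi - lo) / 2;
    hybrid_sort(v, lo, mid);
    hybrid_sort(v, mid, hi);
    std::inplace_merge(v.begin() + mid - (mid - lo), v.begin() + mid, v.begin() + hi);
}
```

The same reasoning applies to std::visit: dispatching with an if-elseif chain only below some fixed number of alternatives, and a jump table above it, keeps the whole dispatch O(1).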

sebrockm

  • I actually think that OP knows the difference. OP asked why the standard forces a "slow" `O(1)` instead of a fast `O(n)` when, in practice, `n` is limited. – Jarod42 Oct 31 '19 at 15:19
  • Good explanation, although I was not confusing them. As you agreed, `if-elseif` is `O(n)` and hence disallowed for `std::visit`, which was my first point. But it seems you argue on the right track: if you implement an `if-elseif` for small N and a jump table for large N, you can still find a bound `C` for the general "super-algorithm" `std::visit`: namely the maximum `N` for which the `if` stuff is used. This feels super strange though, like cheating. I also disagree with `allowed to apply any sorts of optimizations`: if a compiler compiles my big switch into an if-else, it changed the complexity! – Flamefire Nov 03 '19 at 11:47
  • @Flamefire I shouldn't have said *any* optimization. Only optimizations that don't increase the complexity are allowed. But I believe this is the core fact that you didn't fully get yet: doing an O(n) only if n is smaller than some C is, in the world of complexity, considered O(1), because the number of operations is limited by a constant. I will update my answer to explain that a little better – sebrockm Nov 03 '19 at 12:33
  • I'll accept that. However, I then find the standard confusing in specifying that "`std::visit` does not depend on the number of types" when for small N it does/can. This might be relevant in some places, e.g. realtime (suddenly an operation takes slightly longer than all before) or side-effect observation (inferring the type of the variant by observing the std::visit dispatch) – Flamefire Nov 04 '19 at 10:04
  • @Flamefire I agree that this wording can be confusing. But in the world of complexity this wording is very common, and everybody familiar with the topic knows what it means. And given that in the standard this wording is used in the paragraph about Complexity, it should be kind of clear to the reader that "does not depend on" literally translates to "is O(1)" and not to "runtime will always be the same". – sebrockm Nov 04 '19 at 10:17
  • It's also not "5 days vs. 10 days" where we care, it's more like "5 days vs. 5 billion years" – Caleth Nov 04 '19 at 10:47