Is the three-way comparison operator always efficient?

Question

Herb Sutter, in his proposal for the "spaceship" operator (section 2.2.2, bottom of page 12), says:

Basing everything on <=> and its return type: This model has major advantages, some unique to this proposal compared to previous proposals for C++ and the capabilities of other languages:

[...]

(6) Efficiency, including finally achieving zero-overhead abstraction for comparisons: The vast majority of comparisons are always single-pass. The only exception is generated <= and >= in the case of types that support both partial ordering and equality. For <, single-pass is essential to achieve the zero-overhead principle to avoid repeating equality comparisons, such as for struct Employee { string name; /*more members*/ }; used in struct Outer { Employeee; /*more members*/ }; – today’s comparisons violates zero-overhead abstraction because operator< on Outer performs redundant equality comparisons, because it performs if (e != that.e) return e < that.e; which traverses the equal prefix of e.name twice (and if the name is equal, traverses the equal prefixes of other members of Employee twice as well), and this cannot be optimized away in general. As Kamiński notes, zero-overhead abstraction is a pillar of C++, and achieving it for comparisons for the first time is a significant advantage of this design based on <=>.

But then he gives this example (section 1.4.5, page 6):

class PersonInFamilyTree { // ...
public:
  std::partial_ordering operator<=>(const PersonInFamilyTree& that) const {
    if (this->is_the_same_person_as ( that)) return partial_ordering::equivalent;
    if (this->is_transitive_child_of( that)) return partial_ordering::less;
    if (that. is_transitive_child_of(*this)) return partial_ordering::greater;
    return partial_ordering::unordered;
  }
  // ... other functions, but no other comparisons ...
};

Would define operator>(a,b) as a<=>b > 0 not lead to large overhead? (though in a different form than he discusses). That code would first test for equality, then for less, and finally for greater, rather than only and directly testing for greater.

Am I missing something here?

I agree with you. In this case, where two items being unrelated is a possibility, and therefore an exhaustive search is required, implementation in terms of spaceship operator would provide for more maintainability at the possible expense of a suboptimal runtime experience. — Richard Hodges, Jan 02 '18 at 18:31
What about optimization? If You do ```a<=>b > 0``` what's the use of going into the less and equivalent branches? — Robert Andrzejuk, Jan 31 '18 at 22:19
@RobertAndrzejuk - that only can happen if the compiler can prove that both `is_the_same_person_as` and `is_transitive_child_of` are side-effect-free, and that no more than one of the if statement conditions can ever be true at the same time. Highly unlikely, in other words. — TLW, Sep 25 '22 at 17:43

score 10 · Accepted Answer · answered Jan 02 '18 at 18:32

10

Would define operator>(a,b) as a<=>b > 0 not lead to large overhead?

It would lead to some overhead. The magnitude of the overhead is relative, though - in situations when costs of running comparisons are negligible in relation to the rest of the program, reducing code duplication by implementing one operator instead of five may be an acceptable trade-off.

However, the proposal does not suggest removing other comparison operators in favor of <=>: if you want to overload other comparison operators, you are free to do it:

Be general: Don’t restrict what is inherent. Don’t arbitrarily restrict a complete set of uses. Avoid special cases and partial features. – For example, this paper supports all seven comparison operators and operations, including adding three-way comparison via <=>. It also supports all five major comparison categories, including partial orders.

answered Jan 02 '18 at 18:32

Sergey Kalinichenko

714,442
84
1,110
1,523

1

Sure, you can still overload the operators manually, but Herb's comment of "finally achieving zero-overhead abstraction for comparisons" seems to not always be true. – Cris Luengo Jan 03 '18 at 17:39
Herb's use of *zero-overhead abstraction* in this case means that given a type with only `operator<` and `operator==` (from a library for example) the the user could not write a better implementation of the question `<=` by hand than the compiler, if the library had instead provided, for example `operator<=>`. The use of this new language feature will (in the case of an efficient three-way comparison as in lexicographic ordering) not incur overhead for the users down the line. That is not true in the question above though but since you don't pay for it if you don't use it, this is ok. – Araeos Feb 03 '20 at 12:57
@Araeos - "since you don't pay for it if you don't use it" is untrue if, for instance, the library function you're using decides to implement `operator<=` / etc in terms of `<=>`. – TLW Sep 25 '22 at 17:45

score 9 · Answer 2 · answered Jan 02 '18 at 18:39

9

For some definition of large. There is overhead because in a partial ordering, a == b iff a <= b and b <= a. The complexity would be the same as a topological sort, O(V+E). Of course, the modern C++ approach is to write safe, clean and readable code and then optimizing. You can choose to implement the spaceship operator first, then specialize once you determine performance bottlenecks.

answered Jan 02 '18 at 18:39

user9163035

237
1
4

I agree with your comment about writing clean code first and optimizing later. That's always solid advice! – Cris Luengo Jan 03 '18 at 17:36

Nicol Bolas · Answer 3 · 2018-02-17T01:56:23.983

Generally speaking, overloading <=> makes sense when you're dealing with a type where doing all comparisons at once is either only trivially more expensive or has the same cost as comparing them differently.

With strings, <=> seems more expensive than a straight == test, since you have to subtract each pair of two characters. However, since you already had to load each pair of characters into memory, adding a subtraction on top of that is a trivial expense. Indeed, comparing two numbers for equality is sometimes implemented by compilers as a subtraction and test against zero. And even for compilers that don't do that, the subtract and compare against zero is probably not significantly less efficient.

So for basic types like that, you're more or less fine.

When you're dealing with something like tree ordering, you really need to know up-front which operation you care about. If all you asked for was ==, you really don't want to have to search the rest of the tree just to know that they're unequal.

But personally... I would never implement something like tree ordering with comparison operators to begin with. Why? Because I think that comparisons like that ought to be logically fast operations. Whereas a tree search is such a slow operation that you really don't want to do it by accident or at any time other than when it is absolutely necessary.

Just look at this case. What does it really mean to say that a person in a family tree is "less than" another? It means that one is a child of the other. Wouldn't it be more readable in code to just ask that question directly with is_transitive_child_of?

The more complex your comparison logic is, the less likely it is that what you're doing is really a "comparison". That there's probably some textual description that this "comparison" operation could be called that would be more readable.

Oh sure, such a class could have a function that returns a partial_order representing the relationship between the two objects. But I wouldn't call that function operator<=>.

But in any case, is <=> a zero-overhead abstraction of comparison? No; you can construct cases where it costs significantly more to compute the ordering than it does to detect the specific relation you asked for. But personally if that's the case, there's a good chance that you shouldn't be comparing such types through operators at all.

Is the three-way comparison operator always efficient?

3 Answers3

Linked