16

Why does the following print bD aD aB aA aC aU instead of aD aB aA aC bD aU? In other words, why is b-- evaluated before --++a--++?

#include <iostream>
using namespace std;

class A {
    char c_;
public:
    A(char c) : c_(c) {}
    A& operator++() {
        cout << c_ << "A ";
        return *this;
    }
    A& operator++(int) {
        cout << c_ << "B ";
        return *this;
    }
    A& operator--() {
        cout << c_ << "C ";
        return *this;
    }
    A& operator--(int) {
        cout << c_ << "D ";
        return *this;
    }
    void operator+(A& b) {
        cout << c_ << "U ";
    }
};

int main()
{
    A a('a'), b('b');
    --++a-- ++ +b--;  // the culprit
}

From what I gather, here's how the expression is parsed by the compiler:

  • Preprocessor tokenization: -- ++ a -- ++ + b --;
  • Operator precedence¹: (--(++((a--)++))) + (b--);
  • + is left-to-right associative, but nonetheless the compiler may choose to evaluate the expression on the right (b--) first.

I'm assuming the compiler chooses to do it this way because it leads to better optimized code (fewer instructions). However, it's worth noting that I get the same result when compiling with /Od (MSVC) and -O0 (GCC). This brings me to my question:

Since I was asked this on a test which should in principle be implementation/compiler-agnostic, is there something in the C++ standard that prescribes the above behavior, or is it truly unspecified? Can someone quote an excerpt from the standard which confirms either? Was it wrong to have such a question on the test?

¹ I realize the compiler doesn't really know about operator precedence or associativity, rather it cares only about the language grammar, but this should get the point across either way.

Konstantin
  • 7
    Had previously closed as dupe of http://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points, but then noticed this was about overloaded operators. – Oliver Charlesworth Apr 06 '17 at 18:59
  • There is no constructor print, I assume you are printing `char c` in the constructor right? – Fantastic Mr Fox Apr 06 '17 at 19:03
  • @OliverCharlesworth Is there a distinction? It's still a ++/-- operator, so why isn't this undefined behavior? – Christopher Schneider Apr 06 '17 at 19:03
  • 2
    @ChristopherSchneider - Because with overloaded operators, these are function calls, and so introduce sequence points (or whatever they've been rebranded as in later variants of the C++ spec ;) – Oliver Charlesworth Apr 06 '17 at 19:05
  • @Fox No, the `c_` is printed in the overloaded operator methods to distinguish which instance of `A` the operator is for. – Konstantin Apr 06 '17 at 19:06
  • 2
    post-increment is supposed to return by value http://en.cppreference.com/w/cpp/language/operator_incdec – Ramon Apr 06 '17 at 19:07
  • 1
    @Ramon I am well aware of that, but that's not the point of the question. This isn't production code. – Konstantin Apr 06 '17 at 19:08
  • 2
    @KonstantinĐ. So from what you conclude, it boils down to `function1() + function2()`? We know the order is unspecified if that's what it all winds up being. – PaulMcKenzie Apr 06 '17 at 19:10
  • Also see [Why doesn't a+++++b work in C?](http://stackoverflow.com/q/5341202/1708801) – Shafik Yaghmour Apr 06 '17 at 19:15
  • @Paul Indeed. But I'm asking because I thought there may be something in the standard that I'm not aware of which _imposes_ this evaluation order since, again, this was a question on a test that's supposed to be implementation independent. – Konstantin Apr 06 '17 at 19:17
  • @Shafik Not really pertinent since the operators are overloaded to always return an lvalue. But thanks anyway. – Konstantin Apr 06 '17 at 19:17
  • 1
    Clang does not evaluate `b--` first. – AnT stands with Russia Apr 06 '17 at 19:24
  • @AnT Thank you — this is something I can use if the grader decides to deduct points for this. I'm also assuming that, therefore, it must be unspecified by the standard (as previously assumed). – Konstantin Apr 06 '17 at 19:26
  • @KonstantinĐ. -- Since you say this was a question on a test, and if you got marked wrong, have the teacher explain why the answer they think is right is actually correct. Have them quote the standard to you as to why they're correct. Bet they don't. – PaulMcKenzie Apr 06 '17 at 19:36
  • 2
    @KonstantinĐ. It is a common mistake to confuse operator precedence with evaluation order. The former is used for parsing the syntax; evaluation order is determined by something else, namely which operation needs which data. – Slava Apr 06 '17 at 20:03
  • Short answer, "it isn't (necessarily)". – Toby Speight Apr 07 '17 at 08:34
  • 1
    Remember that precedence has nothing to do with evaluation of *operands*, only *execution of operators*. When you say `(a() + b()) - (c() * d())` then you know that `a()` and `b()` come before anything is added, that `c()` and `d()` come before anything is multiplied, and that all of the above happens before anything is subtracted. But the order in which `a()`, `b()`, `c()` and `d()` are called is not specified. If the compiler wants to generate it as `t4 = d(), t2 = b(), t1 = a(), t3 = c(), t5 = t1 + t2, t6 = t4 * t3, result = t5 - t6`, the compiler is entirely within its rights to do so. – Eric Lippert Apr 07 '17 at 11:22

5 Answers

17

The expression statement

--++a-- ++ +b--;  // the culprit

can be represented the following way

at first like

( --++a-- ++ )  + ( b-- );

then like

( -- ( ++ ( ( a-- ) ++ ) ) )  + ( b-- );

and at last like

a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator  + ( b.operator --( 0 ) );

Here is a demonstrative program.

#include <iostream>
using namespace std;

class A {
    char c_;
public:
    A(char c) : c_(c) {}
    A& operator++() {
        cout << c_ << "A ";
        return *this;
    }
    A& operator++(int) {
        cout << c_ << "B ";
        return *this;
    }
    A& operator--() {
        cout << c_ << "C ";
        return *this;
    }
    A& operator--(int) {
        cout << c_ << "D ";
        return *this;
    }
    void operator+(A& b) {
        cout << c_ << "U ";
    }
};

int main()
{
    A a('a'), b('b');
    --++a-- ++ +b--;  // the culprit

    std::cout << std::endl;

    a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator  + ( b.operator --( 0 ) );

    return 0;
}

Its output is

bD aD aB aA aC aU 
bD aD aB aA aC aU 

You can imagine the last expression written in the functional form like a postfix expression of the form

postfix-expression ( expression-list ) 

where the postfix expression is

a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator  +

and the expression-list is

b.operator --( 0 )

In the C++ Standard (5.2.2 Function call) it is stated that

8 [Note: The evaluations of the postfix expression and of the arguments are all unsequenced relative to one another. All side effects of argument evaluations are sequenced before the function is entered (see 1.9). —end note]

So it is unspecified whether the argument or the postfix expression is evaluated first. According to the output shown, the compiler evaluates the argument first and only then the postfix expression.

Vlad from Moscow
14

I would say they were wrong to include such a question.

Except as noted, the following excerpts are all from §[intro.execution] of N4618 (and I don't think any of this stuff has changed in more recent drafts).

Paragraph 16 has the basic definition of sequenced before, indeterminately sequenced, etc.

Paragraph 18 says:

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.

In this case, you're (indirectly) calling some functions. The rules there are fairly simple as well:

When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.

Putting that into bullet points to more directly indicate order:

  1. First evaluate the function arguments, and whatever designates the function being called.
  2. Evaluate the body of the function itself.
  3. Evaluate another (sub-)expression.

No interleaving is allowed unless something starts up a thread to allow something else to execute in parallel.
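
A minimal sketch of the "no interleaving" point, assuming two hypothetical functions `f` and `g` that each record when their body begins and ends (the names and the `events` log are illustrative, not from the question). Whichever call runs first, its body finishes before the other one starts:

```cpp
#include <string>
#include <vector>

// Hypothetical trace of what happened, in order.
std::vector<std::string> events;

int f() { events.push_back("f:begin"); events.push_back("f:end"); return 1; }
int g() { events.push_back("g:begin"); events.push_back("g:end"); return 2; }

// Evaluates f() + g() and returns the recorded trace.
std::vector<std::string> trace_calls() {
    events.clear();
    int sum = f() + g();  // the two calls are indeterminately sequenced
    (void)sum;            // the sum itself is not interesting here
    return events;
}
```

Calling `trace_calls()` from `main` should yield either the `f` pair followed by the `g` pair or the other way round, but never something like `f:begin g:begin ...`.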

So, does any of this change because we're invoking the functions via operator overloads rather than directly? Paragraph 19 says "No":

The sequencing constraints on the execution of the called function (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be.

§[expr]/2 also says:

Uses of overloaded operators are transformed into function calls as described in 13.5. Overloaded operators obey the rules for syntax and evaluation order specified in Clause 5, but the requirements of operand type and value category are replaced by the rules for function call.

Individual operators

The only operators you've used that have somewhat unusual requirements with respect to sequencing are post-increment and post-decrement. These say (§[expr.post.incr]/1):

The value computation of the ++ expression is sequenced before the modification of the operand object. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single postfix ++ operator. —end note ]

In the end, however, this is pretty much just what you'd probably expect: if you pass x++ as a parameter to a function, the function receives the previous value of x, but if x is also in scope inside the function, x will have the incremented value by the time the body of the function starts to execute.
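
The behaviour described above can be sketched as follows, assuming a hypothetical global `x` (so the callee can see it) and an illustrative helper `observe` that reports both the parameter it received and the value of `x` at that moment:

```cpp
// Hypothetical global, visible inside the called function.
int x = 1;

// What the callee saw: its parameter and the global at call time.
struct Seen {
    int param;
    int global;
};

Seen observe(int param) {
    // By the time the body runs, the side effect of x++ has completed.
    return Seen{param, x};
}

Seen demo_post_increment() {
    x = 1;
    return observe(x++);  // passes the old value, then the body sees the new one
}
```

Here `demo_post_increment()` should report `param == 1` and `global == 2`.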

The + operator, however, does not specify ordering of the evaluation of its operands.

Summary

Using overloaded operators does not enforce any sequencing on the evaluation of sub-expressions within an expression, beyond the fact that evaluating an individual operator is a function call, and has the ordering requirements of any other function call.

More specifically, in this case, b-- is the argument to a function call, and --++a-- ++ is the expression that designates the function being called (or at least the object on which the function will be called; the + designates the function within that object). As noted, ordering between these two is not specified (nor does operator + specify an order of evaluating its left vs. right operand).
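
If a particular order is wanted, one option is to split the expression into separate statements, since each full statement completes before the next begins. The following is only a sketch: it reuses the class `A` from the question but writes to a `std::ostringstream` named `out` (an assumption made here so the result can be inspected), and `sequenced_version` is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>

// Stands in for cout so the produced text can be returned and checked.
std::ostringstream out;

class A {
    char c_;
public:
    A(char c) : c_(c) {}
    A& operator++()    { out << c_ << "A "; return *this; }
    A& operator++(int) { out << c_ << "B "; return *this; }
    A& operator--()    { out << c_ << "C "; return *this; }
    A& operator--(int) { out << c_ << "D "; return *this; }
    void operator+(A&) { out << c_ << "U "; }
};

std::string sequenced_version() {
    out.str("");          // start from an empty buffer
    A a('a'), b('b');
    A& lhs = --++a-- ++;  // first statement: the whole left-hand chain
    A& rhs = b--;         // second statement: the right-hand operand
    lhs + rhs;            // third statement: the addition itself
    return out.str();
}
```

Written this way, the result should be `aD aB aA aC bD aU ` regardless of which operand the compiler would have picked first in the single-expression form.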

Jerry Coffin
  • 1
    [over.match.oper] in C++17 specifies that for overloaded operators, "the operands are sequenced in the order prescribed for the built-in operator". Regardless, built-in `+` doesn't sequence, and so neither does an overloaded `+`. – T.C. Apr 06 '17 at 21:07
7

There is nothing in the C++ standard which says things need to be evaluated in this way. C++ has the concept of sequenced-before, where some operations are guaranteed to happen before other operations are. This is a partially-ordered set; that is, some operations are sequenced before others, two operations can’t be sequenced before each other, and if a is sequenced before b, and b is sequenced before c, then a is sequenced before c. However, there are many types of operation which have no sequenced-before guarantees. Before C++11, there was instead a concept of a sequence point, which isn’t quite the same but similar.

Very few operators (only ,, &&, ?:, and ||, I believe) guarantee a sequence point between their arguments (and even then, until C++17, this guarantee doesn’t exist when the operators are overloaded). In particular, the addition does not guarantee any such thing. The compiler is free to evaluate the left-hand side first, to evaluate the right-hand side first, or (I think) even to evaluate them simultaneously.
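
For contrast, here is a small sketch of the built-in operators that do guarantee left-to-right evaluation. The helpers `first` and `second` and the `trail` string are hypothetical names used only to record the order; the operands are plain `bool`, so the built-in `&&` and `,` are used rather than overloads.

```cpp
#include <string>

// Hypothetical record of the order in which the helpers ran.
std::string trail;

bool first()  { trail += "1"; return true; }
bool second() { trail += "2"; return true; }

std::string logical_and_order() {
    trail.clear();
    bool both = first() && second();  // built-in &&: left operand first
    (void)both;
    return trail;
}

std::string comma_order() {
    trail.clear();
    (first(), second());              // built-in comma: left operand first
    return trail;
}
```

Both functions should return `"12"`; replacing `&&` with `+` would remove that guarantee.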

Changing optimization options or compilers can sometimes change the results. Apparently you aren’t seeing that, but there are no guarantees here.

Daniel H
  • ?: operator also guarantees that the condition expression is fully evaluated before the selected conditional expression. – Tomek Apr 06 '17 at 19:17
  • I like this answer most because it's very easy to understand, even if the higher voted ones are more detailed. – iFreilicht Apr 07 '17 at 09:06
4

Operator precedence and associativity rules are only used to convert your expression from the original "operators in expression" notation to the equivalent "function call" format. After the conversion you end up with a bunch of nested function calls, which are processed in the usual way. In particular, order of parameter evaluation is unspecified, which means that there's no way to say which operand of the "binary +" call will get evaluated first.

Also, note that in your case binary + is implemented as a member function, which creates certain superficial asymmetry between its arguments: one argument is "regular" argument, another is this. Maybe some compilers "prefer" to evaluate "regular" arguments first, which is what leads to b-- being evaluated first in your tests (you might end up with different ordering from the same compiler if you implement your binary + as a freestanding function). Or maybe it doesn't matter at all.

Clang, for example, begins with evaluating the first operand, leaving b-- for later.
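
A sketch of the free-function variant mentioned above, under the assumption of a hypothetical empty type `Tag` and two illustrative helpers `left` and `right` that log when they are evaluated. With a non-member `operator+` both operands are ordinary parameters, yet the standard still leaves their evaluation order open:

```cpp
#include <string>
#include <vector>

// Hypothetical trace of evaluations, in the order they happened.
std::vector<std::string> order_log;

struct Tag {};

// Free-function form: neither operand is `this`.
void operator+(Tag&, Tag&) { order_log.push_back("operator+"); }

Tag left_tag, right_tag;

Tag& left()  { order_log.push_back("left");  return left_tag; }
Tag& right() { order_log.push_back("right"); return right_tag; }

std::vector<std::string> trace_sum() {
    order_log.clear();
    left() + right();  // either operand may be evaluated first
    return order_log;
}
```

The only thing that can be relied on is that `operator+` is logged last; whether `left` or `right` comes first may differ between compilers.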

AnT stands with Russia
  • Replacing the method with a `friend` function doesn't change the output on any of the compilers I tried. What you said about the ‘asymmetry’ is interesting, though — I wonder if it's actually true on some compilers (if anyone can back this up, I'd appreciate it). – Konstantin Apr 06 '17 at 19:49
-1

Take into account the precedence of operators in C++:

  1. a++ a-- Suffix/postfix increment and decrement. Left-to-right
  2. ++a --a Prefix increment and decrement. Right-to-left
  3. a+b a-b Addition and subtraction. Left-to-right

Keeping the list in your mind you can easily read the expression even without parentheses:

--++a--+++b--;//will follow with
--++a+++b--;//and so on
--++a+b--;
--++a+b;
--a+b;
a+b;

And don't forget the essential difference between prefix and postfix operators in terms of the order in which the variable and the expression are evaluated.