
To check how well I understand the ++c/c++ operators, I tried running these C programs:

int c = 5;
c = c - c++;
printf("%d\n", c);

This prints 1. I guess the logic is that the ++ is applied after the line of code where it's used, so c becomes c - c, which is 0, and on the "next line" it's increased by one. That seems strange to me, though, and I'd like to know in more detail what should happen with regard to operator precedence.

Now on to this:

int c = 5;
c = c - ++c;
printf("%d\n", c);

This one prints 0, and I can't really understand why. If the right-hand side is evaluated from left to right, I'd expect it to read c, which is 5, and then ++c, which is 6 since the increment should be applied immediately. Or does it evaluate ++c before the rest of the right-hand side, so that it's actually doing 6 - 6 because the increment also affects the first read of c?

Paul Renton
memememe
  • Are you asking about C or C++? The answer is actually different between the two in this case. – George Dec 08 '19 at 05:26

2 Answers

5

For C++ (all versions, explanation applies to C++11 and later):

Both have undefined behavior, meaning that not only is the value that it will return unspecified, but that it causes your whole program to behave in an undefined manner.

The reason for this is that evaluation order inside an expression is only specified for certain cases. The order in which expressions are evaluated does not follow the order in the source code and is not related to operator precedence or associativity. In most cases the compiler can freely choose in which order it will evaluate expressions, following some general rules (e.g. the evaluation of an operator is sequenced after the value computation of its operands, etc.) and some specific ones (e.g. &&'s and ||'s left-hand operands are always sequenced before their right-hand operands).

In particular the order in which the operands of - are evaluated is unspecified. It is said that the two operands are unsequenced relative to one another. This in itself means that we won't know whether c on the left-hand side of c - [...] will evaluate to the value of c before or after the increment.

There is however an even stricter rule forbidding the use of a value computation from a scalar object (here c) in a manner unsequenced relative to a side effect on the same scalar object. In your case both ++c and c++ cause side effects on c, but they are unsequenced with the use of the value on the left hand side of c - [...]. Not following this rule causes undefined behavior.

Therefore your compiler is allowed to output whatever it wants and you should avoid writing code like that.

For a detailed list of all the evaluation order rules of C++, see cppreference.com. Note that they changed somewhat across C++ versions, with more and more previously undefined or unspecified behavior becoming defined. None of these changes applies to your particular case though.

walnut
  • I don't understand why it would cause the whole program to behave undefinedly though. Is it just because the value could be anything and thus by using it I influence the rest of the program, or is there something more going on? – memememe Dec 09 '19 at 08:27
  • @memememe This is just how C++ works. The C++ standard says that doing this is undefined behavior, meaning that the compiler is allowed to do whatever. In practice compilers will usually do something that is sensible to some degree (because they have no reason to be malicious on purpose), but you have no guarantee. Usually the motivation behind making stuff like this undefined behavior is that it allows the compiler to optimize code more aggressively or makes some other task of the compiler simpler. I don't know the specific reason for making this case undefined behavior off the top of my head. – walnut Dec 09 '19 at 14:29
  • See also [Undefined, unspecified and implementation-defined behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior). – walnut Dec 09 '19 at 14:29
3
c = c - c++;

In C, this is a very bad idea(a). You are not permitted to modify an object and also modify or use that same object without an intervening sequence point, and the subtraction operator is not a sequence point.

The things that are sequence points can be found in Annex C of the ISO standard.


(a) Technically, the behaviour of each operation (the evaluation of c and c++, and the assignment to c) is well defined, but the sequencing is either unsequenced or indeterminate. In the former case, actions from each part can interleave while, in the latter, they do not interleave but you don't know in which order the two parts will be done.

However, the standard C11 6.5/2 also makes it clear that a sequencing issue using the same variable is undefined behaviour:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

Bottom line, it's not something you should be doing.

S.S. Anne
paxdiablo
  • Does this mean that it is not *undefined behavior*, but merely that the value is the result of either of the evaluation orders? I would expect C to have weaker guarantees than C++ for something like this. – walnut Dec 08 '19 at 06:03
  • Is this undefined behavior or unspecified behavior? – Aykhan Hagverdili Dec 08 '19 at 06:45
  • @walnut et al, undefined behaviour means *anything* could happen. Indeterminate sequencing of actions `A` and `B` simply means it could be `AB` or `BA`. It does not have the same free-for-all set of possibilities that UB does. For unsequenced operations, it's even worse since it could be something like `A-part1, B, A-part2`. – paxdiablo Dec 08 '19 at 08:06
  • @paxdiablo I was asking, because I expect C to have an equivalent rule to C++ which not only causes indeterminate or unsequenced behavior, but also explicitly undefined behavior if side effects on a variable are unsequenced with the use of a value of the same. Looking through the [last C18 draft](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf), I think that rule is indeed present in §6.5/2. – walnut Dec 08 '19 at 08:12
  • Interesting. I normally don't worry too much about C17 since it wasn't a *real* iteration (it was only done that way because ISO rules forbid more than 5 TRs (or some other sort of change) before a re-issue). Hence it was mostly just a bug-fix release. That's probably the wrong attitude on my part and irrelevant in this case since the *same* text appears in C11. Keep in mind that clause is for *unsequenced* operations, I'm not sure whether `var - var++` is unsequenced or indeterminately sequenced. – paxdiablo Dec 08 '19 at 08:30
  • The behavior of a program that attempts to execute that expression is undefined in every revision of the C standard, for the same exact reason (it was phrased slightly differently: pre-11 lack of sequence point, post-11, lack of sequencing). – Cubbi Jan 09 '20 at 17:42
  • (you also appear to be mixing together behavior (undefined, unspecified, etc) and sequencing (unsequenced, indeterminate, etc): these are different terminology categories that apply to different things) – Cubbi Jan 09 '20 at 18:15
  • @Cubbi, I'm not sure why you think I'm mixing them up, I *clearly* called this out as a sequencing issue (covered in J1 Unspecified rather than J2 Undefined). However, you're correct that other parts of the standard make it undefined, so I'll try to clarify. – paxdiablo Jan 09 '20 at 21:28