
Possible Duplicate:
Undefined Behavior and Sequence Points

As defined in the standard, E1 += E2 is almost the same as E1 = E1 + E2, except that E1 is evaluated only once. Given that, would "p += (*p)++ + c;" cause undefined behavior?

Try the following code in gcc/g++ (4.7 / 4.4). There are two kinds of results: bxxxxx (g++ 4.7) or axbxxx (gcc, g++ 4.4). If we execute (1) instead of (2) in the code, we can only get axbxxx.

#include <stdio.h>

int main() {
    char s[] = "axxxxx";
    char *p = s;

    /* p - s has type ptrdiff_t, so cast it to match %d */
    printf("s = %s in the beginning.\n"
           "p is pointed at the %d-th char.\n", s, (int)(p - s));
    //p = p + (*p)++ * 3 + 2 - 'a' * 3; // (1)
    p += (*p)++ * 3 + 2 - 'a' * 3; // (2)
    printf("p is moved ahead by %d steps\n", (int)(p - s));
    printf("s = %s after the operation.\n", s);
    return 0;
}

I can't see why it would cause undefined behavior, nor can I assert that it's a bug in gcc.

For the axbxxx result, I also can't understand why the operand of the postfix ++ is evaluated twice (once to get the value, and again later to store it). Since the standard says "1 ... is added to it", I think the address should be evaluated only once. If the address of the operand of the postfix ++ is evaluated only once, the effect of the expression will be the same regardless of the order in which the assignments are executed.

=== UPDATE ===

After reading the document linked in the first comment, I think the following rule may matter:

"2) Furthermore, the prior value shall be accessed only to determine the value to be stored."

So, would the access of p in "p = p + (*p)++ * 3 + c" be considered part of the "prior value" of *p, one that has nothing to do with the value to be stored in *p?

IMO, this rule is not violated.

Zhe Yang

4 Answers


No, p = p + (*p)++ * 3 + c doesn't cause any undefined behavior, assuming that p does not point at c.

In this case the questionable part is the read and modification of the value of *p inside the expression. However, that value is read for the purpose of determining the new value of p (there's a straightforward data dependency of the new value of p on the value read in *p), so it does not violate the requirements.

I'd guess that the bug in the compiler is actually rooted in its incorrect behavior in an unspecified situation. Note that the expression has two side-effects: store the new value in p and store the new value in *p. It is unspecified in which order these side-effects occur. However, during the evaluation of the (*p)++ subexpression the compiler was supposed to "fix" the specific lvalue argument of ++ to make sure that the new (incremented) value was stored in that exact object. It looks like the older version of the compiler failed to do that, i.e. the new value of p is evaluated first and then the new value of *p is stored through the new value of p. This is obviously incorrect.

AnT stands with Russia

In principle, the statement p += (*p)++ + c; is potentially correct. All it does is advance a pointer (p) by some value, which happens to be determined by a variable to which p points.

You just have to make sure that you never increment p beyond s + 7. I didn't check your code carefully to see whether that's the case (but note that you're making certain encoding contiguity assumptions).

Kerrek SB

Note that p += x; is not equivalent to p = p + x;, but to p = p + (x);. x is evaluated first and the result is added to p. In the formula without parentheses, as given in the title, an intermediate value for p could point outside of the array, which is indeed undefined behaviour. The version in your code should be fine as long as the result for x is inside the array.

6.5.16.2.3 A compound assignment of the form E1 op= E2 differs from the simple assignment expression E1 = E1 op (E2) only in that the lvalue E1 is evaluated only once.

J.2 Undefined behavior - Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).

This UB definition is not restricted to the final result of the assignment.

Secure

p += (*p)++ * 3 + 2 - 'a' * 3 is not of the form E1 = E1 + E2.

  • On the right side you have a pointer (address variable).
  • On the left side you increment the variable addressed by this pointer.

EDIT: Spotted p+=

Still not undefined, because whatever the order of evaluation of each expression on the right side of E1 = E1 + E2, the value of p is unchanged.

UmNyobe