5

I have another C pointers question.

Consider executing the following program:

int x[5] = {0,3,5,7,9};
int* y = &x[2];
*(y+2) = *(y--);

What values does the array x hold afterwards?

What the hell is going on with y--? I know how *(y+2) works, and understand the rest, but not how y-- ties in with the rest.

Also, the answer given is {0, 3, 5, 5, 9}.

cmcsorley17
  • 146
  • 8
  • 1
    "answer given" by whom? Why should you trust them? – n. m. could be an AI Dec 18 '14 at 14:40
  • 1
    Answer given by the university. I have no reason not to trust them, and going back and looking at similarly structured questions, it seems they are consistent in having the equivalent of {0,3,5,5,9} as the answer each time. (Which, for my purposes, is all I'm interested in.) – cmcsorley17 Dec 18 '14 at 15:14
  • 5
    "have no reason not to trust them". Well, *now* you have a reason, and a rather solid one. These guys are incompetent on a very basic level and should not be let within a mile of teaching programming. If you are only interested in being consistent with them, i.e. getting grades, that's understandable, but it has nothing to do with being correct. – n. m. could be an AI Dec 18 '14 at 16:35
  • @2501: It seems you're misusing bounties; they don't exist to award extra points to an existing answer. Furthermore, it now prevents this question from being closed as a duplicate. – Oliver Charlesworth Dec 22 '14 at 15:55
  • *One or more of the answers is exemplary and worthy of an additional bounty.*?? SO obviously thinks otherwise. – 2501 Dec 22 '14 at 15:56
  • @2501: Interesting, I didn't realise that was an official bounty reason. So apologies; I stand corrected! Nevertheless, despite the good answer, this question is clearly a duplicate. The answers below don't really offer anything that isn't already covered in the answers I linked to above... – Oliver Charlesworth Dec 22 '14 at 15:57
  • @OliverCharlesworth I don't agree, this is a specific example, whose answer involves sequence points. That doesn't make it a duplicate since the question isn't about them but a specific piece of code. Also dereferences make it more complicated. – 2501 Dec 22 '14 at 15:59
  • 1
    @2501: There are an infinite number of possible questions, all of which boil down to some straightforward code that is simply missing sequence points. Having answers scattered between them benefits nobody, especially when there are already canonical answers on existing questions. – Oliver Charlesworth Dec 22 '14 at 16:00
  • @OliverCharlesworth Again I don't agree. This is an interesting example that might benefit a non expert. Given the number of incorrect answers it received, most of which are deleted, it does. The unusual syntax is what makes it more original. – 2501 Dec 22 '14 at 16:04
  • @2501: I suppose we'll have to agree to disagree. That there are a number of people who aren't familiar with sequence points doesn't mean that this isn't a duplicate. One of the answers to the linked question discusses an essentially identical code construct (`a[i] = i++`). – Oliver Charlesworth Dec 22 '14 at 16:09
  • @OliverCharlesworth I just noticed, how can you consider a closing as duplicate that is C++, while this is C? – 2501 Dec 22 '14 at 16:16
  • @2501: Fair point, that's C++. Try this other classic instead! http://stackoverflow.com/questions/949433/why-are-these-constructs-using-undefined-behavior (see the second answer). – Oliver Charlesworth Dec 22 '14 at 16:24
  • @OliverCharlesworth I don't mind if you close it after the bounty period. Hopefully a decent amount of readers will learn something in the meantime. – 2501 Dec 22 '14 at 16:33

4 Answers

14

There's no sequence point between y-- and y + 2 in *(y+2) = *(y--);, so whether y + 2 refers to &x[4] or &x[3] is unspecified. Depending on how your compiler does things, you can either get 0 3 5 5 9 or 0 3 5 7 5.

In a nutshell, the lack of a sequence point between the two expressions means it is not specified whether the side effect of one operation (the decrement in y--) has been applied by the time the other (y + 2) is evaluated. You can read more about sequence points here.

ISO/IEC 9899:201x

6.5 Expressions

p2: If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

Wintermute
  • 42,983
  • 5
  • 77
  • 80
  • 3
    Actually you may just as well get `42 42 42 42 42` or a formatted hard drive - undefined behavior doesn't mean that the compiler has to pick one of the "reasonable" options! – Voo Dec 22 '14 at 15:47
  • The behavior is not undefined but unspecified, meaning that there are two or more (two in this case) ways it can go and it is unspecified which is taken (see 3.4.4 in C99, as opposed to 3.4.3). This particular case is mentioned in appendix J.1 on page 487 (bottom of the page). – Wintermute Dec 22 '14 at 16:03
  • 1
    @Wintermute Actually it is undefined, as you can see in my edit to your answer. @Voo gave a good description of undefined behavior. – 2501 Dec 22 '14 at 16:06
  • Oh, then that was changed in C11. The same passage in C99 only says that the behavior is undefined if the value of `y` is changed twice in the expression. (I'm afraid I don't have the C11 standard to compare) – Wintermute Dec 22 '14 at 16:10
  • @Wintermute It is also undefined in C99. The second paragraph has a footnote that explicitly shows that. – 2501 Dec 22 '14 at 16:22
  • J.1 has it explicitly under "unspecified behavior," though. Language law aside, I think we're all agreed that depending on this sort of thing is a Bad Idea™, and that whether or not an implementation that formats your hard drive in such cases is still standard-compliant, it is unlikely that such an implementation will be written. – Wintermute Dec 22 '14 at 16:33
  • 1
    @Wintermute There is an important difference between unspecified and undefined. You can actually write a valid and portable program with unspecified behavior, but not with undefined. – 2501 Dec 22 '14 at 16:35
  • I'd argue that a program that depends on unspecified (or even implementation-defined) behavior is not portable. – Wintermute Dec 22 '14 at 16:38
  • Actually it is done all the time. You just don't realize it. Here is a completely defined and portable program with unspecified behavior: http://ideone.com/RBTNPC or as simple as this: http://ideone.com/3H4Xuo – 2501 Dec 22 '14 at 17:07
  • Ah. I concede the point. – Wintermute Dec 22 '14 at 17:24
  • While there is only one side effect in the program, the value of the scalar object is used in a value computation as I understand it. Footnote 84 in John Bode's answer gives pretty much the same example and lists it as undefined. @2501 gives great examples of why it's an important difference though - hadn't considered it like that before either! – Voo Dec 22 '14 at 18:56
10

You should not trust the answers given by your professor in this case.

Expanding on Wintermute's answer a bit...

The problem is with the statement

*(y+2) = *(y--);

The expression y-- evaluates to the current value of y, and as a side effect decrements the variable. For example:

int a = 10;  
int b;

b = a--;

After the above expression has been evaluated, b will have the value 10 and a will have the value 9.

However, the C language does not require that the side effect be applied immediately after the expression has been evaluated, only that it be applied before the next sequence point (which in this case is at the end of the statement). Neither does it require that expressions be evaluated from left to right (with a few exceptions). Thus, there is no guarantee whether the y in y+2 holds its value from before or after the decrement operation.

The C language standard explicitly calls operations like this out as undefined behavior, meaning that the compiler is free to handle the situation in any way it wants to. The result will vary based on the compiler, compiler settings, and even the surrounding code, and any answer will be equally correct as far as the language definition is concerned.

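To make this well-defined, so that every conforming compiler gives the same result, the decrement has to be sequenced separately from the assignment. For example, you could decrement y before the assignment statement:

y--;
*(y+2) = *y;

Note that this particular rewrite copies x[1] into x[3], giving {0, 3, 5, 3, 9}. To reproduce the {0, 3, 5, 5, 9} your professor expects, read *y into a temporary before the decrement and assign that instead.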

This is consistently one of the most misunderstood and mis-taught aspects of the C language. If your professor is expecting this particular result to be well-defined, then he doesn't know the language as well as he thinks he does. Then again, he's not unique in that respect.

Repeating and expanding on the snippet from the C 2011 draft standard that Wintermute posted:

6.5 Expressions
...
2 If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.84)

3 The grouping of operators and operands is indicated by the syntax.85) Except as specified later, side effects and value computations of subexpressions are unsequenced.86)
84) This paragraph renders undefined statement expressions such as
    i = ++i + 1;
    a[i++] = i;
while allowing
    i = i + 1;
    a[i] = i;

85) The syntax specifies the precedence of operators in the evaluation of an expression, which is the same as the order of the major subclauses of this subclause, highest precedence first. Thus, for example, the expressions allowed as the operands of the binary + operator (6.5.6) are those expressions defined in 6.5.1 through 6.5.6. The exceptions are cast expressions (6.5.4) as operands of unary operators (6.5.3), and an operand contained between any of the following pairs of operators: grouping parentheses () (6.5.1), subscripting brackets [] (6.5.2.1), function-call parentheses () (6.5.2.2), and the conditional operator ? : (6.5.15). Within each major subclause, the operators have the same precedence. Left- or right-associativity is indicated in each subclause by the syntax for the expressions discussed therein.

86) In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.

Emphasis added. Note that this has been true since the C89 standard, although the wording has changed a bit since then.

"Unsequenced" simply means it's not guaranteed that one operation is completed before the other. The assignment operator does not introduce a sequence point, so it's not guaranteed that the LHS of the expression is evaluated before the RHS.

Now for the hard bit - your professor obviously expects a specific behavior for these kinds of expressions. If he gives a test or a quiz that asks what the result of something like a[i] = i--; will be, he's probably not going to accept an answer of "the behavior is undefined", at least not on its own. You might want to discuss the answers Wintermute and I have given with him, along with the sections of the standard quoted above.

John Bode
  • 119,563
  • 19
  • 122
  • 198
  • Interesting, thank you. However, the questions of this type are given in multiple choice format, so I really have no choice but to go along with this being the result (to a certain extent) :) – cmcsorley17 Dec 18 '14 at 16:19
  • 3
    @cmcsorley17: yeah, I was afraid of that. Don't torpedo your grade over this, just be aware of how the language really works. – John Bode Dec 18 '14 at 16:38
  • Is there an option on the multiple choice such as: E) My professor is now officially less knowledgeable than I am because I checked StackOverflow? I'll guess not, so just choose option "C". – frasnian Dec 24 '14 at 16:06
  • @frasnian: Like I said, the professor isn't unique in not correctly understanding this aspect of the language. This little nugget of misinformation has metastasized in way too many tutorials and references. Hell, there was a time when I thought it worked that way too, and it's taken what I consider an embarrassingly long time to correct that. – John Bode Dec 24 '14 at 16:31
  • The correct way to handle the test/grade/politics of the situation is to take the test and let the chips fall where they may. Following the exam, schedule time to go talk with the professor and let him know of your concerns. Not only does he need to be educated regarding the issue, you may very well find quite a bit of deference given your way in the way any curve applied to the exam is given. As others have mentioned, professors are not infallible. While we want to believe we are getting quality education, truth be known, this guy probably never saw C before the semester began. – David C. Rankin Dec 24 '14 at 23:32
  • It will be hard to reproduce the professor's flawed way of reasoning. Am I correct if I assume that the dreaded professor *assumes* that `=` introduces a sequence point? – wildplasser Dec 24 '14 at 23:57
0

The problem is in this statement.

*(y+2) = *(y--);

This is because, in C, modifying a variable and separately reading it within the same expression, with no sequence point in between, has undefined behavior.

Another example is:

i = 5;
v[i] = i++;

In this case the most likely outcome (AFAIK) is that the compiler evaluates either the LHS or the RHS first. If the LHS is evaluated first, we get v[5] = 5; and i equals 6 after the assignment. If the RHS is evaluated first, it yields 5, but i is already 6 by the time the left side is evaluated, so we end up with v[6] = 5;. However, as the saying goes, "undefined behavior allows the compiler to do anything it chooses, even to make demons fly out of your nose", so you should not count on either of those outcomes. What actually happens depends on the compiler.

Patricio Sard
  • 2,092
  • 3
  • 22
  • 52
0

First of all int x[5] = {0, 3, 5, 7, 9} means

x[0] = 0, x[1] = 3, x[2] = 5, x[3] = 7, x[4] = 9

Next, int *y = &x[2] makes the pointer y hold the address of x[2].

Now, here is where your confusion comes in: *(y + 2) refers to x[4], and in *(y--), y-- is a post-decrement operator, so the value at *y, which is x[2] = 5, is used first. So the value assigned is x[4] = 5.

The final output would be 0 3 5 7 5

arahan567
  • 227
  • 2
  • 10
  • 2
    Wrong. Have you read and understood the discussion (about sequence points) at the question and answers above ? – wildplasser Dec 25 '14 at 19:23