27

This code is taken from a discussion going on here.

someInstance.Fun(++k).Gun(10).Sun(k).Tun();

Is this code well-defined? Is ++k in Fun() evaluated before k in Sun()?

What if k is a user-defined type rather than a built-in type? And how does the order of the function calls above differ from this:

eat(++k);drink(10);sleep(k);

As far as I know, in both situations there is a sequence point after each function call. If so, why isn't the first case well-defined like the second one?

Section 1.9.17 of the C++ ISO standard says this about sequence points and function evaluation:

When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body. There is also a sequence point after the copying of a returned value and before the execution of any expressions outside the function.

Nawaz
  • 4
    There are lots of pitfalls in programming languages; if you are not sure about something, just avoid it... maybe this question can be answered by the C++ spec, maybe not. – linjunhalida Jan 17 '11 at 03:24
  • @Nawaz : It seems I am wrong. @jalf has made a valid point. Deleting my answer. – Prasoon Saurav Jan 17 '11 at 03:54
  • The behaviour is undefined because "the accesses of k precede its modification" (dunno how I missed that :( ). BTW Tony's code gives different outputs on g++ and Clang/IntelC++. Plus I even get a warning (operation on k may be undefined) on g++. – Prasoon Saurav Jan 17 '11 at 04:01
  • 2
    Keep your answer IMO. I'm not 100% sure on mine, so I think we're better off leaving both answers visible – jalf Jan 17 '11 at 04:01
  • Gotta rush to the college now. Will post a separate and more comprehensive answer after returning. :-) – Prasoon Saurav Jan 17 '11 at 04:04
  • possible duplicate of [cout << order of call to functions it prints?](http://stackoverflow.com/questions/2129230/cout-order-of-call-to-functions-it-prints) [The result of int c=0; cout< – CB Bailey Jan 17 '11 at 07:52
  • @linjunhalida: There's still value in basic research. – John Dibling Jan 17 '11 at 13:24

6 Answers

22

I think if you read exactly what that standard quote says, the first case won't be well-defined:

When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body

What this tells us is not that "the only thing that can happen after the arguments for a function have been evaluated is the actual function call", but simply that there is a sequence point at some point after the evaluation of arguments finishes, and before the function call.

But if you imagine a case like this:

foo(X).bar(Y)

the only guarantee this gives us is that:

  • X is evaluated before the call to foo, and
  • Y is evaluated before the call to bar.

But an order such as this would still be possible:

  1. evaluate X
  2. evaluate Y
  3. (sequence point separating X from foo call)
  4. call foo
  5. (sequence point separating Y from bar call)
  6. call bar

and of course, we could also swap around the first two items, evaluating Y before X. Why not? The standard only requires that the arguments for a function are fully evaluated before the first statement of the function body, and the above sequences satisfy that requirement.

That's my interpretation, at least. It doesn't seem to say that nothing else may occur between argument evaluation and function body -- just that those two are separated by a sequence point.
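
To make those guarantees concrete, here is a minimal sketch that logs when each argument is evaluated and when each function runs. The names `arg`, `Chain`, `events`, `pos` and `guarantees_hold` are hypothetical helpers made up for illustration; only `foo`, `bar`, `X` and `Y` come from the example above. It checks the orderings argued for above and deliberately says nothing about X versus Y.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical event log; not part of the original answer.
static std::vector<std::string> events;

// Records that an argument expression was evaluated, then yields its value.
int arg(const char* name, int value) {
    events.push_back(name);
    return value;
}

struct Chain {
    Chain& foo(int) { events.push_back("foo"); return *this; }
    Chain& bar(int) { events.push_back("bar"); return *this; }
};

// Position of an event in the log (events.size() if it never happened).
std::size_t pos(const std::string& name) {
    return static_cast<std::size_t>(
        std::find(events.begin(), events.end(), name) - events.begin());
}

// Runs foo(X).bar(Y) and reports whether the guaranteed orderings held.
// There is deliberately no check on X versus Y: under the sequence-point
// rules discussed here, either order is allowed.
bool guarantees_hold() {
    events.clear();
    Chain c;
    c.foo(arg("X", 1)).bar(arg("Y", 2));
    return events.size() == 4
        && pos("X") < pos("foo")     // X is evaluated before the call to foo
        && pos("Y") < pos("bar")     // Y is evaluated before the call to bar
        && pos("foo") < pos("bar");  // bar is invoked on foo's result
}
```

Printing `events` after a call to `guarantees_hold()` shows which order a particular compiler picked for X and Y; that observation says nothing about what another compiler or optimisation level will do.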

jalf
  • @jalf : your interpretation seems correct, but if you say "that is the only interpretation".. then that is not convincing to me, as of now.. – Nawaz Jan 17 '11 at 03:57
  • 2
    @Nawaz: maybe, but remember that if the standard doesn't say otherwise, then it is undefined behavior. If my interpretation is *possible*, if we can't find anything that rules it out then it is "by default" undefined. If the code is well-defined, then there must be something in the standard, in 1.9.17 or elsewhere, which contradicts my interpretation. – jalf Jan 17 '11 at 04:10
  • @jalf : I agree. +1 for making this note... :-) – Nawaz Jan 17 '11 at 04:11
  • The thing is that it can get even wackier than you imagine here. If X and Y are non trivial (ie have sub expressions). The order of evaluation of sub expressions from X and Y are unrelated and as such can be interleaved (just for extra fun). – Martin York Jan 17 '11 at 04:15
  • @jalf : one point: if X and Y evaluated before any function call, and suppose this is what Section 1.9.17 from the Standard means, don't you think it's well-defined? It doesn't really matter if X is evaluated before Y or, Y before X..... I mean, as long as X and Y both evaluated before any function call, it's well-defined? Also suppose, X and Y are not any function call that alter some global variables or something... just assume `int X, Y;` or suchlikes! – Nawaz Jan 17 '11 at 05:45
  • 2
    @Nawaz: you really seem to want this to be defined... are you losing a bet at work or something? ;-) – Tony Delroy Jan 17 '11 at 06:09
  • @Tony : Neither am I losing any bet :P.. nor am I going to write this in production code. I'm just exploring such things nowadays in my spare time. :-) – Nawaz Jan 17 '11 at 06:12
  • @Nawaz: `X` and `Y` are just placeholders for "whatever expressions are placed as arguments". As in your own example, `X` could be `++k` and `Y` could be `k`, and then the order becomes very important. – jalf Jan 17 '11 at 14:21
12

This depends on how Sun is defined. The following is well-defined:

struct A {
  A &Fun(int);
  A &Gun(int);
  A &Sun(int&);
  A &Tun();
};

void g() {
  A someInstance;
  int k = 0;
  someInstance.Fun(++k).Gun(10).Sun(k).Tun();
}

If you change the parameter type of Sun to int, it becomes undefined. Let's draw a tree of the version taking an int.

                     <eval body of Fun>
                             |
                             % // pre-call sequence point
                             | 
 { S(increment, k) }  <-  E(++k)
                             |     
                      E(Fun(++k).Gun(10))
                             |
                      .------+-----.       .-- V(k)--%--<eval body of Sun>
                     /              \     /
                   E(Fun(++k).Gun(10).Sun(k))
                              |
                    .---------+---------. 
                   /                     \ 
                 E(Fun(++k).Gun(10).Sun(k).Tun())
                              |
                              % // full-expression sequence point

As can be seen, we have a read of k (designated by V(k)) and a side-effect on k (at the very top) that are not separated by a sequence point: the two sit on different branches of the tree, and no sequence point orders one branch relative to the other. The very bottom % signifies the full-expression sequence point.
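
For contrast, here is a small sketch of the well-defined `int&` version. The function bodies and the `seen` log are assumptions added for illustration; the answer only declares the members. Because `Sun` takes a reference, `k` is not read while the arguments are evaluated; it is read inside the body of `Sun`, which can only run after `Fun` has been called and returned.

```cpp
#include <cassert>
#include <vector>

// Sketch of the struct from the answer; the bodies and the `seen` log are
// assumptions, not part of the original declarations.
struct A {
    std::vector<int> seen;  // values observed inside the member functions

    A& Fun(int v)  { seen.push_back(v); return *this; }
    A& Gun(int v)  { seen.push_back(v); return *this; }
    A& Sun(int& r) { seen.push_back(r); return *this; }  // k is read here, in the body
    A& Tun()       { return *this; }
};

// Returns the values seen by Fun, Gun and Sun, in call order.
std::vector<int> run_reference_version() {
    A someInstance;
    int k = 0;
    someInstance.Fun(++k).Gun(10).Sun(k).Tun();
    return someInstance.seen;
}
```

With these bodies the log comes out as 1, 10, 1. Changing the parameter of `Sun` to plain `int` puts the read of `k` back into the argument evaluation, which is the undefined case drawn in the tree.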

Johannes Schaub - litb
  • "depends... The following is well-defined"... because the k value seen by Sun is whatever is current when Sun runs, rather than being a snapshot prepared beforehand. But, that illustrates a way of avoiding the problem rather than providing insight into the problem. I'd argue that whether the question's scenario is defined doesn't "depend" - it is simply undefined. – Tony Delroy Jan 18 '11 at 02:58
  • @Tony you are not making any sense to me. – Johannes Schaub - litb Jan 18 '11 at 08:29
  • @Johannes : I liked this post. +1. By the way, I think, you wanted to say *"This depends on how **Sun** is defined"*, instead of *"This depends on how **Gun** is defined"*. Because only then it makes sense to me. Am I right? – Nawaz Jan 18 '11 at 12:21
  • @Johannes : That means, if `k` is user-defined type (say, of type `Index` as defined [here](http://stackoverflow.com/questions/4638364/undefined-behavior-and-sequence-points-reloaded)), then the chain function calls will be well-defined. Right? Please see the **Case 2** [here](http://stackoverflow.com/questions/4638364/undefined-behavior-and-sequence-points-reloaded/4638718#4638718) – Nawaz Jan 18 '11 at 12:29
  • I'm accepting this as "accepted answer" to my question, as it gives me concrete answer, with more insight, especially points like "This depends on how Sun() is defined" and tree visualization, are good for better understanding. Other answers are also good for learning purpose. I appreciate them as well. – Nawaz Jan 18 '11 at 14:30
  • @Johannes: ok - another attempt to explain: using the reference avoids the problem of when the increment is done, it doesn't mean that the ordering/sequencing of such things depends on anything. – Tony Delroy Jan 19 '11 at 01:31
  • 1
    @Tony I didn't say that the ordering/sequencing of the `int` version depends on anything. I said: "Is this code well-defined?" -> "It depends". – Johannes Schaub - litb Jan 19 '11 at 11:25
  • Wao. +1 just for the ascii art :) Joking :) good explanation as ever ! – neuro Feb 22 '11 at 08:54
  • @JohannesSchaub-litb I had a similar question like Nawaz (http://stackoverflow.com/questions/21103085/sequence-points-for-class-operators) and came to this point. Don't know if someone is interested anymore. But: in your diagram, between `S(increment, k)` and `V(k)` is `E(Fun(++k).Gun(10))`. From there must go a branch `--%--`, isn't it? So could it be, that the main point is, that this `%`-beginning branch is not lying in the same "plane" as your whole diagram and for this not separating `S(increment, k)` and `V(k)` unlike if it where `obj.Fun(++k);obj.Gun(10);obj.Sun(k);`? – mb84 Jan 14 '14 at 01:27
10

This is undefined behavior, because the value of k is being both modified and read in the same expression, without an intervening sequence point. See the excellent long answer to this question.

The quote from 1.9.17 tells you that all function arguments are evaluated before the body of the function is executed, but says nothing about the relative order of evaluation of arguments to different function calls within the same expression, so there is no guarantee that "++k in Fun() is evaluated before k in Sun()".

eat(++k);drink(10);sleep(k);

is different because each ; ends a full-expression, which is a sequence point, so the order of evaluation is well-defined.
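
A minimal sketch of the statement-by-statement version follows. The bodies of `eat`, `drink` and `sleep`, the `calls` log and the `demo` namespace are assumptions made up for illustration (the namespace only avoids a clash with the POSIX `sleep`).

```cpp
#include <cassert>
#include <vector>

namespace demo {

std::vector<int> calls;  // hypothetical log of the arguments received

void eat(int v)   { calls.push_back(v); }
void drink(int v) { calls.push_back(v); }
void sleep(int v) { calls.push_back(v); }

// Each statement is a full-expression of its own, so the increment in the
// first statement is complete before the third statement reads k.
std::vector<int> run_statements() {
    calls.clear();
    int k = 0;
    eat(++k);   // k becomes 1 and eat receives 1
    drink(10);
    sleep(k);   // reads k after the increment has completed
    return calls;
}

}  // namespace demo
```

Here the log is 1, 10, 1 on any conforming compiler, which is exactly what cannot be promised for the chained expression when `Sun` takes its argument by value.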

David Gelhar
8

As a little test, consider:

#include <iostream>

struct X
{
    const X& f(int n) const
    {
        std::cout << n << '\n';
        return *this;
    }
};

int main()
{
    int n = 1;

    X x;

    x.f(++n).f(++n).f(++n).f(++n);
}

I ran this with gcc 3.4.6 and no optimisation and got:

5
4
3
2

...with -O3...

2
3
4
5

So, either that version of 3.4.6 had a major bug (which is a bit hard to believe), or the sequence is undefined as Philip Potter suggested. (GCC 4.1.1 with/without -O3 produced 5, 5, 5, 5.)

EDIT - my summary of the discussion in comments below:

  • 3.4.6 really might have had a bug (well, yes)
  • many newer compilers happen to produce 5/5/5/5... is that a defined behaviour?
    • probably not, as it corresponds to all increment side effects being "actioned" before any of the function calls are made, which is not a behaviour that anyone here has suggested could be guaranteed by the Standard
  • this isn't a very good approach to investigating the Standard's requirements (particularly with an older compiler like 3.4.6): agreed, but it's a useful sanity check
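
If the intent of the test is "pass 2, 3, 4, 5 in that order", one way to get it without relying on the compiler is to do the increments in separate statements first. This is only a sketch: the `log` member and `run_hoisted` are assumptions added so the result can be inspected, not part of the test above.

```cpp
#include <cassert>
#include <vector>

struct X {
    mutable std::vector<int> log;  // hypothetical: records what f received

    const X& f(int n) const {
        log.push_back(n);
        return *this;
    }
};

// Each increment is its own statement, so the four values are fixed before
// the chained expression starts; the chain itself no longer modifies n.
std::vector<int> run_hoisted() {
    int n = 1;
    X x;
    const int a = ++n;
    const int b = ++n;
    const int c = ++n;
    const int d = ++n;
    x.f(a).f(b).f(c).f(d);
    return x.log;
}
```

The chained calls still execute left to right because each `f` is invoked on the result of the previous one; the only thing removed is the unsequenced modification of `n`.
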
Tony Delroy
  • 2
    +1 excellent empirical answer. I didn't expect it would be so easy to find differing output. – luqui Jan 17 '11 at 03:39
  • @luqui: it's a handy habit - can't confirm if it's safe, but you can often quickly see that it's not... :-) – Tony Delroy Jan 17 '11 at 03:42
  • @Tony : I get `5 5 5 5` with **gcc 4.5.0**. So sure **gcc 3.4.6** has bug :P – Nawaz Jan 17 '11 at 03:42
  • 2
    @Tony : why do you get upvotes? :P this post confirms nothing :D – Nawaz Jan 17 '11 at 03:45
  • 1
    @Tony : I would suggest that show the detailed output with gcc 4.1.1 before gcc 3.4.6, as people are getting wrong impressions :|..they tend to think that sequence is undefined which doesn't seem so with **gcc 4.1.1** and above! – Nawaz Jan 17 '11 at 03:48
  • @Nawaz: There are two possibilities: that sometime before/at gcc version 4.1.1 they fixed the output to consistently produce 5,5,5,5, or - and in my opinion this is much more likely - the behaviour is undefined and it just happens to have changed due to the changes in compiler internals/optimisation. Either way, if versions of GCC that are recent enough to still be in production use (as 3.4.6 is here) aren't consistent, it's best to avoid writing code that depends on this. (It's worth noting that 5,5,5,5 is not consistent with the sequence point rationale you have for predictable behaviour) – Tony Delroy Jan 17 '11 at 03:53
  • I get 5 5 5 5 in Visual Studio 2010 with optimizations disabled. – ThomasMcLeod Jan 17 '11 at 03:57
  • Tony: GCC has, in various versions, had plenty of serious bugs. Of course for any given single version, it's unlikely that you come across it, but it does undermine its reliability as a device for measuring "undefinedness". In any case, for someone interested in *knowing* whether this is undefined, the statistical likelihood of GCC being buggy is irrelevant. The likelihood is non-zero, which means it doesn't really tell us anything. – jalf Jan 17 '11 at 03:57
  • 1
    @Nawaz "version X of compiler Y happens to produce the output I expect" does *not* equate to "this is not undefined". – David Gelhar Jan 17 '11 at 03:58
  • 1
    @David: no, and @Nawaz's point was different: simply that "you just used an old version of GCC to "prove" that this is very likely undefined. Here's a more recent version which shows otherwise". That doesn't mean the new version is correct, or that it is indicative of the code being well-defined, but simply that we can't really rely on GCC to tell us what is and what isn't defined. – jalf Jan 17 '11 at 04:00
  • @David: I agree. But I didn't mean that you know :D – Nawaz Jan 17 '11 at 04:00
  • 1
    @jalf: thing is, the recent version doesn't show otherwise... it still doesn't evaluate the arguments on a left-to-right basis ala Nawaz's "eat(++k);drink(10);sleep(k);". It still suggests undefined behaviour, as I don't think anyone has suggested - or quoted anything from the Standard supporting - that all arguments must be evaluated before any ".-chained" functions are called. – Tony Delroy Jan 17 '11 at 04:15
  • 1
    Actually, looking at what compilers do in an attempt to figure out what the Standard says is akin to trying to figure out what the 10 Commandments are by examining actual human behavior: "Hmm... I see killing, stealing, adultery and coveting of neighbors' asses -- guess the Standard says those are ok". :-) – David Gelhar Jan 17 '11 at 04:22
  • @David: it's imperfect for sure, but I was trying to get out a quick counter-example to Prasoon's over-confident "yes, it's defined behaviour for sure" answer, and as is still evident it's a difficult thing to prove quickly and convincingly based on the Standard: I believe the current status is "well, we can't find anything in the Standard to say it should be defined", which is valid but inconclusive short of an exhaustive search. :-) – Tony Delroy Jan 17 '11 at 04:38
  • 2
    @Tony, @Nawaz: Getting 5 5 5 5, 1 2 3 4 5, 5 4 3 2 1. These are all consistent with it being undefined. – Martin York Jan 17 '11 at 05:25
  • can someone please explain the `5,5,5,5` result? I can't see how's it possible. – davka Jan 17 '11 at 09:06
  • @davka: because there are no sequence points between the calls or increments, the compiler is free to (and happens to) perform all the increments on n before it starts thinking about the chained function calls, it then reads n knowing any side effects have already been handled but passes 5 to each of the calls. – Tony Delroy Jan 17 '11 at 09:08
  • Thought so, but then, I don't see what `evaluation of all function arguments` means. Shouldn't eVALUation produce a **value** (as opposed to a variable holding that value)? Regardless of at what point in time the **argument evaluation** happens, the result should be (IMHO) "the **value** for this parameter will be X" – davka Jan 17 '11 at 09:24
  • @davka: when you get down to the detail, that's effectively what the compiler is doing - it's remembering "I've got the value for that ready in this CPU register". Problem is that before it uses it, the register gets incremented again. The compiler can track this - hence GCC can display a warning if you ask for them - but it's undefined behaviour because the compiler's not required to guard against this or handle it. That preserves the freedom to optimise more aggressively, while potentially keeping the compiler simple and fast (if it doesn't choose to warn / be safe instead). – Tony Delroy Jan 17 '11 at 09:34
  • ok, so effectively you are saying that `evaluation of all function arguments` does not require that the values produced by the evaluation remain fixed until the function is called, and that this is implementation-dependent on how these values are stored. correct? – davka Jan 17 '11 at 09:48
  • @davka: correct. More specifically, the compiler has to store them in a way that works as required for Standards-compliant code, but if the code has undefined behaviour then all bets are off. – Tony Delroy Jan 17 '11 at 09:57
1

I know that the behavior of compilers cannot really prove anything, but I thought it would be interesting to check out what the internal representation of a compiler would give (still a bit higher level than assembly inspection).

I've used the Clang/LLVM online demo with this code:

#include <stdio.h>
#include <stdlib.h>

struct X
{
  X const& f(int i) const
  {
    printf("%d\n", i);
    return *this;
  }
};

int main(int argc, char **argv) {
  int i = 0;
  X x;
  x.f(++i).f(++i).f(++i);         // line 16
}

And compiled with the standard optimizations (in C++ mode), it gave:

/tmp/webcompile/_13371_0.cc: In function 'int main(int, char**)':
/tmp/webcompile/_13371_0.cc:16: warning: operation on 'i' may be undefined

which I did find interesting (did any other compiler warn about this? Comeau online did not)


As an aside it also produced the following Intermediate Representation (scroll to the right):

@.str = private constant [4 x i8] c"%d\0A\00", align 1 ; <[4 x i8]*> [#uses=1]

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind {
entry:
  %0 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 3) nounwind ; <i32> [#uses=0]
                                                                                                             ^^^^^
  %1 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 3) nounwind ; <i32> [#uses=0]
                                                                                                             ^^^^^
  %2 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr inbounds ([4 x i8]* @.str, i64 0, i64 0), i32 3) nounwind ; <i32> [#uses=0]
                                                                                                             ^^^^^
  ret i32 0
}

Apparently, Clang behaves like gcc 4.x.x does and first evaluates all arguments before performing any function call.

Matthieu M.
  • A variation on that I tried was initialising i with argc... then it can't work out the result at compile time, but you should see the three increments - perhaps rolled into something akin to `+= 3`.... – Tony Delroy Jan 17 '11 at 09:53
  • @Tony: Good variation, it effectively circumvents constant folding to a degree. – Matthieu M. Jan 17 '11 at 14:13
0

The second case is certainly well-defined. A string of tokens that ends with a semicolon is an atomic statement in C++. Each statement is parsed, processed and completed before the next statement is begun.

ThomasMcLeod
  • Well... all statements are parsed before anything is run, usually. But I think I get what you mean, semicolon is a pretty surefire way to sequence actions. – luqui Jan 17 '11 at 03:29
  • I didn't mean that statements are parsed at runtime. From the point of view of the program state - the value of all variables - each semicolon defines a self-contained unit of execution, so that each unit of execution (with all previous units) completely defines program state at the end of the unit. – ThomasMcLeod Jan 17 '11 at 03:41