On C/C++, basically what things are compiler-dependent?

Question

Which tasks, features, and behaviors vary from compiler to compiler? I know this code is compiler-dependent:

#include <stdio.h>
#define PRODUCT(x) (x*x)
int main()
{
    int i = 3, j, k;
    j = PRODUCT(i++);
    k = PRODUCT(++i);
    printf("\n%d %d", j, k);
}

The following prints garbage with some compilers and fixed values with others:

#include <stdio.h>
int main()
{
    int i = 5, j = 10;
    printf("%d,%d");
}

So the order of execution varies with the compiler. Are such ambiguous programs eligible to be asked in exams?

Your examples exhibit undefined behavior. A compliant compiler is allowed to do anything with them. — Pascal Cuoq, Oct 11 '11 at 04:35
It's not ambiguous - it's WRONG. Using uninitialized data: wrong. Passing an incorrect #/arguments to a variadic function: wrong. It's entirely "eligible to be asked in exams" (and interviews) whether code is correct or not. But trying to understand "incorrect behavior" is like trying to understand how a car works by smashing it into a wall and observing which parts fall off. You're not likely to learn much... ;) — paulsm4, Oct 11 '11 at 04:42
@PascalCuoq, that's not correct. In the first example, there is a limited set of possible values. In the second, more modern compilers will actually give a compile error for missing parameters (something that caught me by surprise the other day) but the generated code, while ill-formed, is completely defined. — Charlie Martin, Oct 11 '11 at 04:42
@CharlieMartin: Undefined behaviour means that the behaviour of the code is not specified under the C++ standard. It does not mean that the code itself won't be defined within the translation unit, which is what I think you are getting at. — Ayjay, Oct 11 '11 at 04:46
@Ayjay you're right in terminology but wrong in application. The first example is indeterminate, not undefined: the spec requires that the generated code produce one of a known set of values. The code in the second case is undefined: what will happen cannot be predicted from the program text. — Charlie Martin, Oct 11 '11 at 05:01
@Charlie Martin "There are a limited set of possible values" -> no, the behavior is **undefined**. This phrase would be a description of **unspecified** behavior. In both the OP's examples, anything can happen. Compilers are allowed to reject the program at compilation in both cases. — Pascal Cuoq, Oct 11 '11 at 05:05
@paulsm4: Isn't that pretty much a description of particle physics research? ;) — caf, Oct 11 '11 at 05:08
@Charlie Martin "indeterminate" in the C99 standard is used to describe the contents of a memory zone. You are thinking of the word "unspecified". Unsequenced side-effects are not an example of unspecified behavior, they are an example of undefined behavior. — Pascal Cuoq, Oct 11 '11 at 05:09
FWIW reading from uninitialised variables is undefined behaviour - it can and will cause crashes on some platforms - http://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value/1597426#1597426 — Flexo, Oct 19 '11 at 19:41

bdonlan · Accepted Answer · 2011-10-11T06:33:31.753

If you want the full list, you'll need to look to the standard document. In the C standard there are two types of 'compiler-dependent' issues defined:

Implementation-defined behavior: The behavior may vary from compiler to compiler, but the compiler must behave consistently and must document its choice. An example, straight from the standard: "An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right." In other words, the result of -1 >> 1 may vary between compilers, but each compiler has to be consistent about it.
Undefined behavior: All bets are off. The moment you hit undefined behavior, anything (and I do mean anything) can happen. Your code is a good example of this: in the first program you modify a single variable twice without an intervening sequence point (violating the "shall" requirement in ISO/IEC 9899:1999 (E) §6.5/2). In the second one, you are missing arguments (undefined behavior per §7.19.6.1/2). According to a strict reading of the standard, it is perfectly justified for the compiler to summon demons through your nose in this case.

You also need to watch out for "shall" requirements that appear outside a Constraints section. Often the standard specifies things like "[main] shall be defined with a return type of int [...]" (§5.1.2.2.1/1). This is equivalent to, "If main is declared with a return type other than int, the program's behavior is undefined." (see §4/2, where the standard explicitly endorses this interpretation). Violating an actual constraint is a different matter: the compiler is then required to issue a diagnostic.

You should not be asked these questions on an exam; if you are, you should simply state that the behavior of the program is undefined (or implementation-defined). Note that some implementation-defined behavior has limits. For example, the value of sizeof(int) is implementation-defined, but the standard still guarantees that int can represent at least -32767 to 32767 and that SHRT_MAX <= INT_MAX && INT_MAX <= LONG_MAX. So the mere presence of implementation-defined behavior doesn't mean you can't say anything about what the program does.

It's also worth pointing out that implementation-defined behaviour is usually directly called out as such in the standard, but undefined behaviour is often simply marked by the word *"shall"* - if the standard says that your program shall meet certain constraints, then failing to meet them means that the behaviour of your program is undefined. — caf, Oct 11 '11 at 06:29

score 3 · Answer 2 · answered Oct 11 '11 at 04:35

The first one is actually compiler dependent because what it resolves to is (i++ * i++). The compiler is free to order those operations to its own needs because the expression doesn't have "sequence points".

Wikipedia has a good article on sequence points.

The first example is indeed the sort of thing that happens in quizzes, and the correct answer is "the result is indeterminate."

The second one is simply incorrect; you're lucky it didn't give you a segmentation fault or the like. Observe the printf: it's attempting to pull two values off the stack that haven't been pushed. The value you see printed is whatever happens to be on the stack, and if you had another function call following, it would very likely fail at that point.

Charlie Martin

I do not see a difference between the second undefined behavior and the first one. They are both allowed to segfault, and to produce different results from one execution to the next. Especially in the context of an exam, I wouldn't make a distinction between them. – Pascal Cuoq Oct 11 '11 at 04:40
It's worse than just the _result_ being indeterminate. The behavior of the program as a whole is undefined. The compiler may reject it outright at compile time; it may fail to terminate; it may emit multiple output values; it may format the hard drive; it may launch nuclear missiles; it may even summon demons through your nose. This is what _undefined behavior_ means. None of this wishy-washy 'indeterminate value' crap :) – bdonlan Oct 11 '11 at 04:43
@PascalCuoq, you don't see it because you're mistaken in the facts about the effect. The first one is *not* supposed to segmentation fault: that's perfectly well-defined C. You just don't know whether j will be 9, 12, or 16. – Charlie Martin Oct 11 '11 at 05:02
@bdonlan you're mistaken. See the Wikipedia article, or read Kernighan and Ritchie, which went into it in some detail 30 years ago. – Charlie Martin Oct 11 '11 at 05:04
PascalCuoq is correct, the behaviour of the entire program is undefined since it modifies the value of `i` twice without an intervening sequence point. The unspecified order of evaluation only matters in cases like `f() * g()`. – caf Oct 11 '11 at 05:10
@Charlie Martin Thanks for the references. You may yourself be interested in the C99 standard. "If an attempt is made to modify the result of an assignment operator or to access it after the next sequence point, the behavior is undefined." 6.5.16.4 (together with 6.5.3.1.2 "See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers.") – Pascal Cuoq Oct 11 '11 at 05:18
@CharlieMartin, why am I mistaken? I'm looking here at the C99 spec, not a 30-year-old outdated semi-spec... :) – bdonlan Oct 11 '11 at 05:39
Okay, so they have a bug in the C99 spec. No behavior that was well defined in C should stop being well-defined in C99. You should submit a bug report. – Charlie Martin Oct 11 '11 at 15:20

score 3 · Answer 3 · answered Oct 11 '11 at 04:38

That isn't a compiler ambiguity - your program is ill-formed.

Incrementing the same variable twice in an expression produces undefined behaviour.

If that sort of question is asked in an exam, you are well within your rights to state that and not answer the question.

To answer your questions, the major compilers (GCC, MSVC) don't differ in any really significant ways that I know of, at least until you get into the stratosphere of meta-programming techniques. They both seem to optimise about the same way and have basically the same support for C++11 features.

score 3 · Answer 4 · answered Oct 11 '11 at 04:44

Are such ambiguous programs eligible to be asked in exams?

Naturally, that depends on the person giving the exam and the exam topic. I wouldn't be surprised to see a program containing undefined behavior on an exam about the C programming language, but I'd be very surprised if the "correct" answer were anything other than "the behavior of that program is undefined."

score 3 · Answer 5 · answered Oct 11 '11 at 08:43

As you've asked about C++ as well, here's the C++11 answer (which is different from C++98 and C99):

conditionally-supported constructs (1.3.5)
implementation-defined behavior (1.3.10), obviously
implementation limits (1.3.11), such as the maximum number of function arguments.
available locales and their behavior (1.3.12, 1.4)
unspecified behavior (1.3.25)
diagnostic messages produced (1.4), not all errors require a diagnostic.
implementation details of the standard library (1.4) - this covers a lot of possible variation.
extensions (1.4)
character sets used (2.3)

Undefined behavior isn't really compiler dependent. Since anything can happen, and a compiler isn't even required to be consistent with itself, the resulting behavior cannot be said to depend on the compiler.

score 0 · Answer 6 · answered Oct 11 '11 at 04:39

Both programs invoke undefined behaviour (UB).

The first program invokes UB because the macro expands to i++ * i++ and ++i*++i respectively, and each expression invokes UB, as they both attempt to modify the object i more than once without any intervening sequence point.
The second program invokes UB because the format string doesn't match the arguments.

score 0 · Answer 7 · answered Oct 11 '11 at 04:39

There are too many to list. One way to find them is to check the C99 standard and search for the keywords "undefined" and "implementation-defined".

lostyzd

score 0 · Answer 8 · answered Oct 11 '11 at 04:44

I recommend that you get copies of the C and C++ standards (they aren't expensive to buy or you might use one of the publicly available drafts) and read them.

While reading, note all the things that are marked as unspecified behavior, undefined behavior, implementation-defined behavior, common extensions and so on.

The C standard has a section, Annex J "Portability Issues", that is entirely devoted to this kind of thing. Similar things (including the differences between C and C++) are listed in the C++ standard as well.

The standards will give you the ultimate answers to questions of this type. Also watch out for behavior of your compiler that does not comply with the standard, and check its documentation too.