Is a C++ compiler allowed to emit different machine code compiling the same program?

Question

Consider a situation. We have some specific C++ compiler, a specific set of compiler settings and a specific C++ program.

We compile that specific program with that compiler and those settings two times, doing a "clean compile" each time.

Should the machine code emitted be the same (I don't mean timestamps and other bells and whistles, I mean only real code that will be executed) or is it allowed to vary from one compilation to another?

There was an answer by @Neil Butterworth about the same issue. IIRC, he explains why a compiler could produce different output even if everything *seems* equal. I am trying to find it :) — Khaled Alshaya, Jun 16 '10 at 14:04
if the compiler uses statistical algorithms in some cases, then yes it could produce slightly different code, ie use alternative registers or code layout. — Nick Dandoulakis, Jun 16 '10 at 14:14
Possible dupe: http://stackoverflow.com/questions/1221185/identical-build-on-different-systems — , Jun 16 '10 at 14:18
@Moron: actually not. That question specifies that function order in the produced binary differs between compiles - that's not what bothers me. — sharptooth, Jun 16 '10 at 14:33
Does the standard say anything about the machine code? No? Well, then *any* output which preserves the semantics specified in the standard is allowed. Ergo: **Yes.** — dmckee --- ex-moderator kitten, Jun 16 '10 at 15:48
Does your compiler have a setting to randomize function addresses to prevent targeted buffer overflow attacks? — Martin Beckett, Jun 16 '10 at 15:56
@Martin Beckett: I see your point, but that's not exactly what I'm asking about. Yes, functions can be located at different addresses, but I'm asking about what is inside the functions. — sharptooth, Jun 17 '10 at 05:17
@dmckee: I like this concise explanation. Shouldn't it be an answer instead of comment? — sharptooth, Jun 17 '10 at 05:18
@sharptooth - I thought you might be simply diff'ing the executables produced by 2 builds — Martin Beckett, Jun 17 '10 at 13:38
dmckee is absolutely right. However, I *despise* the word "ergo." — j_random_hacker, Jun 25 '10 at 12:49

Jerry Coffin · Accepted Answer · 2020-01-15T20:29:05.740

The C++ standard certainly doesn't say anything to prevent this from happening. In reality, however, a compiler is normally deterministic, so given identical inputs it will produce identical output.

The real question is mostly what parts of the environment the compiler considers as its inputs. A few compilers seem to assume that characteristics of the build machine reflect characteristics of the target, and vary their output based on "inputs" that are implicit in the build environment rather than explicitly stated, such as via compiler flags. That said, even that is relatively unusual. The norm is for the output to depend on explicit inputs (input files, command-line flags, etc.).

Offhand, I can only think of one fairly obvious thing that changes "spontaneously": some compilers and/or linkers embed a timestamp into their output file, so a few bytes of the output file will change from one build to the next--but this will only be in the metadata embedded in the file, not a change to the actual code that's generated.
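As a small illustration of a build-time value ending up in the output, here is a minimal sketch. It uses the standard `__DATE__` and `__TIME__` macros rather than any metadata written by a compiler or linker, so it is a related source-level case, not the mechanism described above. The function name `build_stamp` is hypothetical.

```cpp
#include <string>

// __DATE__ ("Mmm dd yyyy") and __TIME__ ("hh:mm:ss") expand to string
// literals fixed at translation time, so the bytes stored for this
// function's result differ between two clean builds made at different
// times, even though the compiler itself behaves deterministically.
std::string build_stamp() {
    return std::string(__DATE__) + " " + __TIME__;
}
```

In this case the time of translation is effectively one more implicit input, which fits the point that what matters is what the compiler treats as its inputs.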

What I would give for a non deterministic compiler that invented optimizations on the fly. — Martin York, Jun 16 '10 at 18:03

Romain Hippeau · score 4 · Answer 2 · 2010-06-16T18:45:51.057

There is no guarantee that they will be the same. Also, according to http://www.mingw.org/wiki/My_executable_is_sometimes_different:

> My executable is sometimes different, when I compile and recompile the same source. Is this normal?
>
> Yes, by default, and by design, MinGW's GCC does not produce consistent output, unless you patch it.

EDIT: Found this post that seems to explain how to make them the same.

edited Jun 16 '10 at 18:45

answered Jun 16 '10 at 14:13


I followed the link, but there was no explanation of why! – Martin York Jun 16 '10 at 18:04
@martin York updated my post with another link to make them the same. – Romain Hippeau Jun 16 '10 at 18:46

score 4 · Answer 3 · answered Jun 16 '10 at 15:52

According to the as-if rule in the standard, as long as a conforming program (i.e., one with no undefined behavior) cannot tell the difference, the compiler is allowed to do whatever it wants. In other words, as long as the program's observable behavior is the same, nothing in the standard prohibits this.
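To make the as-if rule concrete, here is a minimal sketch with two hypothetical functions, `sum_loop` and `sum_formula`. They return the same value for every input, so a program cannot observe which one was actually executed. An optimizer may therefore compile the loop into something resembling the closed form, or leave it as a loop; whether a given compiler does so is not guaranteed.

```cpp
#include <cstdint>

// Straightforward loop: adds 1 + 2 + ... + n.
// The counter is 64-bit so the loop also terminates for the largest n.
std::uint64_t sum_loop(std::uint32_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Closed form n * (n + 1) / 2, computed in 64 bits so it cannot overflow
// for any 32-bit n.
std::uint64_t sum_formula(std::uint32_t n) {
    const std::uint64_t m = n;
    return m * (m + 1) / 2;
}
```

Since both functions agree on every input, machine code for either one is a valid translation of the other as far as the standard is concerned.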

From a practical point of view, I wouldn't use a compiler that does this to build production software. I want to be able to recompile a release made two years ago (with the same compiler, etc.) and produce the same machine code. I don't want to worry that the reason I can't reproduce a bug is that the compiler decided to do something slightly different today.
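One way to check whether a rebuild reproduced an earlier release is to compare the two output files byte for byte. The following is a minimal sketch, assuming a C++14 compiler (for the four-iterator `std::equal`). The name `same_bytes` is hypothetical, and the comparison covers the whole file, so embedded metadata such as timestamps would also count as a difference.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Returns true only when both files can be opened and contain exactly
// the same sequence of bytes (same length, same contents).
bool same_bytes(const std::string& path_a, const std::string& path_b) {
    std::ifstream a(path_a, std::ios::binary);
    std::ifstream b(path_b, std::ios::binary);
    if (!a || !b) {
        return false;  // treat an unreadable file as "not identical"
    }
    using It = std::istreambuf_iterator<char>;
    // The four-iterator overload also reports a mismatch when one file
    // is a prefix of the other.
    return std::equal(It(a), It(), It(b), It());
}
```

A call such as `same_bytes("old_build.bin", "new_build.bin")` (placeholder paths) then answers whether the two builds match exactly.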

For a number of purposes, it's useful to have compilers whose output is completely deterministic even if it's not optimal. For example, if one is compiling code for an open-source voting machine, one should use an open-source cross compiler which will always yield bit-identical output regardless of the environment in which the compiler itself is run. If versions of that compiler which are bootstrapped from source via several independent means all produce identical code, that would pretty strongly imply that the compiler didn't hide any "gotchas" which aren't in the source. — supercat, Dec 01 '10 at 23:15

score 1 · Answer 4 · answered Jun 16 '10 at 14:07

I'd bet it would vary every time due to some metadata the compiler writes (for instance, C#-compiled DLLs always vary in some bytes even if I do "build" twice in a row without changing anything). But anyway, I would never rely on it not varying.

František Žiačik

