I work in a heavily regulated environment where we need to produce identical binary output from the same source code every time we build our products. We currently use an ancient version of g++ that has been patched so that it does not write anything, such as a date or time, into the resulting binaries that would change from build to build, but I would like to update to g++ 4.7.2. Does anyone know of a patch, or have suggestions about what I need to look for, so that two identical pieces of source code produce identical binary outputs?
3 Answers
The Debian Reproducible Builds project attempts to make Debian packages reproducible byte for byte, and received a Linux Foundation grant in 2016.
While this may include more than compilation, you should have a look at it.
It also pointed me to this article, which adds the following points to what @Employed said:
- put the source in a fixed folder (e.g. `/tmp/build`) to deal with `__FILE__`
- for `__DATE__`, `__TIME__`, `__TIMESTAMP__`:
  - libfaketime: https://github.com/wolfcw/libfaketime
  - override those macros with `-D`
  - `-Wdate-time` or `-Werror=date-time`: warn or fail if any of `__TIME__`, `__DATE__` or `__TIMESTAMP__` is used. The Linux kernel 4.4 uses it by default.
- use the `D` flag with `ar`, or use https://github.com/nh2/ar-timestamp-wiper/tree/master to wipe stamps
- `-fno-guess-branch-probability`: older manual versions say it is a source of non-determinism, but not anymore. Not sure if this is covered by `-frandom-seed` or not.
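As far as I know, `-Wdate-time` only arrived in GCC 4.9, so g++ 4.7.2 would not have it. A similar check can be approximated outside the compiler. The following is a minimal Python sketch, not a real tool: the helper name `find_timestamp_macros` is made up for illustration, and because it scans raw text with a regular expression it will also flag mentions inside comments and string literals.

```python
import re

# The three predefined macros that embed the build time into the output.
TIMESTAMP_MACROS = re.compile(r"\b__(?:DATE|TIME|TIMESTAMP)__\b")


def find_timestamp_macros(source_text):
    """Return (line_number, macro_name) pairs for every occurrence found.

    Hypothetical helper: it works on raw text, so it cannot tell a real
    use from a mention in a comment or a string literal.
    """
    hits = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        for match in TIMESTAMP_MACROS.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = 'const char *built = __DATE__ " " __TIME__;\nint x = 0;\n'
    for line_number, macro in find_timestamp_macros(sample):
        print(f"line {line_number}: {macro}")
```

Running something like this over the source tree before the build and failing on any hit would give roughly the effect of `-Werror=date-time`, at the cost of some false positives.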
Buildroot has a `BR2_REPRODUCIBLE` option which may give some ideas on the package level, but it is far from complete at this point.
- A corollary question: if I didn't take any measures to get deterministic builds, is there a chance I could find the date on which a binary was produced? – Johan Boulé Dec 18 '20 at 11:22
- @JohanBoulé I would guess only if `__DATE__`, `__TIME__` or `__TIMESTAMP__` were used. More specific question: https://stackoverflow.com/questions/29385996/how-can-i-find-out-the-date-of-when-my-source-code-was-compiled – Ciro Santilli OurBigBook.com Dec 18 '20 at 12:20
- Thanks. I came across another interesting bit called the `.note.gnu.build-id` ELF section. There are situations where you regret not having this kind of information. So, next time, we'll plan ahead and put global string constants in the binaries. Anyway, I'm off topic, "anti-topic" we could say, but paradoxically, if we had deterministic builds, we could rebuild the binaries from each of our SCM commits until we find the one that matches the unknown binary our client has, and bingo, we'd then know which source version it corresponds to. – Johan Boulé Dec 18 '20 at 17:43
We also depend on bit-identical rebuilds, and are using gcc-4.7.x.
Besides setting `PWD=/proc/self/cwd` and using `-frandom-seed=<input-file-name>`, there are a handful of patches, which can be found in the svn://gcc.gnu.org/svn/gcc/branches/google/gcc-4_7 branch.
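To illustrate how those two settings might be wired into a build script, here is a rough Python sketch. It assumes Linux (for `/proc/self/cwd`), a `g++` on the `PATH`, and a build with `-g`; the helper names (`compile_command`, `build_environment`, `sha256_of`, `build_and_hash`) are invented for this example and are not part of GCC or of any build system.

```python
import hashlib
import os
import subprocess


def compile_command(source, output, compiler="g++"):
    """Argument list for one translation unit.

    -frandom-seed gets the input file name, as suggested above.
    The -g flag is an assumption (a build with debug info).
    """
    return [compiler, "-g", "-c", source, "-o", output,
            f"-frandom-seed={source}"]


def build_environment(base_env=None):
    """Copy the environment and pin PWD to a directory-independent value."""
    env = dict(os.environ if base_env is None else base_env)
    env["PWD"] = "/proc/self/cwd"
    return env


def sha256_of(path):
    """Hash a file so two build outputs can be compared byte for byte."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def build_and_hash(source, output, cwd):
    """Compile inside `cwd` and return the hash of the result (needs g++)."""
    subprocess.run(compile_command(source, output), cwd=cwd,
                   env=build_environment(), check=True)
    return sha256_of(os.path.join(cwd, output))
```

Calling `build_and_hash("foo.cpp", "foo.o", cwd=...)` on two checkouts of the same source in different directories and comparing the two hashes is one way to check whether the build is actually bit-identical; `foo.cpp` here is just a placeholder file name.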

- @StevenBehnke We build with debug info. On Linux, GCC records `PWD` (which the shell sets to the current working directory) as the current compilation directory. Since we want builds to be bit-identical regardless of which directory the build was executed in, we set `PWD` to a predictable value. – Employed Russian Feb 02 '13 at 04:47
- What are the main applications of randomness in GCC? – Ciro Santilli OurBigBook.com May 16 '15 at 23:09
Use of the `__DATE__` macro makes the build non-deterministic.

- Hello, is this different from the `__DATE__` I've mentioned? https://stackoverflow.com/a/31019307/895245 – Ciro Santilli OurBigBook.com Jul 04 '18 at 18:35