In gem5 fe15312aae8007967812350f8cdac9ad766dcff7 (2019), the gem5.fast build already enables LTO by default, so you generally never want to use that option explicitly, but rather just want gem5.fast.
Other things to also keep in mind about .fast:
- it also removes -g and so you get no debug symbols. I wonder why, since that does not make runs any faster.
- it also turns on NDEBUG, which has the standard library effect of disabling asserts entirely, plus some gem5-specific effects spread throughout the code with #ifndef NDEBUG checks
- it disables TRACING_ON, which makes DPRINTF and family become empty statements, as seen at: src/base/trace.hh
Those effects can be seen easily at src/SConstruct.
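To illustrate the NDEBUG and TRACING_ON points above, here is a minimal C++ sketch of how compile-time switches of this kind remove checks and trace output. The names SKETCH_TRACING_ON, SKETCH_DPRINTF, debugChecksEnabled, tracingEnabled and checkedDivide are hypothetical stand-ins made up for this example; they are not gem5's actual definitions, which live in src/base/trace.hh and may differ in detail.

```cpp
// Illustrative sketch only: SKETCH_TRACING_ON and SKETCH_DPRINTF are
// hypothetical stand-ins, not gem5's actual macros from src/base/trace.hh.
#include <cassert>
#include <cstdio>

// A build system would normally pass -DSKETCH_TRACING_ON=0 or =1.
// Default to tracing enabled when nothing is specified.
#ifndef SKETCH_TRACING_ON
#define SKETCH_TRACING_ON 1
#endif

#if SKETCH_TRACING_ON
// Tracing compiled in: the macro expands to a real print.
#define SKETCH_DPRINTF(...) do { std::printf(__VA_ARGS__); } while (0)
#else
// Tracing compiled out: the macro expands to an empty statement, so
// its arguments are never evaluated and cost nothing at run time.
#define SKETCH_DPRINTF(...) do {} while (0)
#endif

// Returns true when code guarded by #ifndef NDEBUG is compiled in.
constexpr bool debugChecksEnabled()
{
#ifndef NDEBUG
    return true;
#else
    return false;
#endif
}

// Returns true when the tracing macro expands to real code.
constexpr bool tracingEnabled()
{
    return SKETCH_TRACING_ON != 0;
}

int checkedDivide(int a, int b)
{
    // With -DNDEBUG, this assert expands to nothing and is not evaluated.
    assert(b != 0);
    SKETCH_DPRINTF("dividing %d by %d\n", a, b);
    return a / b;
}
```

Calling checkedDivide(10, 2) from a main function prints the trace line in a default build. Compiling the same file with -DNDEBUG -DSKETCH_TRACING_ON=0 removes both the assert and the print, which is roughly the kind of difference described above between .opt and .fast.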
The --force-lto option exists because the more common gem5.opt build also uses partial linking, which in some versions of GCC was incompatible with LTO.
Therefore, as its name suggests, --force-lto forces the use of LTO together with partial linking, which might not be stable. That's why I recommend that you use gem5.fast rather than touching --force-lto.
The goal of partial linking is presumably to speed up the link step, which can easily be the bottleneck in a "change one file, rebuild, relink, test" loop, although in my experiments it is not clear that it is effective at doing that. Today it might just be a relic from the past.
To try to speed up linking, I recommend that you try scons --gold-linker instead, which uses the GOLD linker instead of ld. Note, however, that this option was more noticeably effective for gem5.debug.
I have found that gem5.fast is generally 20% faster than gem5.opt for Atomic CPUs.