The default commands are normally the fastest available (and therefore have the lowest simulation accuracy).
gem5.fast build
A .fast build can run about 20% faster without losing simulation accuracy by disabling some debug-related macros:
scons -j `nproc` build/ARM/gem5.fast
build/ARM/gem5.fast configs/example/se.py --cpu-type=TimingSimpleCPU \
-c test/test-progs/hello/src/my_binary
The speedup is achieved by changes such as disabling debug-related macros and enabling link-time optimization (LTO), so in general .fast is not worth it while you are developing the simulator. It only pays off once your patches are finished and you just need to run hundreds of simulations as fast as possible with different parameters.
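For the "hundreds of simulations" use case, a small driver script can generate one gem5 command per parameter combination and run a few of them in parallel. The following is only a sketch: the helper names (build_command, sweep_commands, run_all) and the m5out/<cpu> directory layout are hypothetical, and I'm assuming that --outdir is the gem5 option that gives each run its own output directory. The binary, config script and --cpu-type values are the ones used in this answer.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(gem5, config, binary, cpu_type, outdir):
    """Return the argv list for a single gem5 run."""
    return [
        gem5,
        f"--outdir={outdir}",  # assumption: separate output directory per run
        config,
        f"--cpu-type={cpu_type}",
        "-c", binary,
    ]


def sweep_commands(gem5, config, binary, cpu_types):
    """Build one command per CPU type, each with its own output directory."""
    return [
        build_command(gem5, config, binary, cpu, f"m5out/{cpu}")
        for cpu in cpu_types
    ]


def run_all(commands, jobs=2):
    """Run the commands, at most `jobs` at a time; return exit codes in order."""
    def run(cmd):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        return result.returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run, commands))


commands = sweep_commands(
    "build/ARM/gem5.fast",
    "configs/example/se.py",
    "test/test-progs/hello/src/my_binary",
    ["TimingSimpleCPU", "AtomicSimpleCPU"],
)
# Dry run: only print the commands that would be launched.
for cmd in commands:
    print(" ".join(cmd))
```

To actually launch the runs, call run_all(commands, jobs=N) with N no larger than the number of host cores, since each gem5 process is single-threaded as far as I know.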
TODO: it would be good to benchmark which of the above changes matters the most for runtime, and whether the link time is actually significantly slowed down by LTO.
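A rough way to start on that benchmark is to time the same simulation with two builds and compare the medians. This is a minimal sketch, not a measured result: time_command and speedup are hypothetical helper names, and it only measures total wall-clock time, so it cannot tell which individual change is responsible.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,  # fail loudly if the simulation itself fails
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def speedup(baseline, candidate):
    """Return how many times faster the candidate is, based on median times."""
    return statistics.median(baseline) / statistics.median(candidate)
```

Usage would look something like this, with the same arguments as the commands in this answer:

```python
args = ["configs/example/se.py", "--cpu-type=TimingSimpleCPU",
        "-c", "test/test-progs/hello/src/my_binary"]
opt = time_command(["build/ARM/gem5.opt"] + args)
fast = time_command(["build/ARM/gem5.fast"] + args)
print(f".fast is {speedup(opt, fast):.2f}x faster than .opt")
```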
gem5 performance profiling analysis
I'm not aware of any proper performance profiling of gem5 that assesses which parts of the simulation are slow and whether there is an easy way to improve them. Someone has to do that at some point and post it at: https://gem5.atlassian.net/browse/GEM5
Options that reduce simulation accuracy
Simulation would also be faster, with lower accuracy, without --cpu-type=TimingSimpleCPU:
build/ARM/gem5.opt configs/example/se.py -c test/test-progs/hello/src/my_binary
which uses AtomicSimpleCPU, an even simpler CPU model that uses atomic memory accesses.
Other lower accuracy but faster options include:
- KVM, but support is not perfect as of 2020, and you need an ARM host to run the simulation on
- Gabe's FastModel integration that is getting merged as of 2020, but it requires a FastModel license from ARM, which I think is too expensive for individuals
Also if someone were to implement binary translation in gem5, which is how QEMU goes fast, then that would be an amazing option.
Related
Gem5 system requirements for decent performance