I want to boot the Linux kernel in full system (FS) mode with a lightweight CPU to save time, make a checkpoint after boot finishes, and then restore the checkpoint with a more detailed CPU to study a benchmark, as mentioned at: http://gem5.org/Checkpoints

However, when I use -r 1 --restore-with-cpu=, I cannot observe any cycle count difference between the new and the old CPU.

The measure I'm looking at is how cache sizes affect the number of cycles that a benchmark takes to run.

The setup I'm using is described in detail in the question "Why doesn't the Linux kernel see the cache sizes in the gem5 emulator in full system mode?". I'm looking at cycle counts because I currently can't see the cache sizes directly from the Linux kernel.

For example, if I boot the Linux kernel from scratch with the detailed and slow HPI model with the command (excerpt):

./build/ARM/gem5.opt --cpu-type=HPI --caches --l1d_size=1024 --l1i_size=1024 --l2cache --l2_size=1024 --l3_size=1024 

and then change the cache sizes, the benchmark does get faster as the caches get larger, as expected.

However, if I first boot without --cpu-type=HPI, which uses the faster AtomicSimpleCPU model:

./build/ARM/gem5.opt --caches --l1d_size=1024 --l1i_size=1024 --l2cache --l2_size=1024 --l3_size=1024

and then I create the checkpoint with m5 checkpoint and try to restore it with the more detailed CPU:

./build/ARM/gem5.opt --restore-with-cpu=HPI -r 1  --caches --l1d_size=1024 --l1i_size=1024 --l2cache --l2_size=1024 --l3_size=1024

then changing the cache sizes makes no difference: I always get the same cycle counts as I do for the AtomicSimpleCPU, indicating that the modified restore was not successful.

The same happens on x86 if I try to switch from AtomicSimpleCPU to DerivO3CPU.

Related old thread on the mailing list: http://thread.gmane.org/gmane.comp.emulators.m5.users/14395

Tested at: fbe63074e3a8128bdbe1a5e8f6509c565a3abbd4

Ciro Santilli

3 Answers

--cpu-type= affected the restore, but --restore-with-cpu= did not

I am not sure why that is, but I have empirically verified that if I do:

-r 1 --cpu-type=HPI

then, as expected, the cache size options start to affect cycle counts: larger caches lead to fewer cycles.
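As a minimal sketch, the flags for such a restore could be generated per cache size with a small helper. The helper name `restore_flags` and the second cache size are hypothetical; only the flags already shown in this thread are used, and the script path and the remaining arguments are left out, as in the excerpts above.

```python
def restore_flags(cache_size, cpu_type="HPI", checkpoint=1):
    """Return the restore-related flags used in this answer for one cache size.

    Hypothetical helper: it only assembles the flags quoted in this thread
    and does not produce a complete gem5 command line.
    """
    size = str(cache_size)
    return [
        "-r", str(checkpoint),        # restore from checkpoint number 1
        f"--cpu-type={cpu_type}",     # the option that was observed to take effect
        "--caches",
        f"--l1d_size={size}",
        f"--l1i_size={size}",
        "--l2cache",
        f"--l2_size={size}",
        f"--l3_size={size}",
    ]


if __name__ == "__main__":
    # 1024 comes from the question; the larger size is an arbitrary example.
    for size in (1024, 1024 * 1024):
        print(" ".join(restore_flags(size)))
```

Running one restore per printed flag set and comparing the resulting cycle counts reproduces the experiment described above.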

Also keep in mind that caches don't affect AtomicSimpleCPU much, and there is not much point in having them.

TODO: so what is the point of --restore-with-cpu= vs --cpu-type= if it didn't seem to do anything in my tests?

Except confuse me, since if --cpu-type != --restore-with-cpu, then the cycle count appears under system.switch_cpus.numCycles instead of system.cpu.numCycles.

I believe this is what is going on (still untested):

  • switch_cpu contains stats for the CPU you switched to
  • when you set --restore-with-cpu= != --cpu-type, it thinks you have already switched CPUs from the start
  • --restore-with-cpu has no effect on the initial CPU. It only matters for options that switch the CPU during the run itself, e.g. --fast-forward and --repeat_switch. This is where you will see both cpu and switch_cpu data get filled up.
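To see which of the two stat groups actually gets filled in, the numCycles entries can be pulled out of stats.txt. The following is a minimal sketch that assumes the usual "name value # description" line layout of stats.txt; the sample text and all numbers in it are invented for illustration, and `read_num_cycles` is a hypothetical helper name.

```python
# Invented sample in the assumed stats.txt layout; the values are not real results.
SAMPLE_STATS = """\
---------- Begin Simulation Statistics ----------
sim_ticks                                  1000000   # Number of ticks simulated
system.cpu.numCycles                             0   # number of cpu cycles simulated
system.switch_cpus.numCycles                  2000   # number of cpu cycles simulated
---------- End Simulation Statistics   ----------
"""


def read_num_cycles(stats_text):
    """Map every stat whose name ends in '.numCycles' to its integer value.

    If the text contains several stat dumps, values from later dumps
    overwrite earlier ones.
    """
    cycles = {}
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].endswith(".numCycles"):
            cycles[fields[0]] = int(fields[1])
    return cycles


if __name__ == "__main__":
    for name, value in read_num_cycles(SAMPLE_STATS).items():
        print(name, value)
```

For a real run, pass the contents of the stats.txt file from the gem5 output directory instead of `SAMPLE_STATS`.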

TODO: also, if I use or remove --restore-with-cpu=, there is a small 1% cycle difference. But why is there a difference at all? AtomicSimpleCPU cycle count is completely different, so it must not be that it is falling back to it.

--cpu-type= vs --restore-with-cpu= showed up in fs.py --fast-forward: https://www.mail-archive.com/gem5-users@gem5.org/msg17418.html

Confirm what is happening with logging

One good sanity check that the CPU you want is being used is to enable some logging, as shown at: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/bab029f60656913b5dea629a220ae593cc16147d#gem5-restore-checkpoint-with-a-different-cpu e.g.:

--debug-flags ExecAll,FmtFlag,O3CPU,SimpleCPU

and then see if you start to get O3 messages rather than SimpleCPU ones.

Ciro Santilli

From reading through some of the code, I believe that --restore-with-cpu is specifically for the case where your checkpoint was created using a CPU model other than AtomicSimpleCPU. The scripts assume that AtomicSimpleCPU was used to create the checkpoint. I think that when restoring it is important to use the same CPU model that the system was checkpointed with. If you give another model with --cpu-type, then it switches to that model after the restore operation has completed.

http://gem5.org/Checkpoints#Sampling has some (small) detail on switching cpu models

First, regarding your question, I don't see how the cycle count indicates whether the restore worked. The cycle count being restored should be the same regardless of which CPU you switch to. Switching does not change past cycles. When you create a checkpoint, you basically freeze the simulation in that state. Switching the CPU simply changes all the parameters of the CPU while keeping the ticks unchanged. It is like hot-swapping a CPU.

To correctly verify the restore, you should keep a copy of config.json from before the restore and compare it with the new one written after the restore. In the x86 case, I could find the string AtomicSimpleCPU there only before the restore.
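A minimal sketch of such a check is to walk config.json and list every object whose "type" looks like a CPU model. The nesting of the sample below and the rule "type name ends in CPU" are assumptions made for illustration; a model such as HPI may be reported under a different type name, so adjust the match as needed. `find_cpu_types` is a hypothetical helper name.

```python
import json

# Invented sample; a real config.json is much larger and may be laid out differently.
SAMPLE_CONFIG = json.loads("""
{"system": {"switch_cpus": [{"type": "DerivO3CPU"}],
            "mem": {"type": "SimpleMemory"}}}
""")


def find_cpu_types(node, path="root"):
    """Return (path, type) pairs for every object whose 'type' ends in 'CPU'."""
    found = []
    if isinstance(node, dict):
        type_name = node.get("type")
        if isinstance(type_name, str) and type_name.endswith("CPU"):
            found.append((path, type_name))
        for key, child in node.items():
            found.extend(find_cpu_types(child, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            found.extend(find_cpu_types(child, f"{path}[{index}]"))
    return found


if __name__ == "__main__":
    for path, type_name in find_cpu_types(SAMPLE_CONFIG):
        print(path, type_name)
```

Running this on the config.json saved before the restore and on the one written after it, then comparing the two lists, shows which CPU models each run instantiated.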

Furthermore, only --cpu-type determines the CPU that is switched to. That does not make --restore-with-cpu useless, though. In fact, --restore-with-cpu should only be used when you booted the system with a CPU other than AtomicSimpleCPU. Most people want to boot the system with AtomicSimpleCPU and make a checkpoint, since it is faster. But if you mistakenly booted with DerivO3CPU, then to restore that particular checkpoint you have to set --restore-with-cpu to DerivO3CPU. Otherwise, it will fail.

Yi Shen
  • OK, so you mean that `--restore-with-cpu` indicates what was the old CPU, is that it? Why not just store that info inside the checkpoint itself? – Ciro Santilli Aug 29 '18 at 05:43
  • I'm analyzing the cycle count after checkpoint restore, from question: "checkpoint after boot finishes, and then restore the checkpoint with a more detailed CPU to study a benchmark". – Ciro Santilli Aug 29 '18 at 05:43
  • I have now booted with `--cpu-type HPI` and restored with `-r 1 --cpu-type HPI` and no `--restore-with-cpu` and it seemed to work as in my answer https://stackoverflow.com/a/49673265/9160762 , what do you mean by "Otherwise, it will fail" more precisely? – Ciro Santilli Aug 29 '18 at 06:49
  • @CiroSantilli, I think this info is stored, although in config.json and not in the checkpoint. I think `fs.py` simply chooses not to implement the feature you suggested. You should be able to write your own script that supports it. – Yi Shen Aug 29 '18 at 12:42
  • The documented way to check CPU is by config.json. After the successful switch, you should be able to find "switch_cpus" in the json file. – Yi Shen Aug 29 '18 at 12:55
  • Last time I checked, I could not restore a system booted with O3. My best guesses for your situation are: 1. It is ISA specific, probably HPI is not that different from Atomic. 2. It is CPU specific, I could have used some special configs. 3. Your checkpoint 1 is actually an Atomic CPU. Did you have more checkpoints there? – Yi Shen Aug 29 '18 at 13:00