Why are warmup forks useful in JMH?

Question

In short my question is:

What do forked warmups offer that within-fork warmups do not? What is the thing that makes executing forked warmups useful?

JMH conveniently allows you to fork a new JVM instance in which your benchmarks are run. This is useful because, as explained in this and this question, it allows you to run your benchmarks with a clean slate: a fresh OS process in which neither the JVM nor the OS/system carries over decisions or observations from previously executed code.

In addition, JMH allows you to perform a number of initial "warmup" runs of your benchmarks in the forked process, the results of which are discarded. These are useful to let the system "settle" with decisions made during execution (OS makes memory paging decisions, JIT compiles VM code to native, etc).

There are actually two ways to do this, though: using the @Warmup annotation and using the warmups parameter of the @Fork annotation. As explained in "What is the difference between warmup attribute in Fork and Warmup annotation in jmh?", the former executes some initial iterations within each fork, whereas the latter executes some entire forks whose results are discarded.
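To make the difference concrete, here is a minimal plain-Java sketch of the run schedule I understand the two settings to produce. It does not use JMH itself; the class `ForkSchedule` and its methods are hypothetical names invented for illustration, and the model assumes that a warmup fork runs the same warmup and measurement iterations as a regular fork, with all of its results ignored.

```java
import java.util.ArrayList;
import java.util.List;

/*
 * Plain-Java model (no JMH dependency) of the schedule implied by:
 *
 *   @Warmup(iterations = 2)
 *   @Measurement(iterations = 3)
 *   @Fork(value = 2, warmups = 1)
 *
 * ForkSchedule, plan and keptCount are illustrative names, not JMH API.
 */
public class ForkSchedule {

    /** Returns one line per iteration, marking whether its result is kept. */
    public static List<String> plan(int warmupForks, int forks,
                                    int warmupIterations, int measurementIterations) {
        List<String> lines = new ArrayList<>();
        for (int f = 1; f <= warmupForks + forks; f++) {
            // Every fork is a fresh JVM; the first `warmupForks` of them are warmup forks.
            boolean warmupFork = f <= warmupForks;
            String forkName = warmupFork
                    ? "warmup fork " + f
                    : "fork " + (f - warmupForks);
            for (int i = 1; i <= warmupIterations; i++) {
                // Within-fork warmup iterations are discarded in every fork.
                lines.add(forkName + ", warmup iteration " + i + ": discarded");
            }
            for (int i = 1; i <= measurementIterations; i++) {
                // Measurement iterations only count in non-warmup forks.
                lines.add(forkName + ", iteration " + i + ": "
                        + (warmupFork ? "discarded" : "kept"));
            }
        }
        return lines;
    }

    /** Number of iterations whose results end up in the report. */
    public static long keptCount(List<String> lines) {
        return lines.stream().filter(l -> l.endsWith("kept")).count();
    }

    public static void main(String[] args) {
        List<String> lines = plan(1, 2, 2, 3);
        lines.forEach(System.out::println);
        System.out.println("kept " + keptCount(lines) + " of " + lines.size());
    }
}
```

Under these assumptions, the only thing a warmup fork adds over a longer @Warmup is one more complete process start-up and tear-down before the measured forks begin, which is exactly the part I am asking about.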

I am having trouble understanding why the forked warmups were introduced as a feature. A forked process presents a fresh challenge to the OS/JVM, removing any bias that may arise from code previously executed in the existing process. Having the forked process execute enough warmup iterations also means that any "initial optimizations" the OS/JVM perform are given time to "settle".

So then, why would I want to run some entire forks that are discarded altogether by adding @Fork(warmups = N)? I'm guessing it has to do with the OS/system being given time to settle on something (as the JVM does not stick around after each fork to use the information from the previous warmup forks), but what is it about a process being created and torn down a few times that helps improve results? And why would a generous @Warmup annotation not achieve the same if a single fork were used?

There's a paper they link: https://www.ifi.uzh.ch/dam/jcr:2e51ad81-856f-4629-a6e2-67d382d337c2/fse20_author-version.pdf. Maybe that shows reasons why warmup forks have any effect on the following measurement forks (which then also do warmup runs to prep that fork). If there's any effect, it's probably down to some OS or CPU mechanisms that kick in for familiar tasks. – zapl, Oct 14 '22 at 10:25
The simplest reason is that it saves a bit of time compared with doing it in the same fork. – Joop Eggen, Oct 14 '22 at 10:30
@zapl Thanks for that reference. I've read about half of it so far. It's very telling that their only reference to warmup forks is in section 2, where they basically only mention that they exist, as well as that "JMH does not use warmup forks (wf) by default". After that they never mention them again. Section 4.1, describing their approach, makes it clear that they only use measurement forks. They never study the impact of potentially adding warmup forks. However, I am now looking at their stability function for minimizing the number of measurement forks in 4.2.2, which is relevant. – Alexandros, Oct 14 '22 at 13:04
@zapl I was hoping they would try their approach with a varying number of warmup forks, but they do not. Section 5.3 explains that both their static and their dynamic configurations use zero warmup forks. I was looking to find whether they have anything definitive on the number of measurement forks, but they mention nothing. They skim over the subject in section 6, where they simply mention *"JMH 1.21 ... reduced the number of forks from ten to five ... our results support reducing to five forks, which indicates that most fork-to-fork variability is captured"*. No mention of investigating <5. – Alexandros, Oct 14 '22 at 13:28
@zapl Finally found the relevant info, last paragraph in section 5: *"Generally, warmup iterations are more reduced than forks in our setup, indicating that fork-to-fork variability is more present than within-fork variance ..."*. I guess what I am looking for is the reasons for fork-to-fork variability, as I would expect that `f=1` should always be enough when `w_i` is high enough... – Alexandros, Oct 14 '22 at 13:38
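
To illustrate what "fork-to-fork variability" versus "within-fork variance" could mean in numbers, here is a small sketch. The scores are invented for illustration only (they are not taken from the paper or from any real run), and `ForkVariance` and its methods are hypothetical names. It compares the average variance between iterations of the same fork with the variance of the per-fork means.

```java
import java.util.Arrays;

public class ForkVariance {

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    /** Unbiased sample variance of one series of scores. */
    static double variance(double[] xs) {
        double m = mean(xs);
        double sumOfSquares = Arrays.stream(xs).map(x -> (x - m) * (x - m)).sum();
        return sumOfSquares / (xs.length - 1);
    }

    /** Average of the per-fork variances: noise between iterations of one JVM. */
    public static double withinForkVariance(double[][] forks) {
        return Arrays.stream(forks)
                .mapToDouble(ForkVariance::variance)
                .average()
                .orElse(Double.NaN);
    }

    /** Variance of the per-fork means: how much one JVM run differs from another. */
    public static double betweenForkVariance(double[][] forks) {
        double[] means = Arrays.stream(forks).mapToDouble(ForkVariance::mean).toArray();
        return variance(means);
    }

    public static void main(String[] args) {
        // Made-up scores in ns/op, one row per fork, taken after within-fork warmup.
        double[][] forks = {
            {100, 101, 99},
            {110, 111, 109},
            {105, 106, 104},
        };
        System.out.printf("within-fork variance:  %.2f%n", withinForkVariance(forks));
        System.out.printf("between-fork variance: %.2f%n", betweenForkVariance(forks));
    }
}
```

In data shaped like this, each fork is internally stable, so more warmup iterations would change nothing, yet a single fork would still report only one of several possible per-fork levels.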

Stephen C · Answer 1 · 2022-10-14T13:28:01.453

2

I am having trouble understanding why the forked warmups were introduced as a feature.

Warmup forks can do things like:

prime directory and file caches by opening and reading files,
prime the virtual memory system by causing dirty pages to be written and adding pages to the free pool,
prime a shared class cache (if your JVM supports this),
and so on.

These things can definitely improve benchmark consistency.

Why do it multiple times?

It is not clear to me that multiple warmup forks would help in practice ... or why. But having the ability to try it just in case it makes a difference seems like a good thing¹.

Presumably the warmup fork feature was added because at least some people wanted it / asked for it. And if you can do a single warmup fork run, an option to do it N times was probably a trivial extension.

¹ Or even providing it to give performance-obsessed people with too much time on their hands another "tuning knob" to twiddle.

edited Oct 14 '22 at 13:28

answered Oct 13 '22 at 23:54

Stephen C


Can you explain how these examples are things that regular warmup iterations can't address? For instance, your first example about I/O caches: what would you protect against with an extra fork, that a generous number of Warmup iterations of significant duration cannot do? I'm really sceptical about the value of forked warmups, especially when the execution environment is controlled (i.e. nothing else is running on the system performing the benchmark). – Alexandros Oct 14 '22 at 09:40
Please reread my answer. I am not claiming that an extra fork will help. All I am saying is that it costs almost nothing to provide an option that allows the user to *try* it. You are asking why they provide the option, and I have given you my explanation. It is to allow people to try it. – Stephen C Oct 14 '22 at 09:58
Thank you, but my question would really be answered by someone who has tried it and identified that "I need to do a forked warmup in case X, because when I just fork and run immediately using only `@Warmup`, bad thing Y happens, which a number N of extra forks preceding the measurement run via `@Fork(warmups=N)` prevents". (Edited comment because I accidentally pressed enter halfway through typing it.) – Alexandros Oct 14 '22 at 12:25
