How to normalize benchmark results to obtain distribution of ratios correctly?

Question

To give a bit of context: I am measuring the performance of virtual machines (VMs), or systems software in general, and usually want to compare different optimizations for a performance problem. Performance is measured as absolute runtime for a number of benchmarks, and usually for a number of VM configurations that vary the number of CPU cores used, the benchmark parameters, etc. To get reliable results, each configuration is measured about 100 times. Thus, I end up with quite a number of measurements for all kinds of different parameters, and I am usually interested in the speedup for all of them, comparing the VM with and the VM without a certain optimization.

What I currently do is pick one specific series of measurements. Let's say the measurements for a VM with and without the optimization (VM-opt and VM-norm) running benchmark A on 1 core.

Since I want to compare the results across different benchmarks and numbers of cores, I cannot use absolute runtime, but need to normalize it somehow. Thus, I pair up the 100 measurements for benchmark A on 1 core for VM-norm with the corresponding 100 measurements of VM-opt to calculate the VM-opt/VM-norm ratios.

When I do that, taking the measurements just in the order I got them, I obviously get quite a high variation in my 100 resulting VM-opt/VM-norm ratios. So I thought, ok, let's assume the variation in my measurements comes from non-deterministic effects, and that the same effects cause variation in the same way for VM-opt and VM-norm. So, naively, it should be ok to sort the measurements before pairing them up. And, as expected, that reduces the variation.

However, my half-knowledge tells me that this is not the best way and perhaps not even correct. Since I am eventually interested in the distribution of those ratios, to visualize them with beanplots, a colleague suggested using the Cartesian product instead of pairing sorted measurements. That sounds like it would better account for the random nature of two arbitrary measurements being paired up for comparison. But I am still wondering what a statistician would suggest for such a problem.
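For concreteness, the three pairing schemes described above (in measurement order, sorted, and the Cartesian product) could be sketched in base R roughly as follows. The runtimes are simulated and the names `vm_norm` and `vm_opt` are hypothetical stand-ins for the real measurement vectors; this only illustrates how each set of ratios is formed, not which one is statistically appropriate.

```r
# Hypothetical data: 100 simulated runtimes (ms) per VM. Real measurements
# would replace these two vectors.
set.seed(42)
vm_norm <- rlnorm(100, meanlog = log(100), sdlog = 0.10)
vm_opt  <- rlnorm(100, meanlog = log(80),  sdlog = 0.10)

# 1. Pair in the order measured (an arbitrary pairing of independent runs).
ratio_in_order <- vm_opt / vm_norm

# 2. Pair after sorting (smallest with smallest, largest with largest).
ratio_sorted <- sort(vm_opt) / sort(vm_norm)

# 3. Cartesian product: every VM-opt run divided by every VM-norm run.
ratio_all_pairs <- as.vector(outer(vm_opt, vm_norm, "/"))

# Standard deviation of the ratios under each scheme.
spread <- sapply(
  list(in_order  = ratio_in_order,
       sorted    = ratio_sorted,
       all_pairs = ratio_all_pairs),
  sd
)
print(spread)
```

On simulated data like this, the sorted pairing gives a much narrower spread than the other two, because it divides matching quantiles rather than two independently drawn runs. The narrower spread comes from the pairing scheme, so it does not by itself show that sorted pairing is the right choice.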

In the end, I am really interested in plotting the distribution of ratios with R as bean or violin plots. Simple boxplots, or just mean+stddev, tell me too little about what is going on. These distributions usually point at artifacts that are produced by the complex interactions on these much too complex computers, and that's what I am interested in.

Any pointers to approaches for working with and producing such ratios correctly are very welcome.

PS: This is a repost, the original was posted at https://stats.stackexchange.com/questions/15947/how-to-normalize-benchmark-results-to-obtain-distribution-of-ratios-correctly

It seems your question has two (or more) components: 1: What statistical methods could be used here, 2: How can I visualize this data? I was about to answer the statistical question when I got to the visualization question. Can you refine this? As a visualization answer is already given, then perhaps splitting off the statistical question would be better. Maybe the statistics.SE question should be revised in this manner. — Iterator, Sep 26 '11 at 01:46

score 0 · Answer 1 · edited May 23 '17 at 10:30

I found it puzzling that you got such a minimal response on "Cross Validated". This does not seem like a specific R question, but rather a request for how to design an analysis. Perhaps the audience there thought you were asking too broad a question, but if that is the case then the [R] forum is even worse, since we generally tackle problems where data is actually provided. We deal with the requests for implementation construction in our language. I agree that violin plots are preferred to boxplots for the examination of distributions (when there is sufficient data and I am not sure that 100 samples per group makes the grade in that instance), but in any case that means the "R answer" is that you just need to refer to the proper R help page:

library(lattice)
?xyplot
?panel.violin
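As a minimal sketch of how those help pages translate into a plot: `panel.violin` is typically passed as the panel function to `bwplot()`. The data frame `ratios` and its configuration labels below are made up for illustration, and the ratios are simulated rather than measured.

```r
library(lattice)

# Hypothetical data: simulated VM-opt/VM-norm ratios for two made-up
# configurations. Real ratios would replace this data frame.
set.seed(1)
ratios <- data.frame(
  config = factor(rep(c("A, 1 core", "A, 2 cores"), each = 100)),
  ratio  = c(rlnorm(100, log(0.8), 0.10), rlnorm(100, log(0.7), 0.15))
)

# bwplot() sets up one row per configuration; panel.violin draws a kernel
# density estimate for each group in place of the usual box.
p <- bwplot(config ~ ratio, data = ratios, panel = panel.violin,
            xlab = "VM-opt / VM-norm runtime ratio")
print(p)
```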

Further comments would require more details and preferably some data examples constructed in R. You may want to refer to the page where "great question design is outlined".

One further graphical method: If you are interested in the ratios of two paired variates but do not want to "commit" to just x/y, then you can examine them by plotting one against the other and then adding iso-ratio lines by repeatedly using abline(a=0, b= ). I think 100 samples is pretty "thin" for doing density estimates, but there are 2d density methods if you can gather more data.
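A rough sketch of that iso-ratio idea in base graphics, assuming simulated runtimes paired in measurement order. The vectors `vm_norm` and `vm_opt` and the chosen slopes are illustrative only.

```r
# Hypothetical data: simulated runtimes (ms), paired in measurement order.
set.seed(42)
vm_norm <- rlnorm(100, meanlog = log(100), sdlog = 0.10)
vm_opt  <- rlnorm(100, meanlog = log(80),  sdlog = 0.10)

# Use the same limits on both axes, starting at 0, so that lines through
# the origin are visible and slopes are directly comparable.
lim <- range(0, vm_norm, vm_opt)
plot(vm_norm, vm_opt, xlim = lim, ylim = lim,
     xlab = "VM-norm runtime (ms)", ylab = "VM-opt runtime (ms)")

# Each line y = b * x marks the points where VM-opt/VM-norm equals b.
iso_ratios <- c(0.6, 0.8, 1.0, 1.2)  # arbitrary illustrative slopes
for (b in iso_ratios) {
  # Solid line for ratio 1 (no speedup), dashed for the others.
  abline(a = 0, b = b, lty = if (b == 1) 1 else 2)
}
```

Points below the solid line are pairs in which the VM-opt run was faster than the VM-norm run it was paired with.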

Thanks, but I am not having trouble with using R. (See for an example: http://www.stefan-marr.de/2011/09/using-r-to-understand-benchmarking-results/) I am having a problem with the underlying statistics knowledge. And my limited knowledge in that area does not provide me with the right google keywords. My question is only about how to handle ratios, and how to calculate them correctly. — smarr, Sep 25 '11 at 20:10
If this is not a question about how to code a specific problem in R then you are in the wrong place. — IRTFM, Sep 25 '11 at 20:19
I had just hoped that some of the R users could point me to the relevant statistical background for what is, from my perspective, a basic problem. What would be a better place to ask such a question? stats.stackexchange.com wasn't helpful so far either. — smarr, Sep 25 '11 at 20:45
