In Snakemake I want to prevent running out of memory. In principle this is possible by specifying memory limits per rule, i.e.:
rule a:
    input: ...
    output: ...
    resources:
        mem_mb=100
    shell:
        "..."
I was wondering about best practices for working out sensible values. For the sake of argument, say the input size is constant and independent of the number of threads, so each run is expected to stay under a constant upper limit.
A potential approach is to benchmark this rule and take values from there (plus some safety margin). An example output of such a benchmark looks like this:
s h:m:s max_rss max_vms max_uss max_pss io_in io_out mean_load cpu_time
92.3651 0:01:32 209.23 15008.50 136.93 148.72 0.05 0.22 302.65 279.61
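One way to turn such a benchmark file into a `mem_mb` value would be to parse the TSV and apply a margin on top of `max_rss` (resident set size), which tends to reflect actual RAM use better than `max_vms`. A minimal sketch, where the inlined file contents and the safety factor are assumptions for illustration:

```python
import csv
import io
import math

# Hypothetical benchmark output (tab-separated), as written by
# Snakemake's `benchmark:` directive; values copied from above.
benchmark_tsv = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\t"
    "io_in\tio_out\tmean_load\tcpu_time\n"
    "92.3651\t0:01:32\t209.23\t15008.50\t136.93\t148.72\t"
    "0.05\t0.22\t302.65\t279.61\n"
)

# Read the first (and only) benchmark row into a dict keyed by column name.
row = next(csv.DictReader(io.StringIO(benchmark_tsv), delimiter="\t"))

# max_rss is reported in MB; multiply by an arbitrary safety factor
# and round up to get a candidate for the rule's mem_mb resource.
safety_factor = 1.5
mem_mb = math.ceil(float(row["max_rss"]) * safety_factor)
print(mem_mb)  # 314
```

With repeated runs one would take the maximum `max_rss` across rows before applying the margin.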
The values are explained in a related thread here, but I was wondering if this is the right approach and, if so, which value to base it on. I.e., if max_vms
(maximum "Virtual Memory Size") were a good proxy, would mem_mb
have to be 15008?