
Shake builds things in parallel when possible, but what happens if an individual build step is itself parallelizable? For example, I'm running BLAST commands. Each command compares two species' genomes. Several comparisons could be run in parallel, but there's also a flag to split a single comparison into chunks and run those in parallel. Do I need to pick one way of splitting the jobs up and stick with it, or can I tell Shake "Use N threads overall, and by the way each of these specific tasks takes up M threads on its own"?

(This comes up when comparing many small bacterial genomes and a few bigger eukaryotic ones.)

EDIT: the question can be simplified to "how to tell how many Shake threads are currently running/queued from within Shake?"

jefdaj
  • Like Make and most (all?) comparable build systems, Shake as such doesn't know, care, or influence how many threads are used by some process it invokes. If that `BLAST` program needs to be told through a flag to parallelise, then you need to pass that flag from Shake. Of course, whether this is a good idea is another question – if you already run many of these commands in parallel, you likely won't win much with finer-grained parallelism. Is that your question – how to tell from within Shake how many commands are running, and thus decide how parallel each should be? – leftaroundabout Jun 19 '17 at 09:45
  • Thanks, that clarifies it a bit. I guess that is my question: how to set `--n_threads` for a new process based on the number already running? Or better yet, the number queued to be run. – jefdaj Jun 19 '17 at 17:08

1 Answer


No, but there is a ticket to add it: https://github.com/ndmitchell/shake/issues/603
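
For the original form of the question (one overall budget, with some tasks counting for more than one thread), Shake's resources (`newResource` / `withResource`) may be a closer fit than inspecting the thread pool. Below is a minimal sketch, not a tested recipe. The pool size of 8, the `*.hits` / `*.fasta` file names, the `genomes` database, and the `threadsFor` helper are all illustrative assumptions, and the `blastn` command line is only a placeholder for whatever BLAST invocation you actually use:

```haskell
import Development.Shake
import Development.Shake.FilePath

-- Hypothetical policy: the one big comparison gets 4 threads, everything else 1.
threadsFor :: FilePath -> Int
threadsFor out = if takeBaseName out == "large" then 4 else 1

main :: IO ()
main = shakeArgs shakeOptions{shakeThreads = 8} $ do
    -- A finite pool of 8 units that rules must acquire before running BLAST.
    cpus <- newResource "cpus" 8

    want ["small.hits", "large.hits"]

    "*.hits" %> \out -> do
        let query = out -<.> "fasta"
            n     = threadsFor out
        need [query]
        -- Block until n units are free, then run BLAST with that many threads.
        withResource cpus n $
            cmd_ "blastn"
                ["-query", query, "-db", "genomes", "-out", out, "-num_threads", show n]
```

The thread count here is fixed per task up front; it does not adapt to how many jobs are currently running or queued, which is what the ticket is about. For the budget to hold, every rule that launches a multi-threaded command has to acquire from the same resource.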

Neil Mitchell