In a Snakemake workflow, I would like to run a rule without it triggering any of the rules that produce its input.

A sample scenario is as follows: I have a rule A that is costly and produces many output files from input files:

rule A:
  input: "{name}.in"
  output: "{name}.out"
  shell: "touch {input} {output}" #just a dummy, replace with actual costly task

A second rule B takes the output files and uploads them to a server:

rule B:
  input: "{name}.out"
  output: touch("{name}.up")
  shell: "curl -F 'data={input}' http://google.com/upload"

The third rule C is the usual "all"-style target rule: it requests the final outputs and thereby triggers every rule upstream of it:

names = ["x1","x2","x3"] # dummy for long list
rule C:
  input: expand("{name}.up",name=names)

Assume rule B failed, so that the expensive rule A has completed but rule B has not.

I would like to run only rule B and rule C, without triggering rule A.

The problem is that, for some reason, rule A always runs even though its output files (x1.out and so on) are already present. This shouldn't happen, but it does.

I'm now looking for a Snakemake CLI option that allows me to prevent rule A from being run.

I found the CLI option --until, which does exactly the opposite: it runs all rules up to a certain rule. I would like something like --from, which starts at B and fails if the inputs cannot be found.

I don't know exactly why rule A gets triggered. The input files have not been updated, yet A is run anyway. (The real workflow is much more complicated; the above is heavily simplified.)

In short: is there a CLI option that allows me to specify a rule that should be run, including all downstream rules, but none of the upstream rules? Or is this impossible?

Cornelius Roemer
  • I would try to find out why rule A is triggered since that seems to be the actual problem. For experimenting, you could use the `touch` command to make the input of rule B older than its output. See also https://stackoverflow.com/a/56807933/1114453 – dariober Aug 10 '21 at 08:03
  • @dariober I did try to find out what happened but couldn't figure it out and was hoping there was another way. It could be due to a rule being there that has no input? I touched all of the intermediary output, but it was still not enough. – Cornelius Roemer Aug 10 '21 at 13:44
  • [`--reason` flag](https://stackoverflow.com/a/52426853/3998252) would be helpful to figure out why. – Manavalan Gajapathy Aug 10 '21 at 14:38
  • `touch {input} {output}` updates the time of input. That may be the cause. – Dmitry Kuzminov Aug 10 '21 at 15:30
  • @ManavalanGajapathy I tried; it turned out that some input had changed. I can't investigate further since the problem was solved in another way. @DmitryKuzminov I abstracted a lot of the details away; the actual workflow is a lot more complicated. – Cornelius Roemer Aug 10 '21 at 15:35
  • Try the `-r` option, which prints the reasons why rules are being rerun (in combination with `-n` to make it a dry run, of course). – Eric C. Aug 11 '21 at 08:11
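
The timestamp suggestions in the comments above can also be checked outside Snakemake. The following is only a minimal sketch, assuming the rerun is caused by file modification times (Snakemake can schedule a job for other reasons too, which this does not cover). The helper name `stale_outputs` is hypothetical, and the file pairs simply mirror the simplified rules A and B from the question.

```python
from pathlib import Path


def stale_outputs(pairs):
    """Return the (input, output) pairs whose output is missing or older than its input.

    This only mimics a plain modification-time comparison; it is not
    Snakemake's actual scheduling logic.
    """
    stale = []
    for src, dst in pairs:
        src_path, dst_path = Path(src), Path(dst)
        if not src_path.exists():
            continue  # no input on disk, so there is nothing to compare against
        if not dst_path.exists() or dst_path.stat().st_mtime < src_path.stat().st_mtime:
            stale.append((str(src), str(dst)))
    return stale


if __name__ == "__main__":
    names = ["x1", "x2", "x3"]  # same dummy list as in the question
    # Rule A maps {name}.in -> {name}.out, rule B maps {name}.out -> {name}.up
    pairs = [(f"{n}.in", f"{n}.out") for n in names]
    pairs += [(f"{n}.out", f"{n}.up") for n in names]
    for src, dst in stale_outputs(pairs):
        print(f"{dst} is missing or older than {src}")
```

Run from the workflow directory, this lists every output that a pure timestamp check would consider out of date. If an `.out` file shows up as older than its `.in` file, timestamps alone would explain rule A being scheduled; if only the `.up` files are listed, the cause presumably lies elsewhere.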
