Avoid duplicating text. Don't use params unless you convert your input/outputs to wildcards + extentions. Otherwise you're left with a rule that is hard to maintain.
input:
"{pathDIR}/{genome}.txt"
output:
"{pathDIR}/{genome}.msh"
params:
dir: '{pathDIR}/{genome}'
Otherwise, use Python's slice notation.
I couldn't seem to get slice notation to work in the params using the output wildcard. Here it is in the run directive.
from subprocess import call
rule sketch:
input:
'out/genomes.txt'
output:
'out/genomes.msh'
run:
callString="mash sketch -l " + str(input) + " -k 31 -s 100000 -o " + str(output)[:-4]
print(callString)
call(callString, shell=True)
Python underlies Snakemake. I prefer the "run" directive over the "shell" directive because I find it really unlocks a lot of that beautiful Python functionality. The accessing of params and various things are slightly different that with the "shell" directive.
E.g.
callString=config["mpileup_samtoolsProg"] + ' view -bh -F ' + str(config["bitFlag"]) + ' ' + str(input.inputBAM) + ' ' + wildcards.chrB2M[1:]
A bit of a snippet of J.K. using the run directive.
All of the rules in my modules pretty much use the run directive