
This seems like a basic question but I keep getting some variation of the error: No values given for wildcard.

I have a group of 22 files named Ne-sQTL_perind.counts.gz.qqnorm_chr{#}.gz. I would like to act on them in a rule. What I have originally looks like this:

rule QTLtools_filter:
    input:
        file=expand("Ne-sQTL_perind.counts.gz.qqnorm_chr{i}.gz",i=range(1,22)),
        chk=".prepare_phen_table.chkpnt"
    output:
        expand("{input.file}.qtltools")
    message:
        "Making phenotype files QTLtools compatible..."
    shell:
        "cat {input.file} | awk '{ $4=$4\" . +\"; print $0 }' | tr " " \"\t\" | bgzip -c > {input.file}.qtltools"

However, I get the error No values given for wildcard 'input', which is confusing to me, because the docs have a clear example of this working with the wildcard replicates. How do I expand this wildcard so that it includes all files numbered 1 through 22? I've also tried defining a function to do this for me, at the suggestion of this SO post, to no avail; I still get the same error message.

def expandChromo(wildcards):
    return expand("Ne-sQTL_perind.counts.gz.qqnorm_chr{i}.gz",i=range(1,22))
...
rule QTLtools_filter:
    input:
        expandChromo,
        chk=".prepare_phen_table.chkpnt"
    output:
        expand("{wildcards.expandChromo}.qtltools")
    message:
        "Making phenotype files QTLtools compatible..."
    shell:
        "cat {wildcards.expandChromo} | awk '{ $4=$4\" . +\"; print $0 }' | tr " " \"\t\" | bgzip -c > {wildcards.expandChromo}.qtltools"
Dmitry Kuzminov
CelineDion

1 Answer


You need two rules. The first one (let's call it all) has no output, but it states clearly what you want the pipeline to produce:

rule all:
    # range() excludes its upper bound, so range(1,23) covers chromosomes 1 through 22
    input: expand("Ne-sQTL_perind.counts.gz.qqnorm_chr{i}.gz.qtltools", i=range(1,23))

This would give Snakemake an idea of your 22 target files.
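As a quick sanity check on that target list, here is a minimal plain-Python sketch that builds the same filenames with a list comprehension. It only imitates `expand` for a single wildcard and does not import Snakemake; the names `pattern` and `targets` are introduced here for illustration. The point to notice is that Python's `range` excludes its upper bound, so covering chromosomes 1 through 22 takes `range(1, 23)`.

```python
# Filename pattern for the desired outputs; {i} is the chromosome number.
pattern = "Ne-sQTL_perind.counts.gz.qqnorm_chr{i}.gz.qtltools"

# range(1, 23) yields 1, 2, ..., 22 (the upper bound is excluded).
targets = [pattern.format(i=i) for i in range(1, 23)]

print(len(targets))   # number of target files
print(targets[0])     # first target (chromosome 1)
print(targets[-1])    # last target (chromosome 22)
```

If the count printed here is not the number of files you expect, the same off-by-one will show up in the targets Snakemake tries to build.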

Now you can teach Snakemake to create those files:

rule QTLtools_filter:
    input:
        "{file}.gz"
    output:
        "{file}.gz.qtltools"
    message:
        "Making phenotype files QTLtools compatible..."
    shell:
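        # Braces meant for awk are doubled so Snakemake does not treat them as placeholders;
        # inner double quotes are escaped so the Python string stays intact.
        "cat {input} | awk '{{ $4=$4\" . +\"; print $0 }}' | tr \" \" \"\t\" | bgzip -c > {output}"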

Note that this rule takes a single file as input and produces a single file as output; the wildcard lets Snakemake match that pair for each i in your range. I didn't see a reason to list chk=".prepare_phen_table.chkpnt" as an input, but you can add it back if it's needed.

Dmitry Kuzminov
  • Thank you! Where is `{file}` defined? – CelineDion Aug 22 '19 at 14:17
  • @CelineDion `{file}` is not defined anywhere; it is a wildcard. When Snakemake finds that it needs a particular file, it looks through the `output` sections of the rules for a pattern that matches the filename exactly. When it finds one (and there is no ambiguity), it derives the value of each wildcard needed to satisfy the match. – Dmitry Kuzminov Aug 22 '19 at 14:51
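
To make the wildcard matching described in the comments a bit more concrete, here is a rough plain-Python sketch of the idea. It is not Snakemake's actual implementation: the helper `match_output` is hypothetical, and letting each wildcard match `.+` is an assumption about the default behaviour. It turns an output pattern into a regular expression, derives the wildcard value from a requested filename, and then fills that value into the input pattern.

```python
import re


def match_output(pattern, target):
    """Return a dict of wildcard values if target matches pattern, else None.

    Hypothetical helper for illustration only; each {name} placeholder is
    assumed to match one or more arbitrary characters.
    """
    # re.split with a capturing group alternates literal text (even indices)
    # and wildcard names (odd indices).
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for idx, part in enumerate(parts):
        if idx % 2 == 0:
            regex += re.escape(part)        # literal text must match exactly
        else:
            regex += f"(?P<{part}>.+)"      # wildcard becomes a named group
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None


output_pattern = "{file}.gz.qtltools"
input_pattern = "{file}.gz"

# A file requested by rule all:
target = "Ne-sQTL_perind.counts.gz.qqnorm_chr7.gz.qtltools"

wildcards = match_output(output_pattern, target)
required_input = input_pattern.format(**wildcards)

print(wildcards)
print(required_input)
```

In this sketch the value of `file` is never assigned by hand; it falls out of matching the requested filename against the output pattern, which is why nothing in the Snakefile needs to define it.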