
I am writing this rule:

rule process_files:
    input: 
        dataout=expand("{{dataset}}/{{sample}}.{{ref}}.{{state}}.{{case}}.myresult.{name}.tsv", name=my_list[wildcards.ref]) 
    output:
        "{dataset}/{sample}.{ref}.{state}.{case}.endresult.tsv"
    shell:
        do something ...

Here expand should take its values from the dictionary my_dictionary, based on the value of ref. I accessed the wildcard like this: my_dictionary[wildcards.ref]. But it ends up with this error: name 'wildcards' is not defined

my_dictionary looks something like: {A: [1, 2, 3], B: [s1, s2, ..], .....}

I could use

def myfun(wildcards):
    return expand("{{dataset}}/{{sample}}.{{ref}}.{{state}}.{{case}}.myresult.{name}.tsv", name=my_dictionary[wildcards.ref])

and use myfun as the input, but this does not explain why I cannot use expand directly in place.

Any suggestions on how to fix it?

Medhat

2 Answers


As @dariober mentioned, there is a wildcards object, but it is only directly accessible in the run/shell portion of a rule. In input, it can be accessed through an input function.

Here is an example implementation that will expand the input based on the wildcards.ref:

rule all:
    input: expand("{dataset}/{sample}.{ref}.{state}.{case}.endresult.tsv", dataset=["D1", "D2"], sample=["S1", "S2"], ref=["R1", "R2"], state=["STATE1", "STATE2"], case=["C1", "C2"])


my_list = {"R1": [1, 2, 3], "R2": ["s1", "s2"]}

rule process_files:
    input:
        lambda wildcards: expand(
            "{{dataset}}/{{sample}}.{{ref}}.{{state}}.{{case}}.myresult.{name}.tsv", name=my_list[wildcards.ref])
    output:
        "{dataset}/{sample}.{ref}.{state}.{case}.endresult.tsv"
    shell:
        "echo '{input}' > {output}"

If you implement it as the lambda function example above, it should resolve the issue you mention:

The function worked, but it did not resolve the variables between double curly braces, so it will ask for input for {dataset}/{sample}.{ref}.{state}.{case} and raise an error.
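To make the double-brace behaviour concrete, here is a minimal sketch in plain Python. It does not use Snakemake: expand_sketch is a hypothetical, simplified stand-in for expand built on str.format, and wc is a hypothetical stand-in for the wildcards object. It only shows what the string looks like after one formatting pass, and how filling every field inside the function leaves no placeholders behind.

```python
from itertools import product
from types import SimpleNamespace


def expand_sketch(pattern, **values):
    """Hypothetical, simplified stand-in for expand (not Snakemake's code):
    returns one formatted string per combination of the given values."""
    keys = list(values)
    return [
        pattern.format(**dict(zip(keys, combo)))
        for combo in product(*(values[k] for k in keys))
    ]


my_list = {"R1": [1, 2, 3], "R2": ["s1", "s2"]}

# Hypothetical stand-in for the wildcards object of one job.
wc = SimpleNamespace(dataset="D1", sample="S1", ref="R1", state="STATE1", case="C1")

# Double braces survive the first pass as single-brace placeholders.
templated = expand_sketch(
    "{{dataset}}/{{sample}}.{{ref}}.{{state}}.{{case}}.myresult.{name}.tsv",
    name=my_list[wc.ref],
)

# Filling every field in one pass leaves nothing to be resolved later.
concrete = [
    "{dataset}/{sample}.{ref}.{state}.{case}.myresult.{name}.tsv".format(
        name=name, **vars(wc)
    )
    for name in my_list[wc.ref]
]

print(templated[0])
print(concrete[0])
```

If the remaining {dataset}-style placeholders are not filled in afterwards in your setup, the second form, which passes each wildcard value explicitly, sidesteps the question.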

JohnnyBD
  • Actually, my function is the same as your lambda function and raises this error: `def myfun(wildcards): return expand("{{dataset}}/{{sample}}.{{ref}}.{{state}}.{{case}}.myresult.{name}.tsv", name=my_list[wildcards.ref])`. To overcome the issue I need to resolve each variable explicitly; for example, `ref` would be `wildcards.ref`, and so on. – Medhat Nov 14 '18 at 19:38
  • There should not really be a need to do that. Are you saying that you pass to expand, in the case of `{dataset}`, `dataset = wildcards.dataset`? That seems redundant. I am using snakemake 5.3.0 in the example and it works using your `myfun` or the lambda. – JohnnyBD Nov 14 '18 at 20:43
  • The issue appears after using expand: the value passed to `sample` is `{sample}`, so it would be `sample={sample}` rather than the actual value of sample. This causes a problem in the next processing step, because there is no input file called `{dataset}/{sample...}`. – Medhat Nov 14 '18 at 21:23
  • I am sorry, but I cannot seem to reproduce the issue you are mentioning. Could you edit your question and provide an example of what the input would look like for one input wildcard combination? Either I am misunderstanding what you are trying to do, or our implementations are different. You want to have a single value for all the wildcards except `name`, essentially grouping a set of `name` inputs together? In that case you should have `{sample}` in the result of expand, as that wildcard will be deduced from `rule all` and output. – JohnnyBD Nov 14 '18 at 22:26

Your question seems similar to "snakemake wildcards or expand command", and the bottom line is that wildcards is not defined in input. So your solution of using an input function (or a lambda function) seems correct.

(As to why wildcards is not defined in input, I don't know...)
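One likely explanation is ordinary Python evaluation order. An expression written directly under input is evaluated as soon as the rule is read, before any job exists, whereas a function body only runs when it is called with a wildcards argument. The sketch below reproduces this in plain Python without Snakemake; eager_lookup and deferred_lookup are hypothetical names used for illustration only.

```python
from types import SimpleNamespace

my_dictionary = {"A": [1, 2, 3], "B": ["s1", "s2"]}


def eager_lookup():
    # Mimics writing my_dictionary[wildcards.ref] directly under input:
    # no variable named `wildcards` exists at this point, so Python
    # raises NameError when the expression is evaluated.
    return my_dictionary[wildcards.ref]


def deferred_lookup(wildcards):
    # Mimics an input function: `wildcards` is a parameter that the
    # caller supplies later, once the wildcard values are known.
    return my_dictionary[wildcards.ref]


try:
    eager_lookup()
    error_message = None
except NameError as exc:
    error_message = str(exc)

# SimpleNamespace is a hypothetical stand-in for the wildcards object.
result = deferred_lookup(SimpleNamespace(ref="A"))

print(error_message)
print(result)
```

The eager version fails with the same message as in the question, while the deferred version returns the list for the given ref.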

dariober
  • Thanks. The function worked, but it did not resolve the variables between double curly braces, so it will ask for input for `{dataset}/{sample}.{ref}.{state}.{case}` and raise an error. – Medhat Nov 14 '18 at 16:37