I have been struggling for some time to build a workflow with many inputs and a single output, as shown below. The code works for a moderate number of inputs, but when there are too many input files the concatenate step invariably fails:

rule generate_text:
    input:
        "data/{name}.csv"
    output:
        "text_files/{name}.txt"
    shell:
        "somecommand {input} -o {output}"

rule concatenate_text:
    input:
        expand("text_files/{name}.txt", name=names)
    output:
        "summaries/summary.txt"
    shell:
        "cat {input} > {output}"

I have done some digging and found that this is attributable to a limit on the number of characters that can be passed in a single command. The number of inputs I work with keeps growing, so the above solution is not scalable.
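To make the size of the problem concrete, here is a rough sketch that estimates how long the expanded `cat` command gets. The sample names are hypothetical stand-ins for `names`, and the 131072-byte figure is an assumption: it is the per-argument cap on Linux (`MAX_ARG_STRLEN`), which would apply if the whole command is handed to the shell as one string. Other systems may differ.

```python
import os

# Assumption: Linux caps a single argument at 131072 bytes (MAX_ARG_STRLEN).
# This matters if the whole command is passed as one string to the shell.
ASSUMED_SINGLE_ARG_LIMIT = 131072


def command_length(template, inputs, output):
    """Return the byte length of the command after substituting the file lists."""
    command = template.format(input=" ".join(inputs), output=output)
    return len(command.encode("utf-8"))


def total_arg_limit():
    """Return the OS limit on total argument size, or None if it cannot be queried."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None


# Hypothetical sample names, standing in for the `names` list in the workflow.
names = [f"sample_{i:06d}" for i in range(10000)]
inputs = [f"text_files/{name}.txt" for name in names]

length = command_length("cat {input} > {output}", inputs, "summaries/summary.txt")
print("command length in bytes:", length)
print("exceeds assumed single-argument limit:", length > ASSUMED_SINGLE_ARG_LIMIT)
print("total argument limit reported by the OS:", total_arg_limit())
```

With a few thousand inputs of this path length, the command string alone runs to hundreds of kilobytes, which would explain why the step only fails once the input list grows.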

Can anybody please propose any solutions to this issue? I haven't been able to find any online.

Ideally the solution wouldn't be limited to cat or other shell commands, and could be used within a rule so that --use-conda still applies. My current fix uses an onsuccess handler, shown below, but this doesn't allow the use of --use-conda and rule-specific conda environments.

One handy thing about the shell command is that you can feed it Snakemake variables, but it's not quite flexible enough for my purposes due to the aforementioned conda issue.

onsuccess:
    shell("cat text_files/*.txt > summaries/summary.txt")
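For comparison, here is a minimal sketch of the kind of in-rule alternative I have in mind: a small Python script that concatenates the files itself, so the list of paths never goes through a command line. The file name `scripts/concatenate.py` and the function name `concatenate` are hypothetical. The assumption is that the rule would point at it with `script:` instead of `shell:`, since Snakemake injects a `snakemake` object into such scripts and `script:` rules can carry their own `conda:` environment.

```python
import shutil
from pathlib import Path


def concatenate(inputs, output):
    """Append each input file to `output` without passing paths through a shell."""
    output_path = Path(output)
    # Create the output directory if it does not exist yet.
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                # Stream in chunks so large files are not loaded into memory.
                shutil.copyfileobj(in_handle, out_handle)


# When run through a `script:` directive, Snakemake injects `snakemake`.
# Outside Snakemake this guard is False and nothing runs on import.
if "snakemake" in globals():
    concatenate(snakemake.input, snakemake.output[0])
```

This is not tested at scale; it only shows that the concatenation itself does not need a shell command.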
  • Does this answer your question? [Snakemake cannot handle very long command line?](https://stackoverflow.com/questions/64073269/snakemake-cannot-handle-very-long-command-line) – Maarten-vd-Sande Mar 01 '21 at 09:37
  • Thanks for the link, but no, that didn't work for me. Snakemake 6.0 was apparently released just two days ago, and it adds a fix for this. I will give it a try! Snakemake changelog [6.0.0] - 2021-02-26 - Use temporary files for long shell commands (@epruesse). https://github.com/snakemake/snakemake/pull/856 – Max Cummins Mar 01 '21 at 09:59
  • I added an answer to https://stackoverflow.com/questions/64073269/snakemake-cannot-handle-very-long-command-line using Snakemake 6 (sorry, I don't mean to take someone else's credit). I guess this question can be marked as a duplicate? – dariober Mar 02 '21 at 09:42
