I have been searching for a solution to this for some time now, and this thread is the closest I got, but I could not get it working with my setup.
What I want to do:
I have one text file where every line has an ID and a data point:
1234 data2
5678 data3
...
I want to collect the lines that correspond to certain IDs, which I have in my config file, and write them to their own files, each named after its ID value (1234 or 5678):
# config.yaml
IDs:
  ID1: 1234
  ID2: 5678
When I did this without snakemake, I just looped over the list of IDs in my bash script and grepped the text file for them, but I just cannot accomplish this with snakemake.
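To make the goal concrete, here is a minimal Python sketch of that loop-and-grep behaviour. The function name `split_by_id` and its arguments are hypothetical, and it assumes the ID is the first whitespace-separated field on each line; it only illustrates the result I am after and is not the Snakemake solution.

```python
from pathlib import Path


def split_by_id(data_file, ids, out_dir="."):
    """Write every line whose first field is a wanted ID to <out_dir>/<ID>.

    Hypothetical helper: assumes the ID is the first whitespace-separated
    field of each line, as in the "1234 data2" example.
    """
    wanted = {str(i) for i in ids}
    collected = {i: [] for i in wanted}

    # Single pass over the (possibly large) text file.
    with open(data_file) as handle:
        for line in handle:
            fields = line.split(maxsplit=1)
            if fields and fields[0] in wanted:
                collected[fields[0]].append(line)

    # One output file per ID, named after the ID value.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for id_, lines in collected.items():
        (out_dir / id_).write_text("".join(lines))
    return collected
```

Calling `split_by_id("data.txt", [1234, 5678])` would then create one file named `1234` and one named `5678` in the current directory, where `data.txt` is a placeholder for the real input file.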
Either I have an issue with wildcards in the target, or my expand function passes all of the IDs to the grep command in the shell, or, when following the accepted answer in the linked thread, I get "missing input files for rule all: And_Laa A_log". I can share what I have now, but I think the correct way to do this is so far removed from it that it may just confuse everyone:
configfile: "config.yaml"

# Trying to replicate stackoverflow answer
speakers = {
    "1": "And_Laa",
    "2": "A_log"
}

def get_speaker(wildcards):
    # return expand("{speaker}", speaker=config["speakers"])
    return speakers[wildcards.speaker]

rule all:
    input:
        # expand("{speaker}_wav-list", speaker=config[speakers])
        expand("{speaker}", speaker=speakers.values())

# Selecting all the audiofiles for the speakers from a very large file
rule select_speaker_files:
    input:
        wav=config["files"]["wavs"]
    output:
        speaker="{speaker}_wav-list"
    params:
        speaker=get_speaker,
    shell:
        'grep "{params.speaker}" {input.wav} > {output.speaker}'