
I have 15 files, all named ******_filteredSNPs.txt, where the starred part is an individual name that differs for each file. How do I list all these files?

I need each output file to start with the same individual name but end in _clumped.

E.g.

cd /data/PRS/

i=$PBS_ARRAYID

file="${i}_filteredSNPs.txt"
out="${i}_clumped"


./plink \
    --bfile filesforPRS \
    --clump-p1 1 \
    --clump-r2 0.1 \
    --clump-kb 250 \
    --clump ${file} \
    --clump-snp-field ID \
    --clump-field P \
    --out ${out}

I am trying the above, but I get an error because plink fails to open my input files.

John Kugelman
AvniKaur
  • What will work depends on your operating system and shell; Windows `cmd` has wildly different handling of wildcards than Unix shells, for example. There are also many incompatible Unix shells like Fish and the Csh (dysfunctional :-) family. By the vague looks of your question, I'm guessing you are looking for the simple Bourne shell wildcard `*_filteredSNPs.txt` though it's not at all clear what the requirements of `--clump` are. Is this the name of an option of a command? Does it have a man page? – tripleee Jun 21 '22 at 11:20
  • Merely specifying a wildcard after the option name won't work because the shell will expand the wildcard, which looks to the calling program like the first of the expanded names is the argument to `--clump` and the rest are something else, typically just file names for the command to read. Voting to close as unclear anyway. – tripleee Jun 21 '22 at 11:22
  • Thanks for your response. Please see the edited question. I want to list a number of files to one command and the output should correspond to each input file. – AvniKaur Jun 21 '22 at 12:02
  • This requires `plink` to allow more than one `--clump` option; does it? – tripleee Jun 21 '22 at 12:49
  • Or do you mean you want to run `plink` 15 times with different pairs of arguments, like `--clump one_filtered.txt` with output to `one_clumped`, then `--clump two_filtered.txt` with output to `two_clumped`, etc? – tripleee Jun 21 '22 at 12:53
  • Possible duplicate of https://stackoverflow.com/questions/28725333/looping-over-pairs-of-values-in-bash – tripleee Jun 21 '22 at 12:54
  • I think you can specify more than one input files but I can't seem to code how to do this when the ending for each file is the same but the start of it is different. I also want the output file to match each input file – AvniKaur Jun 21 '22 at 13:18
  • Sounds like the proposed duplicate then. – tripleee Jun 21 '22 at 13:19

1 Answer

Your question remains unclear and is probably a duplicate, but I'm guessing you want one of the following. Please follow up in a comment, or edit your question to clarify it further.

Perhaps you are looking for a way to run plink on each matching file separately?

for file in *_filteredSNPs.txt; do
    ./plink \
        --bfile filesforPRS \
        --clump-p1 1 \
        --clump-r2 0.1 \
        --clump-kb 250 \
        --clump "$file" \
        --clump-snp-field ID \
        --clump-field P \
        --out "${file%_filteredSNPs.txt}_clumped"
done

Notice how double quotes (but not braces {...}) are necessary to avoid problems with file names that contain unusual characters; see "When to wrap quotes around a shell variable?". The parameter expansion ${file%_filteredSNPs.txt} yields the value of the variable with the suffix given after % removed.

This uses no Bash features, and so will work with any sh variant.
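To check the name mapping before launching any real jobs, a dry run along these lines may help. It is only a sketch: clump_prefix is a helper name made up for this illustration (it is not part of plink), and the [ -e ... ] guard is there because a wildcard that matches nothing is, by default, passed through as the literal pattern.

```shell
#!/bin/sh
# clump_prefix is a hypothetical helper, not part of plink.
# It strips the _filteredSNPs.txt suffix and appends _clumped.
clump_prefix() {
    printf '%s\n' "${1%_filteredSNPs.txt}_clumped"
}

for file in *_filteredSNPs.txt; do
    # With no matches, the loop sees the literal pattern; skip it.
    [ -e "$file" ] || continue
    # Print the input name and the output prefix that would be used
    printf '%s -> %s\n' "$file" "$(clump_prefix "$file")"
done
```

Once the printed pairs look right, the printf line can be swapped for the real plink call.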

Or, if your plink command allows more than one --clump option, and you want to add them all into the same command line, you can just interpolate them into it.

# Put beginning of command into array
# (newlines are plain whitespace inside the parentheses, so no backslashes are needed)
cmd=(./plink
    --bfile filesforPRS
    --clump-p1 1
    --clump-r2 0.1
    --clump-kb 250)
# Add matches to array
for file in *_filteredSNPs.txt; do
    cmd+=(--clump "$file")
done
# Then add tail of command
# ($out is the output prefix; it must already be set at this point)
cmd+=(--clump-snp-field ID
    --clump-field P
    --out "$out")
# Finally, execute it
"${cmd[@]}"

If you have 15 matching files, this will add the --clump option 15 times, each followed by another one of the 15 file names, then run plink once.

Arrays are a Bash feature, so this will not work portably with sh.
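If portability to plain sh matters, one common workaround is to collect the arguments in the positional parameters ("$@") instead of an array. The following is a minimal sketch of that idea as a dry run; the two file names are hypothetical placeholders, and whether plink accepts repeated --clump options remains the same open assumption as above.

```shell
#!/bin/sh
# Portable sketch: collect arguments in the positional parameters ("$@")
# instead of a Bash array. The file names are hypothetical placeholders.
set --   # start with an empty argument list
for file in sampleA_filteredSNPs.txt "sample B_filteredSNPs.txt"; do
    # Append one --clump option and its file name
    set -- "$@" --clump "$file"
done
# Dry run: print one argument per line instead of calling plink
printf '<%s>\n' "$@"
```

In a real script the loop would iterate over *_filteredSNPs.txt, and "$@" would be placed in the plink command line where the --clump options belong. The catch is that this overwrites any arguments the script itself was called with.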

tripleee