
I tried to write a loop to get all the filenames and file IDs. Here are the files:

./SRR14194206_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194206_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed
./SRR14194207_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194207_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed
./SRR14194208_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194208_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed
./SRR14194209_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194209_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed

Here is my code:

dataset=$(find -maxdepth 1 -name "*_rmdup_bowtie_hg38_sorted_bowtie_tryhard_*" | sort -V)
echo "$dataset"
./SRR14194206_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194206_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed
./SRR14194207_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194207_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed
./SRR14194208_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194208_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed
./SRR14194209_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed
./SRR14194209_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed

dataNameTail="_rmdup_bowtie_hg38_sorted_bowtie_tryhard_"
datasetID=$(basename $(echo "$dataset"| sed "s/$dataNameTail/_/g"))

Here is the error:

basename: extra operand `./SRR14194207_minus_bottom.bed'
Try `basename --help' for more information.

I wondered whether the problem was quoting, so I quoted the whole command substitution passed to basename, but then I only get a single name back instead of one for every file in $dataset:

datasetID=$(basename "$(echo "$dataset"| sed "s/$dataNameTail/_/g")")
echo "$datasetID"
SRR14194209_plus_top.bed

Any insights on what I'm doing wrong? Thank you in advance!

minhntran
  • Quotes matter. `basename $(...)` and `basename "$(...)"` are not the same; run your code through http://shellcheck.net/ and it'll point that out. – Charles Duffy Aug 31 '22 at 15:58
  • BTW, processing `find` output as a string variable is not great form -- filenames are allowed to contain newlines, and unless you're very careful with how you use that string you'll run into the bugs described in [BashPitfalls #1](https://mywiki.wooledge.org/BashPitfalls) even with less-perilous names (like ones with spaces or glob characters). Much safer to NUL-delimit the output and store it in arrays instead. – Charles Duffy Aug 31 '22 at 15:59
  • Anyhow -- `basename` only takes one filename per call; you can't pass it a string with a whole bunch of names and expect it to do the right thing. – Charles Duffy Aug 31 '22 at 16:01
  • `readarray -d '' names < <(find -maxdepth 1 -name "*_rmdup_bowtie_hg38_sorted_bowtie_tryhard_*" -printf '%P\0' | sort -zV); allBasenames=( "${names[@]##*/}" ); datasetIdNumbers=( "${allBasenames[@]%%_rmdup_bowtie*}" )` is probably along the lines of what you want. Use `declare -p names allBasenames datasetIdNumbers` to print the contents of those arrays after running the above code. (Note that to expand an array you need to use `"${arrayname[@]}"`) – Charles Duffy Aug 31 '22 at 16:02
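For reference, the array-based approach from the comments above could be sketched roughly as follows. This is only a sketch, assuming bash 4.4 or newer (for `readarray -d`) and GNU `find`/`sort` (for `-printf` and `-z`). The scratch directory `workdir` and the `touch` loop are hypothetical and exist only so the example can be run on its own; `datasetID` is the name from the question, and `names` and `datasetIdNumbers` are taken from the last comment.

```bash
#!/usr/bin/env bash
# Sketch only: recreate the question's file names in a scratch directory.
# (workdir is a hypothetical name; in real use, cd to the data directory instead.)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for id in SRR14194206 SRR14194207 SRR14194208 SRR14194209; do
  touch "${id}_rmdup_bowtie_hg38_sorted_bowtie_tryhard_minus_bottom.bed" \
        "${id}_rmdup_bowtie_hg38_sorted_bowtie_tryhard_plus_top.bed"
done

dataNameTail="_rmdup_bowtie_hg38_sorted_bowtie_tryhard_"

# Store NUL-delimited names in an array instead of one newline-joined string.
# %P prints each name without the leading "./", so basename is not needed.
readarray -d '' names < <(
  find . -maxdepth 1 -name "*${dataNameTail}*" -printf '%P\0' | sort -zV
)

# Per-element substitution: SRR14194206_minus_bottom.bed, and so on.
datasetID=( "${names[@]/$dataNameTail/_}" )

# Per-element suffix removal: keeps only the accession, e.g. SRR14194206.
datasetIdNumbers=( "${names[@]%%_rmdup_bowtie*}" )

printf '%s\n' "${datasetID[@]}"
```

Running `declare -p datasetID datasetIdNumbers` afterwards prints the contents of both arrays. Because the parameter expansions are applied to each array element separately, `basename` never receives more than one name at a time, which is what triggered the "extra operand" error in the question.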
