
I'm currently trying to get into bash regular expressions to change multiple filenames at the same time. Here are the file names:

a_001_D_xy_S37_L003_R1_001.txt
a_001_D_xy_S37_L003_R2_001.txt
a_002_D_xy_S37_L006_R1_001.txt
a_002_D_xy_S37_L006_R2_001.txt
a_003_D_xy_S23_L003_R1_001.txt
a_003_D_xy_S23_L003_R2_001.txt

I want this as my result:

 a_002_D_xy_R1.txt
 a_002_D_xy_R2.txt
 ...

I only want to rename the files that match *001.txt. I want to remove the _S.._L00. part from each filename, as well as the 001 at the end. I split this procedure into two parts:

for file in *001.txt;
do
echo ${file#_S.._L..6}
done

This loop already does not work. As a second alternative I tried:

for file in *001.fastq.gz; 
do
echo ${file/_S.._L00./}
done

but the filenames are again unchanged. (I just use echo here to see the results. If it works I will replace it with mv ${file} ${regularexpression}) Thanks for help!

ELHL

2 Answers

1

Since you need several different fields, it is probably easier to split the filename and then rebuild it the way you want.

I suggest building an array by splitting the original filename on _. Then you reconstruct the new name from just the fields you need.

  for file in *001.txt; do
      echo "FILE: $file"

      IFS='_' read -r -a fileFields <<< "$file"
      echo "FILE FIELDS: "
      for index in "${!fileFields[@]}"; do
            echo "- $index ${fileFields[index]}"
      done

      # Keep the first four fields plus the second-to-last one (R1 or R2)
      fileName="${fileFields[0]}_${fileFields[1]}_${fileFields[2]}_${fileFields[3]}_${fileFields[-2]}.txt"
      echo "NEW FILE NAME: $fileName"
      # mv -- "$file" "$fileName"
  done

The echo commands are just for debugging; you can remove them all once you understand the code.

However, if you really need to split the string using BASH expressions you can check this post: Extracting part of a string to a variable in bash or take a look at this BASH cheat sheet.
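
As a hedged sketch of the pure parameter-expansion route: these expansions match shell glob patterns, not regular expressions, so a single character is written as ? rather than a dot. That is likely why ${file/_S.._L00./} left the names unchanged. The helper name short_name is made up for illustration, and the patterns assume the _S??_L00?_ layout shown in the question.

```shell
#!/usr/bin/env bash

# short_name is a hypothetical helper: it prints the shortened name
# for one file name and does not rename anything.
short_name() {
    local name=$1
    # Drop the first "_S??_L00?" run; ? matches exactly one character.
    name=${name/_S??_L00?/}
    # Drop the trailing "_001" while keeping the extension.
    name=${name%_001.txt}.txt
    printf '%s\n' "$name"
}

for file in *_001.txt; do
    [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
    echo "$file -> $(short_name "$file")"
    # mv -- "$file" "$(short_name "$file")"
done
```

Once the echoed names look right, uncomment the mv line to perform the rename.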

Cristian Ramon-Cortes
-1

Try writing a function. First you have to determine the number (n) of files.

n=$(ls *_001.txt | wc -l)

functionRename(){

  for(( i=1; i <=n; i++))
    do
    file=$(ls *_001.txt | head -n $i | tail -n 1)
    mv "${file}" "${file%_S??_*}${file#???????????????????}"
    file2=$(ls *_001.txt | head -n $i | tail -n 1)
    mv "${file2}" "${file2%_001*}.txt"
    done
}

functionRename
benn
  • This reads like a compilation of bad scripting practices. Try http://shellcheck.net/ to get rid of at least the worst offenses. – tripleee Nov 21 '17 at 09:58
  • Thanks for your comment, please explain what offenses you mean. I didn't try to offend anybody. – benn Nov 21 '17 at 11:16
  • Nothing personal! (-: Click the link and paste in your script and it will tell you about the quoting problems and portability errors. I don't think it will warn about the extremely clunky and inefficient antipattern of repeatedly pulling an entry out of a file listing instead of simply looping over the wildcard. – tripleee Nov 21 '17 at 11:20
  • See also http://mywiki.wooledge.org/ParsingLs and more generally mywiki.wooledge.org/BashPitfalls – tripleee Nov 21 '17 at 11:24
  • Thanks, I see your point about the double quotes. However, I am not sure how to avoid the repeated ls calls by looping with the wildcard... – benn Nov 21 '17 at 11:34
  • `for file in *_001.txt; do mv "$file" "$newname"; done` just like in the other, upvoted answer on this page. – tripleee Nov 21 '17 at 12:40
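
A minimal sketch of the wildcard loop from the last comment, with newname filled in. Because the question asks about regular expressions, this version uses bash's [[ string =~ regex ]] test and the BASH_REMATCH array. The helper name new_name and the exact regex are assumptions based on the file names in the question, not something given in the thread.

```shell
#!/usr/bin/env bash

# new_name is a hypothetical helper: it prints the target name, or returns 1
# if the input does not look like <prefix>_S??_L00?_<R1|R2>_001.txt
new_name() {
    local re='^(.*)_S.._L00._(R[12])_001\.txt$'
    if [[ $1 =~ $re ]]; then
        # Group 1 is the prefix, group 2 is the read label (R1 or R2).
        printf '%s_%s.txt\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

for file in *_001.txt; do
    newname=$(new_name "$file") || continue   # skip names that do not match
    echo mv -- "$file" "$newname"             # remove "echo" to actually rename
done
```

Looping over the glob directly avoids parsing ls output, and quoting both arguments keeps names with spaces intact.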