I have multiple files in a folder and I want to shorten their names. Here are the input files:

Input

S_12_O_319_K27ac_S12818.sorted.bam
S_12_O_319_K27me3_S12815.sorted.bam
S_12_O_319_K4me1_S12816.sorted.bam
S_12_O_319_K4me3_S12817.sorted.bam
S_14_AS_11_K27ac_S12843.sorted.bam
S_14_AS_11_K27me3_S12840.sorted.bam
S_14_AS_11_K4me1_S12841.sorted.bam
S_14_AS_11_K4me3_S12842.sorted.bam
S_12_O_319_K27ac_S12818.sorted.bam.bai
S_12_O_319_K27me3_S12815.sorted.bam.bai
S_12_O_319_K4me1_S12816.sorted.bam.bai
S_12_O_319_K4me3_S12817.sorted.bam.bai
S_14_AS_11_K27ac_S12843.sorted.bam.bai
S_14_AS_11_K27me3_S12840.sorted.bam.bai
S_14_AS_11_K4me1_S12841.sorted.bam.bai
S_14_AS_11_K4me3_S12842.sorted.bam.bai

Output

S_12_O_319_K27ac.bam
S_12_O_319_K27me3.bam
S_12_O_319_K4me1.bam
S_12_O_319_K4me3.bam
S_14_AS_11_K27ac.bam
S_14_AS_11_K27me3.bam
S_14_AS_11_K4me1.bam
S_14_AS_11_K4me3.bam
S_12_O_319_K27ac.bam.bai
S_12_O_319_K27me3.bam.bai
S_12_O_319_K4me1.bam.bai
S_12_O_319_K4me3.bam.bai
S_14_AS_11_K27ac.bam.bai
S_14_AS_11_K27me3.bam.bai
S_14_AS_11_K4me1.bam.bai
S_14_AS_11_K4me3.bam.bai

Note that my files have two different extensions: one is *.bam, the other is *.bam.bai. I want to rename all of them at once to shorten the names, by removing the portion such as _S12843.sorted from each of them. Note that this portion starts at the 5th underscore, and the number following _S is different for different files; the only common pattern is the string sorted. So I would like to truncate that entire portion. How can I achieve that with bash, rename, or sed? Any help would be appreciated. I am able to remove the string sorted but not the numbers.

ivivek_ngs
  • Possible duplicate of [Batch Renaming with Bash](http://stackoverflow.com/questions/602706/batch-renaming-with-bash) – n00dl3 Mar 16 '16 at 11:19

2 Answers

Using the Perl-based `rename` utility, you can do:

rename 's/_[^_.]+\.sorted//' *.sorted.*

If you don't have `rename`, use this `for` loop in bash:

for f in *.sorted.*; do
   mv "$f" "${f/_S[[:digit:]]*.sorted}"
done
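
Before renaming for real, it can help to preview what the loop would do. The following is a minimal sketch of that idea, assuming bash; `shorten_name`, `shorten_all`, and the `DRY_RUN` variable are hypothetical names introduced here for illustration, not part of the original answer. It uses the same `${f/pattern}` expansion as the loop above and also skips any target name that already exists.

```shell
#!/usr/bin/env bash
# Sketch only: shorten_name, shorten_all and DRY_RUN are hypothetical names.

# Print the name with the first "_S<digit>...sorted" match deleted.
shorten_name() {
    printf '%s\n' "${1/_S[[:digit:]]*.sorted}"
}

# Rename every *.sorted.* file in the current directory.
# With DRY_RUN=1 (the default) the mv commands are only printed.
shorten_all() {
    local f new
    for f in *.sorted.*; do
        [ -e "$f" ] || continue            # glob matched nothing
        new=$(shorten_name "$f")
        [ "$new" != "$f" ] || continue     # pattern did not match
        if [ -e "$new" ]; then
            echo "skip $f: $new already exists" >&2
        elif [ "${DRY_RUN:-1}" = 1 ]; then
            echo "mv $f $new"
        else
            mv -- "$f" "$new"
        fi
    done
}

shorten_all
```

Run it once as-is to review the printed commands, then run it again with `DRY_RUN=0` in the environment to perform the renames.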
anubhava
  • Both of them work; I prefer the loop since it takes less time. I did not know about this `[:digit:]` thing. Can you please explain it to me? – ivivek_ngs Mar 16 '16 at 11:30
  • `[[:digit:]]` is a POSIX character class that matches a digit, like `[0-9]`, in glob or regex patterns – anubhava Mar 16 '16 at 11:32
  • Ah OK, so the only problem was identifying the `_S`. Ideally this should also work: `for f in *.sorted.*; do mv "$f" "${f/_S[0-9]*.sorted}"; done` – ivivek_ngs Mar 16 '16 at 11:52
  • Note that this is the _perl script_ `rename` found in Debian/Ubuntu and their derivatives. It is not necessarily present on non-Debian-based Linux boxes such as Arch Linux, CentOS, Slackware, etc.; on those boxes I believe util-linux's rename is used. __tl;dr__ you cannot depend on the behavior of rename. – Alexej Magura Nov 08 '16 at 15:52
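
A quick way to see how the shell reads these bracket expressions is to test a few globs against a sample string. This is only an illustrative sketch; `matches` is a hypothetical helper built on `case`, which treats an unquoted expansion as a pattern. The point to notice is that `[[:digit:]]` and `[0-9]` are equivalent here, while doubling the brackets around a range (`[[0-9]]`) means "a `[` or a digit, followed by a literal `]`", so it does not match the sample.

```shell
#!/usr/bin/env bash
# Sketch only: matches is a hypothetical helper name.

# Return success if string $1 matches glob $2.
matches() {
    case $1 in
        $2) return 0 ;;   # unquoted, so $2 is used as a pattern
        *)  return 1 ;;
    esac
}

# Try three spellings of "a digit" against the part we want to remove.
for glob in '_S[[:digit:]]*.sorted' '_S[0-9]*.sorted' '_S[[0-9]]*.sorted'; do
    if matches "_S12818.sorted" "$glob"; then
        echo "match:    $glob"
    else
        echo "no match: $glob"
    fi
done
```

The first two globs report a match and the third does not.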

This might work for you (GNU sed):

sed -r 's/^(.*)_[^_.]*\.[^.]*(.*)/mv "&" "\1\2"/e' file

or:

sed -r 's/^(.*)_[^_.]*\.[^.]*(.*)/mv "&" "\1\2"/' file | shell

where `file` holds the file names, one per line, and `shell` could be bash etc.
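
The list of names does not have to come from a prepared file. Below is a minimal sketch, assuming bash and a sed that accepts `-E`, which generates the list from the current directory and prints the `mv` commands for review; `make_mv_commands` is a hypothetical helper name. Unlike the commands above, it uses `-n` with the `p` flag so that names the regex does not match are dropped rather than passed through. It assumes file names without double quotes, `$`, backticks, or newlines, since the generated commands are meant to be read by a shell.

```shell
#!/usr/bin/env bash
# Sketch only: make_mv_commands is a hypothetical helper name.

# Read file names on stdin (one per line) and print one mv command
# for each name the regex matches; other lines are suppressed (-n ... p).
make_mv_commands() {
    sed -nE 's/^(.*)_[^_.]*\.[^.]*(.*)/mv "&" "\1\2"/p'
}

# Feed the matching names from the current directory through the helper.
for f in *.sorted.*; do
    [ -e "$f" ] && printf '%s\n' "$f"
done | make_mv_commands
```

Once the printed commands look right, appending `| bash` to the last line executes them.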

potong