I have ten directories, each containing around 10-12 BAM files. I need to use the Picard package to merge the files in each directory, and I am looking for a better way to do it than typing every input by hand.

basic command:
java -jar picard.jar MergeSamFiles \
  I=input_1.bam \
  I=input_2.bam \
  O=merged_files.bam

directory 1:
java -jar picard.jar MergeSamFiles \
  I=input_16.bam \
  I=input_28.bam \
  I=input_81.bam \
  I=input_34.bam \
  ... \
  ... \
  I=input_10.bam \
  O=merged_files.bam

directory 2:
java -jar picard.jar MergeSamFiles \
  I=input_44.bam \
  I=input_65.bam \
  I=input_181.bam \
  I=input_384.bam \
  ... \
  ... \
  I=input_150.bam \
  O=merged_files.bam

How can I supply the inputs through a variable when the file names are not sequential? I would also like to loop over the ten directories, but each one contains a different number of BAM files.

Should I use Python or R for this, or keep using a shell script? Please advise.

Peter Chung

1 Answer

Why not use samtools?

for folder in my_bam_folders/*; do
    # Quote the variables so folder names containing spaces survive
    samtools merge "$folder.bam" "$folder"/*.bam
done

In general, samtools merge can merge all the bam files in a given directory like this:

samtools merge merged.bam *.bam

EDIT: If samtools isn't an option and you have to use Picard, what about something like this?

for folder in my_bam_folders/*; do
    bamlist=$(for f in $folder/*.bam; do echo -n "I=$f " ; done)
    java -jar picard.jar MergeSamFiles $bamlist O=$folder.bam
done
Niema Moshiri
  • I got an error from samtools merge, something about adding a ReadGroup, and Picard doesn't seem to have this error – Peter Chung Dec 22 '17 at 03:21
  • I added an option that should automate creating that `I=` list for you, can you see if that works? – Niema Moshiri Dec 22 '17 at 03:27
  • The `bamlist` won't work correctly for nontrivial file names. You want to collect the file names into an array instead with `bamlist=("$folder"/*.bam)` and interpolate it with `java -jar picard.jar MergeSamFiles "${bamlist[@]/#/I=}" O="$folder.bam"` to add the `I=` prefix to each item in the array. – tripleee Dec 22 '17 at 05:54
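
Putting the array suggestion from the last comment into the loop might look something like the sketch below. It assumes bash (arrays are not available in plain sh), that `picard.jar` sits in the current directory, and that the ten directories live under `my_bam_folders/` as in the answer. The function name `merge_folder` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: merge all BAM files of one directory with Picard MergeSamFiles.
# Assumptions: picard.jar is in the current directory, and the directories to
# merge are the sub-directories of my_bam_folders/.

merge_folder() {
    local folder=$1
    # Collect the file names into an array so spaces in names stay intact.
    local bamlist=("$folder"/*.bam)
    # If the glob matched nothing it is left as a literal string; skip then.
    if [ ! -e "${bamlist[0]}" ]; then
        echo "no BAM files in $folder" >&2
        return 1
    fi
    # ${bamlist[@]/#/I=} prepends I= to every element of the array.
    java -jar picard.jar MergeSamFiles "${bamlist[@]/#/I=}" O="$folder.bam"
}

# The trailing slash makes the glob match directories only, so merged files
# written next to the directories are not picked up on a second run.
for folder in my_bam_folders/*/; do
    folder=${folder%/}
    [ -d "$folder" ] || continue
    merge_folder "$folder"
done
```

Each merged file is written beside its directory as `<directory>.bam`, so the number of BAM files per directory does not matter and nothing has to be listed by hand.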