I have a folder (split_libs) whose subfolders are named after the sample_name values in SraRunTable3.txt. In that table, column 9 holds the sample_name and column 32 holds the sra_study each sample belongs to. Each subfolder contains a seqs.fna file, whose name I unfortunately can't change because it is the output of a QIIME command.

I want to merge the seqs.fna files by sra_study, using the subfolder name (= sample_name) to look up the study, i.e. all seqs.fna files from the same SRA study should end up in one merged file.

An example overview of the directory:

split_libs
    sample_1
      seqs.fna
    sample_2
      seqs.fna
    sample_3
      seqs.fna

An example overview of the SraRunTable:

(...)Sample_Name(...)SRA_Study(...)
     sample_1        study_1
     sample_2        study_1 
     sample_3        study_2

Here's what I've tried so far:

import os
from operator import itemgetter

fields = itemgetter(9, 32)

with open('/home/andre/Desktop/PRJEB0000/SraRunTable3.txt') as csvfile:
next(csvfile)
for line in csvfile:
    sample_name, sra_study = fields(line.split())
for folder in os.listdir('./split_libs'):
    if folder == sample_name:
        open('seqs.fna') as infile, open('/home/andre/Desktop/PRJEB0000/cat_fna/' + sra_study + ".fna", 'a') as outfile:
            outfile.write(infile.read())

This question is a spin-off of "Joining files by corresponding columns in outside table".

Any contributions will be appreciated!

André Soares

1 Answer

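There is no need to loop over os.listdir: the path to each seqs.fna can be built directly from sample_name while reading the table, and the file appended to the output for its study.

from operator import itemgetter

# Zero-based indices of the sample_name and sra_study columns
fields = itemgetter(9, 32)

with open('/home/andre/Desktop/PRJEB0000/SraRunTable3.txt') as csvfile:
    next(csvfile)  # skip the header row
    for line in csvfile:
        sample_name, sra_study = fields(line.split())
        # Open the folder corresponding to sample_name and append its seqs to the matching study file
        with open('split_libs/' + sample_name + '/seqs.fna') as infile, open('/home/andre/Desktop/PRJEB0000/cat_fna/' + sra_study + '.fna', 'a') as outfile:
            outfile.write(infile.read())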

All credit goes to Amanda Clare (not registered on Stack Overflow)!
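
For anyone who wants something a bit more defensive, here is a minimal sketch of the same idea wrapped in a function. It is not part of the original solution, and it rests on a few assumptions: the function name merge_by_study and its parameters are made up for illustration, the table is assumed to be tab-separated (the code above splits on any whitespace instead, which breaks if a field contains spaces), and samples without a seqs.fna are skipped and reported rather than raising an error. It also writes each study file in 'w' mode, so re-running it does not append the same sequences twice.

```python
import os
from collections import defaultdict


def merge_by_study(table_path, split_dir, out_dir, name_col=9, study_col=32):
    """Concatenate each sample's seqs.fna into one <study>.fna file per study.

    Hypothetical helper: column indices are zero-based, and the table is
    assumed to be tab-separated with a single header row.
    Returns the list of sample names whose seqs.fna was not found.
    """
    # Group sample names by study first, so each output file is opened once.
    samples_by_study = defaultdict(list)
    with open(table_path) as table:
        next(table)  # skip the header row
        for line in table:
            cols = line.rstrip('\n').split('\t')
            if len(cols) <= max(name_col, study_col):
                continue  # skip short or blank lines
            samples_by_study[cols[study_col]].append(cols[name_col])

    os.makedirs(out_dir, exist_ok=True)
    missing = []
    for study, samples in samples_by_study.items():
        # 'w' rather than 'a', so re-running does not duplicate sequences.
        with open(os.path.join(out_dir, study + '.fna'), 'w') as outfile:
            for sample in samples:
                path = os.path.join(split_dir, sample, 'seqs.fna')
                if not os.path.isfile(path):
                    missing.append(sample)
                    continue
                with open(path) as infile:
                    outfile.write(infile.read())
    return missing
```

With the paths from the question, this would be called as merge_by_study('/home/andre/Desktop/PRJEB0000/SraRunTable3.txt', 'split_libs', '/home/andre/Desktop/PRJEB0000/cat_fna'). Note that a study whose samples are all missing still gets an empty output file.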

André Soares