I have the following lines in test.fa:
#test.fa
>1
AGAGGGAGCTG
CCTCAGGGCTG
CACTCAGGAAA
TTGGGGCGCTG
AGCATGGGGGG
CAGGAGGGGCC
I need to ignore the lines starting with ">" and concatenate the remaining lines into one single string. The following script, however, skips not only the lines starting with ">" but also the line right after each of them before concatenating the rest.
#!/usr/bin/env python
import sys
import re
string = ""
with open("test.fa","rt") as f:
    for line in f:
        if re.match(">",line):
            line = f.next()
        else:
            line = line.rstrip("\n")
            string = string + line
print (string)
Could anyone help fix the script, or suggest a better way to do it? Thanks!
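For reference, here is a minimal sketch of one possible fix. The extra skipped line seems to come from `f.next()`: it reads the line after the header from the file, and because that happens in the `if` branch the line is never appended. Using `continue` instead skips only the header. The function name `concat_sequence` is hypothetical, and the sketch assumes that the `#test.fa` line above is only a label and not part of the file.

```python
def concat_sequence(lines):
    """Join all lines that do not start with ">" into one string.

    `lines` can be any iterable of text lines, for example an open file.
    """
    parts = []
    for line in lines:
        # Skip the header line itself, without consuming the next line.
        if line.startswith(">"):
            continue
        # Drop the trailing newline (and any stray "\r") before joining.
        parts.append(line.rstrip("\r\n"))
    # Joining once at the end avoids repeated string concatenation.
    return "".join(parts)
```

With the file from the question, this could be called as `with open("test.fa", "rt") as f: print(concat_sequence(f))`. The plain `startswith` check also makes the `re` import unnecessary.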