I am using Python 2.7 and working with a FASTA file containing the DNA sequence of the modern human Y chromosome. It is essentially one long string of about 20,000,000 characters, like ATCGACGATCACACG.... I want to convert this very long string into a list of triplets (three-character strings). For example, this string:
My_sequence_string= "ATGTACGTCATAG"
to this list:
My_sequence_list= ["ATG","TAC","GTC","ATA"]
This is my code:
str_Reading_Frame1=open("Ychromosome.fa", "r").read()
list_Reading_Frame1=[]
def str_to_list(list, str):
    if len(str)>2:
        list.append(str[:3])
        str_to_list(list, str[3:])
str_to_list(list_Reading_Frame1, str_Reading_Frame1)
But I get a memory limit error. I think the problem is that the function calls itself, but I don't know how to fix my code. I don't want to import modules like Biopython; I want to do it myself (with your help :-) ).
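For comparison, here is a minimal loop-based sketch of the same split that avoids recursion. It is only an illustration under assumptions: the names `split_into_triplets` and `sequence_from_fasta_lines` are hypothetical, it assumes the FASTA header line (starting with `>`) and the line breaks should not be part of the sequence, and it drops a trailing remainder of one or two characters, as in the example above.

```python
def sequence_from_fasta_lines(lines):
    """Join the sequence lines of a FASTA file into one string.

    Assumption: header lines start with '>' and are skipped; surrounding
    whitespace (including newlines) is stripped from every other line.
    """
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def split_into_triplets(sequence):
    """Return consecutive, non-overlapping 3-character chunks of sequence.

    A trailing remainder of 1 or 2 characters is dropped.
    """
    usable_length = len(sequence) - len(sequence) % 3
    # Step through the string by index instead of recursing on slices,
    # so no deep call stack and no repeated copies of the remaining string.
    return [sequence[i:i + 3] for i in range(0, usable_length, 3)]


print(split_into_triplets("ATGTACGTCATAG"))
# prints ['ATG', 'TAC', 'GTC', 'ATA']
```

With the real file, this could be used as `split_into_triplets(sequence_from_fasta_lines(open("Ychromosome.fa", "r")))`. On Python 2.7, replacing `range` with `xrange` would likely avoid building the full list of indices in memory.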