I have a text file with three columns: a stop codon, a skipping context, and the sequence of 102 bases that immediately follows the skipping context. It looks a bit like this:
TAG GTTAGCT CTCGTGGTCCTCAAGGACTCAGAAACCAGGCTCGAGGCCTATCCCAGCAAGTGCTGCTCTGCTCTGCCCACCCTGGGTTCTGCATTCCTATGGGTGACCC
TAG GTTAGCT CTTATTCCCAGTGCCAGCTTTCTCTCCTCACATCCTCATAATGGATGCTGACTGTGTTGGGGGACAGAAGGGACTTGGCAGAGCTTTGCTCATGCCACTC
TAG GTTAGCT CTATTGTGTAACTGAGCAATTCTTTTCACTCTTGTGACTATCTCAGTCCTCTGCTGTTTTGTAACTGGTTTACCTCTATAGTTTATTTATTTTTAAATTA
etc...
I want to write a program that reads the 3rd column of this text file (i.e. the 102-base sequence) in chunks of three bases and picks out any stop codons in the sequence ('TAG', 'TGA' or 'TAA'). It should then create a list, table or something similar that tells me whether each sequence contains any of these stop codons and, if so, how many.
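To make the goal concrete, here is a rough sketch of the in-frame counting described above, applied to a single sequence string. It assumes the reading frame starts at the first base of the sequence and that any incomplete codon at the end is ignored; the helper name `count_inframe_stops` is made up for illustration.

```python
from collections import Counter

# The three stop codons to look for
STOP_CODONS = ("TAG", "TGA", "TAA")


def count_inframe_stops(seq):
    """Count in-frame stop codons in seq (hypothetical helper).

    Assumes the frame starts at the first base of seq.
    """
    seq = seq.strip().upper()
    # Step through the sequence three bases at a time,
    # skipping any incomplete codon at the very end
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(codon for codon in codons if codon in STOP_CODONS)
    # Report every stop codon, including those with a count of zero
    return {stop: counts[stop] for stop in STOP_CODONS}


# Codons here are ATG, TAG, CCC, TGA, TAG
print(count_inframe_stops("ATGTAGCCCTGATAG"))
# prints {'TAG': 2, 'TGA': 1, 'TAA': 0}
```

A stop codon that only appears out of frame (for example the TAG inside "ATAGCC", which is read as ATA, GCC) is not counted by this approach.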
So far I have done this to get Python to read only the 3rd column of that text file:
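# Open the input for reading and the output for writing
inFile = open('test stop codon plus 102.txt', 'r')
outFile = open('TAG plus 102 reading inframe.txt', 'w')
for line in inFile:
    # Columns are tab-separated; drop the trailing newline before splitting
    parts = line.rstrip('\n').split('\t')
    stopcodon = parts[0]
    skippingcontext = parts[1]
    plus102 = parts[2]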
But I'm not sure where to go next.
Thanks in advance!