I have been given the string
ATGCCAGGCTAGCTTATTTAA
and I have to find all substrings of it that start with ATG and end with any of TAA, TAG, or TGA.
Here is what I am doing:
import re

seq = "ATGCCAGGCTAGCTTATTTAA"
pattern = re.compile(r"(ATG[ACGT]*(TAG|TAA|TGA))")
for match in re.finditer(pattern, seq):
    coding = match.group(1)
    print(coding)
This code gives me the output:
ATGCCAGGCTAGCTTATTTAA
But the expected output is:
ATGCCAGGCTAGCTTATTTAA, ATGCCAGGCTAG
What should I change in my code?
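
For reference, the result I am after can be produced without a single greedy pattern. The version above returns only the longest match because [ACGT]* is greedy and re.finditer never reports overlapping matches. Below is a minimal sketch of one possible approach, assuming the reading frame does not matter (the requirement above does not mention it): collect every ATG position and every stop-codon position with zero-width lookaheads, then pair them up. The function name find_coding_substrings is only an illustrative name, not part of any library.

```python
import re

def find_coding_substrings(seq):
    """Return every substring that starts with ATG and ends with TAG, TAA or TGA.

    Assumption: reading frame is ignored; any stop codon after the ATG counts.
    """
    # Zero-width lookaheads match at every position without consuming text,
    # so no candidate start or stop position is skipped.
    starts = [m.start() for m in re.finditer(r"(?=ATG)", seq)]
    stop_ends = [m.start() + 3 for m in re.finditer(r"(?=TAG|TAA|TGA)", seq)]

    results = []
    for start in starts:
        for end in stop_ends:
            # The stop codon must begin at or after the end of the ATG codon.
            if end - 3 >= start + 3:
                results.append(seq[start:end])
    return results

seq = "ATGCCAGGCTAGCTTATTTAA"
for coding in find_coding_substrings(seq):
    print(coding)
```

For the string above this yields both ATGCCAGGCTAG and ATGCCAGGCTAGCTTATTTAA. If only in-frame stop codons should count, the condition inside the loop would additionally need to check that (end - start) is a multiple of 3.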