I need help with one problem in Rosalind here is the link to the problem for more information: http://rosalind.info/problems/lcsm/
The following is the sample input:
>Rosalind_1
GATTACA
>Rosalind_2
TAGACCA
>Rosalind_3
ATACA
Sample output is :
AC
and that's the basic information of the input and output:
Given: A collection of k (k≤100) DNA strings of length at most 1 kbp each in FASTA format. Return: A longest common substring of the collection. (If multiple solutions exist, you may return any single solution.)
The problem is asking to find the longest shared string that is present in all three sequences, and the common longest string found in this example is AC
.
Now, moving to the issue with programming it.
The idea I had - and tried - is to take the first string GATTACA
and check all the combination of symbols and check if it is present in the sequence two and three.
How can I do that to loop over all possible bases for all the three sequences and find the longest motif in python?