A simple approach is to slide the query 'LKLD' along the subject sequence 'LELFLKEF' and calculate the Hamming distance at each alignment. There is a sample implementation of the Hamming distance calculation in the linked Wikipedia article. Once you have that, your code would do something like:
# Hamming distance: number of positions at which two equal-length strings differ
d = lambda s1, s2: sum(e1 != e2 for e1, e2 in zip(s1, s2))
subject = 'LELFLKEF'
query = 'LKLD'
# slide a query-sized window along the subject
for i in range(len(subject) - len(query) + 1):
    aligned_subject = subject[i:i + len(query)]
    if d(aligned_subject, query) == 2:
        print(aligned_subject)
Output:
LELF
LFLK
LKEF
Note that this is a naive solution with plenty of room for optimization: it compares every window in full, so it does roughly len(subject) * len(query) character comparisons. It will work fine for simple cases and reasonably small strings. A condensed version that produces a list:
s = 'LELFLKEF'
q = 'LKLD'
d = lambda s1, s2: sum(e1 != e2 for e1, e2 in zip(s1, s2))
[s[i:i+len(q)] for i in range(len(s)-len(q)+1) if d(s[i:i+len(q)], q) == 2]
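One of the optimizations hinted at above is to stop counting as soon as a window has too many mismatches. Here is a minimal sketch of that idea, assuming you also want the start offset of each hit. The names `hamming_within` and `find_matches` are made up for illustration and are not from any library:

```python
def hamming_within(s1, s2, limit):
    """Hamming distance of two equal-length strings, or None once it exceeds limit."""
    mismatches = 0
    for a, b in zip(s1, s2):
        if a != b:
            mismatches += 1
            if mismatches > limit:
                return None  # stop early: this window cannot match
    return mismatches


def find_matches(subject, query, distance):
    """Return (offset, window) pairs whose Hamming distance to query equals distance."""
    k = len(query)
    hits = []
    for i in range(len(subject) - k + 1):
        window = subject[i:i + k]
        if hamming_within(window, query, distance) == distance:
            hits.append((i, window))
    return hits


print(find_matches('LELFLKEF', 'LKLD', 2))
# [(0, 'LELF'), (2, 'LFLK'), (4, 'LKEF')]
```

The early exit only helps when most windows are poor matches; the worst case is still every character of every window.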
The for loop rolls through all possible ungapped alignments of your two strings:
0
LELFLKEF
||||
LKLD
1
LELFLKEF
 ||||
 LKLD
2
LELFLKEF
  ||||
  LKLD
3
LELFLKEF
   ||||
   LKLD
4
LELFLKEF
    ||||
    LKLD
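If you want to see these alignments for your own strings, something like the following sketch would print them. It differs from the diagrams above in one labeled way: it marks matching columns with '|' and mismatching columns with '.', and it reports the distance for each offset. `render_alignments` is a hypothetical helper name:

```python
def render_alignments(subject, query):
    """Build one text block per ungapped alignment of query against subject."""
    lines = []
    for i in range(len(subject) - len(query) + 1):
        window = subject[i:i + len(query)]
        # '|' marks a matching column, '.' a mismatching one
        marks = ''.join('|' if a == b else '.' for a, b in zip(window, query))
        lines.append(f'offset {i}, distance {marks.count(".")}')
        lines.append(subject)
        lines.append(' ' * i + marks)   # shift the marks under the window
        lines.append(' ' * i + query)   # shift the query to the same offset
    return lines


print('\n'.join(render_alignments('LELFLKEF', 'LKLD')))
```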
There are also many existing implementations for aligning biological sequences, so you might also want to explore more involved techniques that handle things like gapped alignment and more sophisticated substitution models.
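As a small taste of what allowing gaps means, here is a sketch of the classic edit (Levenshtein) distance, which permits insertions and deletions as well as substitutions. This is a different measure from the Hamming distance used above, and it charges a flat cost of 1 per edit, which is much cruder than the substitution scoring that real sequence alignment tools use. `edit_distance` is just an illustrative name:

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit cost per edit)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


print(edit_distance('kitten', 'sitting'))  # 3
```

Unlike the sliding window approach, this works on strings of different lengths, at the cost of len(a) * len(b) steps per comparison.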