Finding different characters in strings of different lengths

Question

I have some code to find the differences between strings. At the moment it works for strings of the same length, but I am trying to get it to work for strings of different lengths. How can I do this?

I added a new variable, longest_seq, to try to work around this, but I'm not sure how to use it.

ref_seq = "pandabears"
map_seq = "pondabear"
longest_seq = map_seq

if len(ref_seq) > len(map_seq):
    longest_seq == ref_seq


for i in range(len(longest_seq)):
    if ref_seq[i] != map_seq[i]: 
        print i, ref_seq[i], map_seq[i]

Can you be more clear about what it means to find "the differences" between two strings? One common way of computing this is the edit distance: https://en.wikipedia.org/wiki/Edit_distance. Your current code looks like it is printing characters if they differ at a specific spot. — kingkupps, Sep 12 '19 at 22:47
@kingkupps Printing characters that differed at a specific spot is what I am trying to do. Apologies for confusion. — wilberox, Sep 12 '19 at 23:09

DjaouadNM · Answer 1 · 2019-09-12T23:17:25.660

2

For Python 2, you can use itertools.izip for this:

from itertools import izip

for i, j in izip(ref_seq, map_seq):
    if i != j: 
        print i, j

Output:

a o

In Python 3, you can use the built-in zip function:

for i, j in zip(ref_seq, map_seq):
    if i != j: 
        print(i, j)

zip also exists in Python 2, but itertools.izip is recommended there because it generates the tuples on demand (one new tuple per iteration) rather than building all of them at once. In Python 3, zip does what itertools.izip does in Python 2.
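
The loop in the question also printed the position of each mismatch, which the snippets above leave out. Here is a minimal Python 3 sketch of one way to get the index back by wrapping zip in enumerate. The helper name mismatches is made up for illustration, and the inputs are the strings from the question:

```python
def mismatches(ref_seq, map_seq):
    """Yield (index, ref_char, map_char) for positions where the strings differ.

    zip stops at the end of the shorter string, so any trailing
    characters of the longer string are not reported.
    """
    for index, (ref_char, map_char) in enumerate(zip(ref_seq, map_seq)):
        if ref_char != map_char:
            yield index, ref_char, map_char


# Inputs taken from the question.
for index, ref_char, map_char in mismatches("pandabears", "pondabear"):
    print(index, ref_char, map_char)
```

With these inputs it should print 1 a o. The trailing s in "pandabears" is skipped, because zip only pairs characters up to the length of the shorter string.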

edited Sep 12 '19 at 23:17

answered Sep 12 '19 at 22:45


how can I edit my code to make it work on different length strings though? – wilberox Sep 12 '19 at 23:10
@wilberox What should happen for different length strings? `zip` stops at the end of the shorter string, so it takes care of the problem of indexing past the shorter one. – DjaouadNM Sep 12 '19 at 23:16

score 0 · Accepted Answer · answered Sep 12 '19 at 23:19

Something like this should do the trick.

def different_characters(reference, target):
    # So we don't accidentally index the shorter string past its ending
    ending = min(len(reference), len(target))

    for i in range(ending):
        if reference[i] != target[i]:
            print(i, reference[i], target[i])

    longer_str = reference if len(reference) > len(target) else target
    for i in range(ending, len(longer_str)):
        print(i, longer_str[i], '<empty>')


different_characters('pandabears', 'pondabear')

Which would print:

1 a o
9 s <empty>
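
A possible alternative, sketched here for Python 3 only, is itertools.zip_longest from the standard library. It pads the shorter string with a fill value, so a single loop covers both the overlapping part and the leftover tail. The function name different_characters_padded and its fill parameter are made up for illustration:

```python
from itertools import zip_longest


def different_characters_padded(reference, target, fill="<empty>"):
    """Return (index, reference_char, target_char) for each differing position.

    Positions past the end of the shorter string are paired with `fill`.
    """
    differences = []
    pairs = zip_longest(reference, target, fillvalue=fill)
    for i, (ref_char, tgt_char) in enumerate(pairs):
        if ref_char != tgt_char:
            differences.append((i, ref_char, tgt_char))
    return differences


# Same inputs as the example above.
for row in different_characters_padded("pandabears", "pondabear"):
    print(*row)
```

For these inputs it should print the same two lines as the function above. One difference in behaviour: the fill value lands in the column of whichever string is shorter, so when target is the longer string the placeholder shows up in the reference column.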
