The "distance between 2 Strings"

Question

Hi, this is from a previous test and I was wondering how to do it. Thank you. We can define the distance between two strings as the number of positions at which the corresponding characters are different. For example, the distance between the strings “find” and “fund” is 1 since only the

Write a program that will input two strings from the user and will determine and output the distance between the strings, i.e. how many characters, in corresponding positions of each string, are different. Your program must output only the distance as defined above.

“karolin” “kathrin” distance 3, positions 2,3,4
“karolin” “Kerstin” distance 3, positions 1,3,4
“1011101” “1001001” distance 2, positions 2,4
“2173896” “2233796” distance 3, positions 1,2,4
“ABCD” “ABCDEF” distance 2, positions 4,5
“efghi” “efg” distance 2, positions 3,4
“The quick!” “the Quick?” distance 1, position 9

Please edit the question and paste in your source code. No one here is eager to do your homework, or "previous test", work. — lit, May 02 '17 at 18:56
many possible different answers for this, many already exist on this site, possible duplicate: [Count letter differences of two strings](http://stackoverflow.com/questions/12226846/count-letter-differences-of-two-strings), an exact duplicate of [Counting differences between two strings](http://stackoverflow.com/questions/28423448/counting-differences-between-two-strings) which is used in the answer below — chickity china chinese chicken, May 02 '17 at 19:00
Consider using the builtin [difflib](https://docs.python.org/3/library/difflib.html) — tpvasconcelos, May 02 '17 at 19:20

Deem · Answer 1 · 2017-05-02T19:28:39.243

A simple solution would be as follows:

>>> s1 = "karolin"
>>> s2 = "kathrin"

>>> distance = sum([1 for x, y in zip(s1, s2) if x.lower() != y.lower()])
>>> distance
3

s1 and s2 define the strings that you will be working with. You can get these from the user with input() in Python 3 or raw_input() in Python 2.

Now for the bulk of the code in sum([1 for x, y in zip(s1, s2) if x.lower() != y.lower()]):

zip() will 'zip' the inputs together, pairing up corresponding elements from each. In this case, zip(s1, s2) would give us ('k', 'k'), ('a', 'a'), ('r', 't'), ('o', 'h'), ('l', 'r'), ('i', 'i'), ('n', 'n'). Note that zip() stops at the end of the shorter input, so if the strings differ in length, the extra characters of the longer one are not compared.

x, y in zip(s1, s2) will iterate through these pairings, assigning x to the first entry and y to the second.
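To see the pairing for yourself, here is a small sketch (assuming Python 3, where zip() returns a lazy iterator that has to be wrapped in list() before it can be printed). It also shows what happens with one of the unequal-length examples from the question.

```python
s1 = "karolin"
s2 = "kathrin"

# zip() is lazy in Python 3, so materialise it with list() to inspect the pairs.
pairs = list(zip(s1, s2))
print(pairs)

# With strings of unequal length, zip() stops at the end of the shorter one.
short_pairs = list(zip("ABCD", "ABCDEF"))
print(len(short_pairs))  # 'E' and 'F' are never paired with anything
```

So for inputs like “ABCD” and “ABCDEF”, the one-liner above only compares the first four positions.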

We then compare them with an if statement, if x.lower() != y.lower(). We use .lower() to convert the character to lower case; otherwise, 'k' and 'K' would not be considered equal. (You could also just call lower() on the strings immediately after they are entered.)

All of this inside the square brackets [..] is called a list comprehension. Since you now understand the subparts, as a whole, it is saying 'insert a 1 into a list for every x, y pair in the zipped list if x and y are not the same letter'.

This gives us a list of 1s, in this case [1, 1, 1]: one for every difference between corresponding characters. So, to get the total number of differences, we simply sum all of these values by calling sum() on the entire list.
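One caveat: the examples in the question (“ABCD” vs “ABCDEF” with distance 2, “efghi” vs “efg” with distance 2) suggest that every extra character in the longer string should count as a difference, which plain zip() will not do. A possible variation, sketched below, swaps zip() for itertools.zip_longest, which pads the shorter string with None. The function name string_distance is just a placeholder for illustration, and the "extra characters count as differences" rule is an assumption read off those examples.

```python
from itertools import zip_longest


def string_distance(a, b):
    """Count positions where a and b differ, ignoring case.

    Assumption: positions present in only one string count as differences.
    """
    return sum(
        1
        for x, y in zip_longest(a, b)  # the shorter string is padded with None
        if x is None or y is None or x.lower() != y.lower()
    )


print(string_distance("karolin", "kathrin"))
print(string_distance("ABCD", "ABCDEF"))
print(string_distance("The quick!", "the Quick?"))
```

To turn this into the program the question asks for, you could read both strings with input() and print only the result of string_distance().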

Do please click the green tick next to this answer if this helped you out! — Deem, May 03 '17 at 22:35

Jacob Lee · Answer 2 · 2020-11-06T16:49:29.100

You could use difflib.SequenceMatcher. It doesn't list out every index where the strings are identical, but gives the starting index and length of each identical block.

import difflib

str1 = "This is the first string"
str2 = "This is the second string"
s = difflib.SequenceMatcher(None, str1, str2)
for block in s.get_matching_blocks():
    i1, i2, length = block
    if i1 == i2 and length != 0:
        print(f'str1[{i1}] and str2[{i2}] for {length} elements')

Note: Remove the i1 == i2 condition from the if statement if the matching sequences don't have to start at the same index.
