Match two strings (char to char) till the first non-match using python

Question

I am trying to match two strings sequentially till the first the non-matched character and then determine the percentage exact match. My code is like this:

def match(a, b):
    a, b = list(a), list(b)
    count = 0
    for i in range(len(a)):
        if (a[i]!= b[i]): break
        else: count = count + 1
    return count/len(a)

a = '354575368987943'
b = '354535368987000'
c = '354575368987000'
print(match(a,b)) # return 0.267
print(match(a,c)) # return 0.8

Is there any built-in method in python already which can do it faster ? For simplicity assume that both strings are of same length.

The closest thing to this is `difflib`'s [`SequenceMatcher.get_matching_blocks`](https://docs.python.org/2/library/difflib.html#difflib.SequenceMatcher.get_matching_blocks) : http://ideone.com/wlUVd9 — Ashwini Chaudhary, Jul 17 '17 at 21:42
strings can be manipulated as lists, there is no need to `list()` them. — TemporalWolf, Jul 17 '17 at 21:43
Best answer was already provided in the comment to https://stackoverflow.com/questions/18715688/find-common-substring-between-two-strings — AGN Gazer, Jul 17 '17 at 21:50
Also check out https://codereview.stackexchange.com/questions/124144/finding-the-fastest-common-prefix-of-2-strings-in-python — AGN Gazer, Jul 17 '17 at 21:53

score 7 · Accepted Answer · answered Jul 17 '17 at 21:33

7

There's no built-in to do the entire thing, but you can use a built-in for computing the common prefix:

import os
def match(a, b):
    common = os.path.commonprefix([a, b])
    return float(len(common))/len(a)

answered Jul 17 '17 at 21:33

bogdanciobanu

1,603
1
12
14

Excellent find! – AGN Gazer Jul 17 '17 at 21:40

score 4 · Answer 2 · answered Jul 17 '17 at 21:33

I don't think there is such build-in method.

But you can improve your implementation:

No need to wrap the inputs in list(...). Strings are indexable.
No need for count variable, i already carries the same meaning. And you can return immediately when you know the result.

Like this, with some doctests added as a bonus:

def match(a, b):
    """
    >>> match('354575368987943', '354535368987000')
    0.26666666666666666

    >>> match('354575368987943', '354575368987000')
    0.8

    >>> match('354575368987943', '354575368987943')
    1
    """
    for i in range(len(a)):
        if a[i] != b[i]:
            return i / len(a)

    return 1

Anton vBR · Answer 3 · 2017-07-17T21:57:57.827

0

alternative

(Just now saw that the answer below me thought of the same thing while I was editing the post)

def match(l1, l2):
    # find mismatch
    try:
        stop = next(i for i, (el1, el2) in enumerate(zip(l1, l2)) if el1 != el2)
        return stop/len(l1)
    except StopIteration:
        return 1

edited Jul 17 '17 at 21:57

answered Jul 17 '17 at 21:35

Anton vBR

18,287
5
40
46

Match two strings (char to char) till the first non-match using python

3 Answers3

alternative