3

Let suppose we have a list i python of 35000 values like:

a = ['235', '2589', '25896']

and string to match:

str = '258963548'
str2 = '258954213'
str3 = '258659652'

Now I want to match these strings to list to find longest match. result for first string will be 25896 while second match will return 2589 and finally last string will failed to match.

I have used regular expression to solve this issue but it's taking long as I have around 50 sets of list and around 200 strings to match each list.

here is my code:

def Matchit(str,b = []):
    pattern = re.compile("(?P<mt>\S*)\S*\s+(?P=mt)")
    ln = 0
    res = -1
    for a in b:
        match = pattern.match(str + ' ' + a).group('mt')
        if (len(match)>ln):
            ln = len(match)
            if(ln>2):
               res = b[a]
   return res

Any help will be appreciated.

Blender
  • 289,723
  • 53
  • 439
  • 496
sharafjaffri
  • 2,134
  • 3
  • 30
  • 47

2 Answers2

5

You can build a trie from the lists. Then you should be able to find the longest match very quickly

enter image description here

John La Rooy
  • 295,403
  • 53
  • 369
  • 502
2

Why not sort the list into descending order and then the first match will be the longest. This way you don't have to run each iteration to completion.

JamesSwift
  • 863
  • 1
  • 7
  • 16
  • 1
    Even better if you build a pattern with them as alternatives, sorted in reverse order by length. – MRAB Aug 11 '12 at 19:50