1

If I have a list like this

my_list = [1, 7, 3, 4, 2, 9, 6, 5, 8]

Also I have other lists of length less than the length of my_list. For instance, let us say I have two other lists as follows:

my_other_list_1 = [1, 7, 3, 4, 2]
my_other_list_2 = [1, 7, 4, 9, 8]

All lists have distinct elements and the other lists have elements from the original list.

My problem is to find the list (either my_other_list_1 or my_other_list_2) that has the longest match in my_list. First, I would like to ask what is this problem called? Then, I prefer to get the length of the longest match of each list. How do I do this?

In the example, I would return my_other_list_1 because it has a match of length 5 since [1, 7, 3, 4, 2] is already in my_list. Also, I would return 2 for my_other_list_2 because there is a match of length 2 in it, i.e., [1, 7].

Summarizing, if I have an algorithm A and the inputs are my_list, my_other_list_1 and my_other_list_2, the algorithm should return

my_other_list_1 has match 5
my_other_list_2 has match 2

NB. I think this is a problem called longest common subsequences (LCS) problem but as I understand in LCS problem the subsequences do not need to be consecutive.

1-approximation
  • 321
  • 1
  • 4
  • 13
  • 1
    If you don't have the requirement of coding this yourself, you could use a `difflib.SequenceMatcher` -- they even have a `find_longest_match` method :-). – mgilson Jun 08 '16 at 15:32
  • 2
    This problem is called the longest common substring problem. I've also heard it referred to as the longest contiguous subsequence problem. You can find more information [here](https://en.wikipedia.org/wiki/Longest_common_substring_problem), and the code for this problem is the second part of [this answer](http://stackoverflow.com/a/24547864/5377941) – Greg Jun 08 '16 at 15:45

2 Answers2

1
def longest_match(main_list, *other_lists, **options):
    try:
        if options:
            ordered = options['ordered']
        else:
            raise Exception
    except:
        ordered = False

    best_match = 0
    longest_match = 0
    for index, list in enumerate(other_lists):
        current_match = 0
        if not ordered:
            while True:
                if list[:current_match] == main_list[:current_match]:
                    current_match += 1
                else:
                    if current_match > longest_match:
                        longest_match = current_match
                        best_match = index
                    break
        else:
            for i, letter in enumerate(list):
                current_match = i
                if letter == main_list[0]:
                    current_match += 1
                    while True:
                        if list[i:current_match] == main_list[:current_match]:
                            current_match += 1
                        else:
                            if current_match > longest_match:
                                longest_match = current_match
                                best_match = index
                            break
    return other_lists[best_match]

print longest_match(my_list, my_other_list_1, my_other_list_2, ordered=True) #kwarg odered=True will allow for mid list comprehension

I'm on my phone and havnt tested the following, but it should do the trick!

TheLazyScripter
  • 2,541
  • 1
  • 10
  • 19
1

I'd be more than happy to help bud! Here you go!

def matcher_algorithm_large(original_list, **kwargs):
match_dict = {}
for i in kwargs:
    counter=0
    for j in kwargs[i]:
        if j == original_list[kwargs[i].index(j)]:
            counter += 1
        else:
            break
    match_dict[i]=counter
for key,value in match_dict.items():
    print('{} has match {}'.format(key,value))

This so basically you can plug them in like so:

my_list = [1, 7, 3, 4, 2, 9, 6, 5, 8]
my_other_list_1 = [1, 7, 3, 4, 2]
my_other_list_2 = [1, 7, 4, 9, 8]

matcher_algorithm_large(my_list,my_other_list_1=my_other_list_1,my_other_list_2=my_other_list_2)

This will give you the following output:

my_other_list_1 has match 5
my_other_list_2 has match 2

I hope this helps you out. I've done it so you can pass in as many lists as you would like. The inputs could be made a lot cleaner if you had a dict of lists, but that up to you if you'd like to have it that way. I'd be more than happy to script that. :)

  • this is assuming that you wish to maintain the names of the lists in the outputs as opposed to that other answer – Philip Kalinda Jun 08 '16 at 16:25
  • 1
    this doesn't find matches that begin in the middle of the original_list. – Greg Jun 08 '16 at 16:49
  • Oh, I didn't think that was a requirement based on the question description. Slightly more challenging problem but I can give it a go should it be required :) – Philip Kalinda Jun 09 '16 at 08:38