I'm trying to match all of the items in one list (list1) with some items in another list (list2).

list1 = ['r','g','g',]
list2 = ['r','g','r','g','g']

For each successively longer prefix of list1 (first 'r', then 'r','g', then 'r','g','g'), I want to find all indices where that pattern shows up in list2.

Essentially, I'd hope the result to be something along the lines of:

"r is at indices 0,2 in list2"
"r,g is at indices 1,3 in list2" (I only want to find the last index in the pattern)
"r,g,g is at index 4 in list2"

As for things I've tried: Well... a lot.

The one that has gotten closest is this:

print([x for x in list1 if x not in set(list2)])

This doesn't work for me because it doesn't look for a group of objects; it only tests whether a single object in list1 is in list2.

I don't really need the answer to be pythonic or even that fast. As long as it works!

Any help is greatly appreciated! Thanks!

  • I suggest you take a look at this discussion. [How to compare lists](https://stackoverflow.com/questions/1388818/how-can-i-compare-two-lists-in-python-and-return-matches) – xertz Jan 27 '22 at 19:09

5 Answers

1

This is quite an interesting question. Python's list slicing lets you make these comparisons concisely. From a programming/maths perspective, what you are trying to do is compare sublists of a longer list with a pattern of your choosing. That can be implemented with:

# sample lists
pattern = [1,2,3]
mylist = [1,2,3,4,1,2,3,4,1,2,6,7,1,2,3]

# we want to check every window of mylist that is as long as pattern;
# the last one starts at len(mylist)-len(pattern), hence the +1
for i in range(len(mylist)-len(pattern)+1):
    # we generate a sublist of mylist, and we compare with list pattern
    if mylist[i:i+len(pattern)]==pattern:
        # we print the matches
        print(i)

This code will print 0, 4 and 12, the indexes where [1,2,3] starts in mylist.

thecoder
  • Hi, I think this approach doesn't work when the pattern is of length 1 and the pattern being searched for is at the last position of the list. For example, if `pattern = [3]`, this approach won't find the match at the last position of the list. In any case, the solution was useful for me, so thank you. – Alberto Feb 14 '23 at 08:47
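The boundary case raised in this comment can be checked with a small sketch. The function name `find_sublist` is hypothetical (it does not appear in the answer); the only point it illustrates is that the loop bound needs to be `len(seq) - len(pattern) + 1` for the final window to be compared.

```python
def find_sublist(pattern, seq):
    """Return every 0-based start index where pattern occurs in seq."""
    n = len(pattern)
    # + 1 so the last possible window, starting at len(seq) - n, is included
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern]


mylist = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 6, 7, 1, 2, 3]
print(find_sublist([1, 2, 3], mylist))  # [0, 4, 12]
print(find_sublist([3], mylist))        # [2, 6, 14]
```

Without the `+ 1`, the matches at 12 and 14 would be skipped for any pattern length, not only for patterns of length 1.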
0

Here's an attempt:

list1 = ['r','g','g']
list2 = ['r','g','r','g','g']

def inits(lst):
    for i in range(1, len(lst) + 1):
        yield lst[:i]

def rolling_windows(lst, length):
    for i in range(len(lst) - length + 1):
        yield lst[i:i+length]

for sublen, sublst in enumerate(inits(list1), start=1):
    inds = [ind for ind, roll
            in enumerate(rolling_windows(list2, sublen), start=sublen)
            if roll == sublst]
    print(f"{sublst} is in list2 at indices: {inds}")

# ['r'] is in list2 at indices: [1, 3]
# ['r', 'g'] is in list2 at indices: [2, 4]
# ['r', 'g', 'g'] is in list2 at indices: [5]

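Basically, it generates the relevant sublists using two functions (inits and rolling_windows) and then compares them. Because enumerate starts at sublen, the reported indices are the 1-based positions of the last element of each match.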
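If the 0-based indices from the question (0,2 / 1,3 / 4) are wanted instead, the same idea can be sketched as follows. The function name `last_indices_of_prefixes` and the dictionary return shape are assumptions made for illustration, not part of the answer above.

```python
list1 = ['r', 'g', 'g']
list2 = ['r', 'g', 'r', 'g', 'g']


def last_indices_of_prefixes(pattern, seq):
    """Map each prefix length of pattern to the 0-based indices in seq
    at which an occurrence of that prefix ends."""
    result = {}
    for n in range(1, len(pattern) + 1):
        prefix = pattern[:n]
        # a window starting at i ends at i + n - 1
        result[n] = [i + n - 1
                     for i in range(len(seq) - n + 1)
                     if seq[i:i + n] == prefix]
    return result


for n, ends in last_indices_of_prefixes(list1, list2).items():
    print(f"{list1[:n]} ends at indices {ends} in list2")

# ['r'] ends at indices [0, 2] in list2
# ['r', 'g'] ends at indices [1, 3] in list2
# ['r', 'g', 'g'] ends at indices [4] in list2
```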

j1-lee
0

A pure Python solution, which is going to be pretty slow for big lists:

def ind_of_sub_list_in_list(sub: list, main: list) -> list[int]:
    indices: list[int] = []
    for index_main in range(len(main) - len(sub) + 1):
        for index_sub in range(len(sub)):
            if main[index_main + index_sub] != sub[index_sub]:
                break
        else:  # `sub` fits completely in `main`
            indices.append(index_main)

    return indices


list1 = ["r", "g", "g"]
list2 = ["r", "g", "g", "r", "g", "g"]
print(ind_of_sub_list_in_list(sub=list1, main=list2))  # [0, 3]

This is a naive implementation with two nested for loops that compares the two lists entry by entry.

raui100
0

Convert the list you want to search (list2) to a string, then use a regex to find all matching substrings:

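import re

list1 = ['r','g','g']
list2 = ['r','g','r','g','g']

S1 = "".join(list2)  # join list2 into a single string
sub_str = ""
for letter in list1:
    sub_str += letter  # grow the pattern one element at a time
    # re.escape keeps characters such as "." or "*" literal
    r = re.finditer(re.escape(sub_str), S1)
    for i in r:
        print(sub_str, " found at ", i.start() + 1)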

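This gives you the 1-based starting index of each match. Note that re.finditer only reports non-overlapping matches, and that the position maps back to an index in list2 only when every element is a single character.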
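A possible variant, sketched under the assumption that every element is a single-character string: wrapping the pattern in a zero-width lookahead lets the regex report overlapping occurrences too, and `m.start() + len(prefix) - 1` gives the 0-based index of the last element, as asked in the question. The helper name `regex_end_indices` is hypothetical.

```python
import re

list1 = ['r', 'g', 'g']
list2 = ['r', 'g', 'r', 'g', 'g']

text = "".join(list2)  # assumes every element is a single character


def regex_end_indices(prefix, text):
    """Return the 0-based index of the last character of every
    (possibly overlapping) occurrence of prefix in text."""
    # (?=...) is a zero-width lookahead, so matches do not consume text
    pattern = re.compile(f"(?={re.escape(prefix)})")
    return [m.start() + len(prefix) - 1 for m in pattern.finditer(text)]


prefix = ""
for letter in list1:
    prefix += letter
    print(prefix, "ends at", regex_end_indices(prefix, text))

# r ends at [0, 2]
# rg ends at [1, 3]
# rgg ends at [4]
```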

0

If all entries in both lists are actually strings (single characters, as in the question), the solution can be simplified to:

list1 = ["r", "g", "g"]
list2 = ["r", "g", "g", "r", "g", "g"]
main = "".join(list2)
sub = "".join(list1)
indices = [index for index in range(len(main)) if main.startswith(sub, index)]
print(indices)  # [0, 3]

We join each list into a string and then use the startswith method to find all starting indices.
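One caveat worth illustrating with made-up data: once elements are longer than one character, joining loses the element boundaries, so the string search reports character offsets and can match where the lists do not. The lists below are hypothetical and chosen only to show the difference from an element-wise comparison.

```python
main_list = ["ab", "c", "a", "bc"]
sub_list = ["a", "bc"]

# String-based search: positions are character offsets, not list indices
main = "".join(main_list)  # "abcabc"
sub = "".join(sub_list)    # "abc"
string_hits = [i for i in range(len(main)) if main.startswith(sub, i)]
print(string_hits)  # [0, 3]

# Element-wise search: compares whole elements, so boundaries are respected
n = len(sub_list)
list_hits = [i for i in range(len(main_list) - n + 1)
             if main_list[i:i + n] == sub_list]
print(list_hits)  # [2]
```

Here the string search finds a spurious match at offset 0, while the only real occurrence of the sublist starts at list index 2.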

raui100