42

I need to check if list1 is a sublist of list2 (True; if every integer in list2 that is common with list1 is in the same order of indexes as in list1)

def sublist(lst1,lst2):
    for i in range(len(lst1)):
        if lst1[i] not in lst2:
            return False
        for j in range(len(lst2)):
            if (lst1[j] in lst2) and (lst2.index(lst1[i+1]) > lst2.index(lst1[i])):
                return True

Can anybody help me... why isn't this working?

eyllanesc
  • 235,170
  • 19
  • 170
  • 241
MMM
  • 521
  • 2
  • 5
  • 10
  • 1
    Can you give an example of when this should return True and when False? – L3viathan Mar 12 '16 at 22:38
  • Well, for one, you are returning `True` on the first hit in the second loop, when you would probably want to return `False` on the first mishit, and `True` when the loop finished. – user2390182 Mar 12 '16 at 22:41
  • 4
    Do duplicates in list1 have to occur as many times in list2? – user2390182 Mar 12 '16 at 22:43
  • 1
    If you want appearances of sublist elements in the superlist to be consecutive, the following one-liner does the job: `def sublist(sublst, lst):` `return sum([sublst == lst[i: i + len(sublst)] for i in range(len(lst) - len(sublst))]) > 0` – Daniel Moskovich Oct 25 '18 at 08:21
  • OP's definition is not what is usually referred to as a sublist, as only common elements are considered. – AllBecomesGood Jan 25 '22 at 14:12

24 Answers24

20

An easy way to check if all elements of a list are in other one is converting both to sets:

def sublist(lst1, lst2):
    return set(lst1) <= set(lst2)
19

i need to check if list1 is a sublist to list2 (True; if every integer in list2 that is common with list1 is in the same order of indexes as in list1)

Your code isn't working because as soon as a list element in ls1 doesn't occur in ls2 it will return False immediately.

This creates two lists that contain only the common elements (but in their original order) and then returns True when they are the same:

def sublist(lst1, lst2):
   ls1 = [element for element in lst1 if element in lst2]
   ls2 = [element for element in lst2 if element in lst1]
   return ls1 == ls2

edit: A memory-efficient variant:

def sublist(ls1, ls2):
    '''
    >>> sublist([], [1,2,3])
    True
    >>> sublist([1,2,3,4], [2,5,3])
    True
    >>> sublist([1,2,3,4], [0,3,2])
    False
    >>> sublist([1,2,3,4], [1,2,5,6,7,8,5,76,4,3])
    False
    '''
    def get_all_in(one, another):
        for element in one:
            if element in another:
                yield element

    for x1, x2 in zip(get_all_in(ls1, ls2), get_all_in(ls2, ls1)):
        if x1 != x2:
            return False

    return True
L3viathan
  • 26,748
  • 2
  • 58
  • 81
  • Inefficient in memory, especially for large lists, but at least it takes into account the numbers of each :) – Goodies Mar 12 '16 at 23:13
  • 2
    @Goodies Added a memory-efficient version :P – L3viathan Mar 13 '16 at 01:02
  • The variables in the `get_all_in` function should be `lst1` and `lst2`, but that's just a typo. I like it! Adding a supplemental answer. – Goodies Mar 13 '16 at 01:07
  • @Goodies I don't see the typo. – L3viathan Mar 13 '16 at 01:46
  • Maybe it's my bad. You changed the parameters from `lst1` to `ls1` in the first and second ones respectively. Apologies. – Goodies Mar 13 '16 at 01:47
  • 1
    I think sublist([1,2], [1,3,5,6,2]) is True when it should be False. How would you do it if order mattered for the first list to be considered a sublist? – Dave Thomas Jan 19 '18 at 16:23
  • The common integers (1, 2) are in the same order. – L3viathan Jan 19 '18 at 16:29
  • @L3viathan :lst2 = [4,8,9,33,44,67,123] lst1 = [8,33,44444] is false but the first code returns True! – Ed S Dec 03 '19 at 01:20
  • 2
    @EdS In fact, both return True, which is the correct answer given OP's admittedly weird definition of sublist (" if every integer in list2 that is common with list1 is in the same order of indexes as in list1"). – L3viathan Dec 03 '19 at 07:55
  • @L3viathan: How could I change the code to " if every integer in list2 that is common with list1 is in the same order of indexes as in list1" That is what I am trying to solve! Could you help me? – Ed S Dec 03 '19 at 13:42
  • Hi @L3viathan when I tested your solution on my dataset, it did not work, returning all as "True". However sublist3 suggested by "@Goodies" in the answers bellow worked. – El_1988 Dec 10 '20 at 11:17
  • 3
    This doesn't work if order matters and there's duplicate items in the superlist. Example: [1,2,3] should be sublist of [1,2,1,2,3]. – AllBecomesGood Jan 25 '22 at 13:45
9

Another easy way is to use list comprehension And use the built-in function all to verify that all items in list1 are contained in list2.

Example:

list1 = ['1','2']
list2 = ['1','2',3]

all(i in list2 for i in list1)
8

Another way that we do this is with collections.Counter. @L3viathan's second answer is the most efficient and fastest way to do it.

def sublist1(lst1, lst2):
    ls1 = [element for element in lst1 if element in lst2]
    ls2 = [element for element in lst2 if element in lst1]
    return ls1 == ls2


def sublist2(lst1, lst2):
    def get_all_in(one, another):
        for element in one:
            if element in another:
                yield element
    for x1, x2 in zip(get_all_in(lst1, lst2), get_all_in(lst2, lst1)):
        if x1 != x2:
            return False
    return True


def sublist3(lst1, lst2):
    from collections import Counter
    c1 = Counter(lst1)
    c2 = Counter(lst2)
    for item, count in c1.items():
        if count > c2[item]:
            return False
    return True


l1 = ["a", "b", "c", "c", "c", "d", "e"]
l2 = ["c", "a", "c", "b", "c", "c", "d", "d", "f", "e"]

s1 = lambda: sublist1(l1, l2)
s2 = lambda: sublist2(l1, l2)
s3 = lambda: sublist3(l1, l2)

from timeit import Timer
t1, t2, t3 = Timer(s1), Timer(s2), Timer(s3)
print(t1.timeit(number=10000))  # => 0.034193423241588035
print(t2.timeit(number=10000))  # => 0.012621842119714115
print(t3.timeit(number=10000))  # => 0.12714286673722477

His 2nd way is faster by an order of magnitude, but I wanted to mention the Counter variant because of its prevalence and usage outside of this scenario.

Patrick Artner
  • 50,409
  • 9
  • 43
  • 69
Goodies
  • 4,439
  • 3
  • 31
  • 57
6

b = sublist and a = list then search b by splitting a in lengths of b

e.g.

>>> a = [2,4,3,5,7] , b = [4,3]
>>> b in [a[i:len(b)+i] for i in xrange(len(a))]
True

>>> a = [2,4,3,5,7] , b = [4,10]
>>> b in [a[i:len(b)+i] for i in xrange(len(a))]

False
gehbiszumeis
  • 3,525
  • 4
  • 24
  • 41
codename47
  • 61
  • 1
  • 2
4

Memory efficient solution based on M. Morgan's answer. Takes into consideration that in order to be a sublist, the sublist must be found in the same order in the super list.

Variable k keeps track of the length of matched characters. When this matches the length of our sublist we can return true.

Variable s keeps track of the starting value. I keep track of this so that a test case like sublist(["1", "1", "2"],["0", "1", "1", "1", "2", "1", "2"]) with extraneous repeats of the first entry don't affect the current index reset when unmatched. Once the starting value changes s becomes irrelevant so this case does not fire in the middle of a pattern.

def sublist(sublist, lst):
    if not isinstance(sublist, list):
        raise ValueError("sublist must be a list")
    if not isinstance(lst, list):
        raise ValueError("lst must be a list")

    sublist_len = len(sublist)
    k=0
    s=None

    if (sublist_len > len(lst)):
        return False
    elif (sublist_len == 0):
        return True

    for x in lst:
        if x == sublist[k]:
            if (k == 0): s = x
            elif (x != s): s = None
            k += 1
            if k == sublist_len:
                return True
        elif k > 0 and sublist[k-1] != s:
            k = 0

    return False
Dave Thomas
  • 3,667
  • 2
  • 33
  • 41
4

what's wrong with the following:

def sublist(lst1, lst2):
return all([(x in lst2) for x in lst1])

will return true if for all items in lst1, each item exists in lst2

3
def sublist(l1,l2):
    s1=" ".join(str(i) for i in l1)
    s2=" ".join(str(i) for i in l2)
    if s1 in s2:
        return True
    else:
        return False
  • To answer this question, you cannot assume the lists are able to be converted into strings. If that was the case, you can use `l1 in l2` directly. – TimZaman Feb 05 '18 at 22:31
  • I don't believe in can be used directly . When I checked `["a","b","c"] in ["w","a","b","c"]` in Python 3.8 returns `false`. Perhaps it used to work, but not anymore. – Joseph Cottam Jan 13 '21 at 04:13
2

I found the above all found ['a','b','d'] to be a sublist of ['a','b','c','e','d'], which may not be true in spite of all of the elements of the sublist being present in the list. So to maintain the order and I came up with:

def sublist4(sublist,lst):
    #Define an temp array to populate 
    sub_list=[]
    comparable_sublist=[]
    #Define two constants to iterate in the while loop
    i=0
    k=0
    #Loop the length of lst
    while i < len(lst):
        #If the element is in the sublist append to temp array, 
        if k < len(sublist) and lst[i] == sublist[k]:
            sub_list.append(lst[i])
            #set a comparable array to the value of temp array
            comparable_sublist = sub_list
            k += 1
            #If the comparable array is the same as the sublist, break
            if len(comparable_sublist) == len(sublist):
                break

        #If the element is not in the sublist, reset temp array
        else:
            sub_list = []


        i += 1

    return comparable_sublist == sublist

Whilst this isn't very memory efficient, I find it works quite well with small lists.

M. Morgan
  • 93
  • 7
  • You need to consider also resetting the variable ```k```. ```sublist4([1,2,4],[1,2,7,4,1,2,4])``` fails. You are on the right track with taking order into consideration. Looking at the original question the asker did mention that the sublist had to be in order. – Dave Thomas Jan 19 '18 at 16:36
2
def has_ordered_intersection(xs, ys):
    common = {*xs} & {*ys}
    return all(x == y for x, y in zip((x for x in xs if x in common),
                                      (y for y in ys if y in common)))

This passes @L3viathan's doctest with fewer lines of code, using a similar strategy to the "memory-efficient variant", and with arguably greater overall efficiency.

>>> has_ordered_intersection([], [1,2,3])
True
>>> has_ordered_intersection([1,2,3,4], [2,5,3])
True
>>> has_ordered_intersection([1,2,3,4], [0,3,2])
False
>>> has_ordered_intersection([1,2,3,4], [1,2,5,6,7,8,5,76,4,3])
False

I used the intersection set instead of a generator because I think the extra memory is a good tradeoff compared to the time cost of shortcut-scanning the entire list per element (what in does to a list), especially if they are long.

I also don't think this should be called a "sublist" since xs is allowed to have elements that ys does not. The above relation is symmetric: swapping the arguments doesn't change the answer. A real ordered "sublist" would not be symmetric and look more like this

def is_ordered_sublist(xs, ys):
    xset = {*xs}
    return all(x == y for x, y in zip(xs, (y for y in ys if y in xset)))
gilch
  • 10,813
  • 1
  • 23
  • 28
  • I agree with the usage of the sets. Almost everything else posted in this thread has O(n^2) runtime. – Dennis Jan 26 '20 at 01:24
2

Find in l1 all indexes where the element match with the first element in l2, then I loop over this indexes list and for each element get the slice of l1 with the same length of l2. If the l1 slice is equal to l2, then l2 is a sublist of l1

Ex:

l1 = [1,2,3,2,1,1,3,3,4,5]

l2 = [2,1,1,3,3]

True

l1 = [1,2,3,2,1,3,3,4,5]

l2 = [2,1,1,3,3]

False

def is_sublist(l1, l2):
    index_list = [i for i, v in enumerate(l1) if v==l2[0]]
    for ii in index_list:
        l1_slice = l1[ii:ii+len(l2)]
        if l1_slice == l2:
            return True
    else:
        return False
  • Hi @Yunnosch. My goal is not to convince you or anyone. I just post my solution to a problem and I would expect that if it has errors this community helps to fix it or give a better solution. However the explanation to my solution is: Find in l1 all indexes where the element match with the first element in l2, then I loop over this indexes list and for each element get the slice of l1 with the same length of l2. If the l1 slice is equal to l2, then l2 is a sublist of l1. This is the core code for the problem, you can enrich it validating corner cases. Greetings – user13026656 Mar 08 '20 at 16:13
  • Hi @Yunnosch I don't know why you are using words like pseudo-altruistic. If my previous response bothered you in any way, I apologize, it was not my intention. I think the main objective of this platform is to share knowledges, not to judge or be rude about something. I consider myself an student in many ways and this is also my first post, what I mean is that I'm learning how to do it, in my second edition of the post was that I realize how to format the code. Surely I will consider your recommendations for future posts. Regards – user13026656 Mar 08 '20 at 21:23
  • The idea and goal of the platform *is* to share knowledge. You are right. The intended way to share is however to create a collection of useful Q/A pairs. Questions are only considered useful if they are on-topic. Answers are usually giving a more helpful impression with explanations. Without explanations, they give the wrong kind of impression and do NOT support knowledge sharing. I used "pseudo-altruistic" to describe the idea "I do not bother about earning reputation and just give my spare time to anybody freely." which I think you are having. Otherwise please accept my apology. – Yunnosch Mar 08 '20 at 21:39
  • And I do like your added explanation. Kind of short, but good enough to support learning on top of, or even instead of, blind reuse. Looking forward to more of your contribution. Have fun. – Yunnosch Mar 08 '20 at 21:44
1

Its easy with iterators.

>>> a = [0,1,2]
>>> b = [item for item in range(10)]
>>> b
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a
[0, 1, 2]
>>> [False, True][set([item in b for item in a]) == set([True])]
True
>>> a = [11, 12, 13]
>>> [False, True][set([item in b for item in a]) == set([True])]
False
Umair Bhatti
  • 117
  • 1
  • 4
1

Try this one!! sublist y is not missing the sequence of list x.

x= list

y= sublist

if ([i for i,j in enumerate(y) for k,l in enumerate(x) if i == k and j!=l]):
    print("True")
else:
    print("False")
1

I have come up with a short way to check for sublist

lst1=[1,2,5,6,8,3,2,34,3,4]
lst2=[1,2,3,4]


def sublist(lst1,lst2):
    for item in lst2:
        try:
           lst1.index(item)
        except ValueError:
           return False
     return True


 print(sublist(lst1,lst2))

what I have done is basically take 2 lists lst1 is the larger list and lst2 is the sublist that we are checking for. then I am taking each element of the lst2 and checking if it is in the lst1 by looking for its index

if it can't find even a single item it ..returns False

if all the items are covered it returns True

Mohan Ram
  • 11
  • 2
1

Another way is to move through all possible sublists and return once a match was found

def is_sublist(ys, xs):
    for i in range(len(xs) - len(ys)):
        if xs[i:i + len(ys)] == ys:
            return True
    return False
Jarno
  • 6,243
  • 3
  • 42
  • 57
1

This code attempts to find list1 in list2 by by scanning list2. It searches list2 for the first item in list1 and then checks to see if successive items in list1 also match at the location in list2 where the first item is found. If the the first 2/4 items in list1 match at a location in list2 but the 3rd does not then it will not spend time comparing the 4th.

def ordered_sublist(l1, l2):
    length = len(l1)
    for i in range(len(l2) - length + 1):
        if all(l1[j] == l2[j + i] for j in range(length)):
            return True
    return False
Ryan Codrai
  • 172
  • 8
1
def lis1(item,item1):
    sub_set = False
    for x in range(len(item)):
     if item[x] == item1[0]:
         n = 1
         while (n < len(item1) and (item[x + n] == item1[1])):
             n += 1
             if n == len(item1):
                 return True
    return False
a = [2,3,4,5,6]
b = [5,6]
c = [2,7,6]
print(lis1(a,b))
print(lis1(a,c))
1

Lists are data structures where the order of the elements matters.

I understand that this question explicitly specifies "same order of indexes" but in general, when you say "sublist", this is not necessarily the only restriction that applies. The relative position between each element may also be a restriction.

In my particular case list1=[1,2,3,4] list2=[1,2,4] and list2 is not a sublist of list1, but list3=[2,3,4] is a sublist of list1.

Just for the sake of completion, I am posting here my code to find sublists where the relative index of each element also should be preserved.

def is_sublist(list1, list2):
    first_index = -1
    for i in range(len(list1)):
        if first_index>=0:
            j = i-first_index
            if list1[i] != list2[j]:
                return False
            if j == len(list2)-1:
                return True
         elif list1[i] == list2[0]:
            first_index = i
    return False
print(is_sublist(['r1','r2','r3','r4','r6'],['r1','r2','r3']))
#>> True
print(is_sublist(['r1','r2','r3','r4','r6'],['r2','r3','r4']))
#>> True
print(is_sublist(['r1','r2','r3','r4','r6'],['r1','r2','r4']))
#>> False
Chocksmith
  • 1,188
  • 2
  • 12
  • 40
0
#list1 = ['1','2',"4"]############works
#list2 = ['1','2',3]

lst2 = [4,8,9,33,44,67,123]
lst1 = [8,33,7] # works!


def sublist(lst1, lst2):
    'checks whether list lst1 is a sublist of list lst2'
    index1 = 0  # lst1 index
    index2 = 0  # lst2 index

    # go through indexes of lst1
    while index1 < len(lst1):

        # search for item in lst2 matching item in lst1 at index index1
        while index2 < len(lst2) and lst1[index1] != lst2[index2]:
            index2 += 1

        # if we run out of items in lst2, lst1 is not a sublist of lst2
        if index2 == len(lst2):
            return False
        index1 += 1

    # every item in lst1 has been matched to an item in lst2, from left to right
    return True

print( sublist(lst1, lst2))
Ed S
  • 385
  • 8
  • 31
0

I needed to know if the first list is the sub-list of the second one. This order was important to me. I've tried some of the solutions, but they are too 'generic' for my needs. I also wanted to make sure, that both lists are not equal. Here's the solution.

def sublist(lst1, lst2):
    len1 = len(lst1)
    len2 = len(lst2)

    if len1 >= len2:
        return False

    for i in range(0, len1):
        if lst1[i] != lst2[i]:
            return False

    return True
ashrasmun
  • 490
  • 6
  • 11
0

How about just using an index that runs along list2 as we do the comparison?

def is_ordered_sublist(lst1: list, lst2: list) -> bool:
""" Checks if lst1 is an ordered sublist of lst2 """

    try:
        index = 0
        for item in lst1:
            location = lst2[index:].index(item)
            index += location + 1
        return True
    except ValueError:
        return False

Basically for each item in list1 it simply finds the first index which it appears in the second list. Thereafter it only needs to consider the remaining parts of list2. So the worse case complexity is simple O(len(list2)).

Yoskutik
  • 1,859
  • 2
  • 17
  • 43
0

I think this is the best way to solve this problem. This will check if list1 is a sublist of list2. We will assume that all elements are unique. If we have duplicate elements the following code will only ensure that each element of list1 is contained in list2. Hence, we do not take multiplicity into account.

list1 = [2, 3, 3, 4, 5, 9]
list2 = [1, 2, 3, 4, 5, 6, 7, 8, 9]

set(list1).issubset(set(list2))
MachineLearner
  • 805
  • 1
  • 8
  • 22
0

Definition:

  • List A is a sublist of list B if the exact sequence of elements of A exists in B.
  • An empty list is a sublist of any list.

The following function returns the index of the first occurrence of list_a in list_b, otherwise -1 is returned. For empty list_a, 0 is returned.

def sublist(list_a, list_b):
    if 0 == len(list_a):
        return 0

    if len(list_b) < len(list_a):
        return -1

    idx = -1
    while list_a[0] in list_b[idx+1:]:
        idx = list_b.index(list_a[0], idx + 1)
        if list_a == list_b[idx:idx+len(list_a)]:
            return idx

    return -1

Some tests:

>>> sublist([], [])
0
>>> sublist([], [1, 2, 3])
0
>>> sublist([3, 6], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
2
>>> sublist([3, 7, 9, 8], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
4
>>> sublist([3, 6, 3, 7, 9, 8, 0, 3, 6], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
2
>>> sublist([1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6, 4], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
-1
>>> sublist([3, 7, 4], [1, 2, 3, 6, 3, 7, 9, 8, 0, 3, 6])
-1
handsomeyang
  • 301
  • 2
  • 2
0

Here's a lazy-iteration, generic version for checking whether an iterable is a subsequence of another iterable:

from typing import Iterable

def is_subsequence(a: Iterable, b: Iterable) -> bool:
    b_iterator = iter(b)
    for x in a:
        for y in b_iterator:
            if y == x:
                break
        else:
            return False
    else:
        return True
Anakhand
  • 2,838
  • 1
  • 22
  • 50
  • This answer is wrong. Try `is_subsequence([1, 2, 3], [1, 1, 0, 2, 3])`. – DanielSank Jun 19 '22 at 07:43
  • @DanielSank Returns `True` when I try it, which is correct: indices 0/1, 3,4 of the larger list form the subsequence `[1,2,3]`. – Anakhand Jun 23 '22 at 06:03
  • Ah, I interpreted the question differently. I thought we would return `True` only when the sublist exists without "gaps" in the larger list. – DanielSank Jun 24 '22 at 04:11