9

I know how to find whether a list is a subset of another list. But I was wondering how do you make sure that a list is an ordered subset of another list. By ordered subset I mean that the elements in both the list are in same order

For example if I have the following lists

A = [1,2,3]
B = [1,2]
C = [2,1]

Here B is an ordered subset of A but C is not (although A has all elements of C)

brokendreams
  • 827
  • 2
  • 10
  • 29
  • 4
    You could start by writing some code... – jonrsharpe Jan 04 '16 at 20:23
  • In general, `B` is a sublist of `A` iff `B` is empty or `A` contains the first item of `B` at some position `pos` and the part of `B` past the first item is an ordered sublist of `A` past `pos`. Can you translate that into Python? :) – CiaPan Jan 04 '16 at 20:24
  • Possible duplicate: http://stackoverflow.com/questions/3313590/check-for-presence-of-a-sublist-in-python – Krumelur Jan 04 '16 at 20:26
  • 2
    Can there be gaps? `[1,2,4]` is a subset of `[1,2,3,4]` with elements in the same order, but it is not a sub-"string", so to speak, because of the 3. What should your solution do in that case? – jez Jan 04 '16 at 20:27
  • @Jez Yes there can be gaps – brokendreams Jan 04 '16 at 22:42

4 Answers4

5
def is_sub(sub, lst):
    ln = len(sub)
    for i in range(len(lst) - ln + 1):
        if all(sub[j] == lst[i+j] for j in range(ln)):
            return True
    return False

Output:

In [21]: A = [4, 3, 2, 1, 2, 3]

In [22]: B = [1, 2]

In [23]: C = [2, 1]

In [24]: is_sub(B, A)
Out[24]: True

In [25]: is_sub(C, A)
Out[25]: True
In [26]: B = [10,11,12,13,14,15,25]
In [27]: is_sub(B, A)
Out[27]: False
In [39]: A = [1, 2, 1, 2, 1, 2, 3]
In [40]: B = [1, 2, 3]

In [41]: is_sub(B, A)
Out[41]: True

We don't need to worry about the sub being longer than lst as is it is stop will be less that start so we just return False.

You can combine it with any:

def is_sub(s, l):
    ln = len(s)
    return any((all(s[j] == l[i + j] for j in range(ln))
                for i in range(len(l) - ln + 1)))

Depending on your use case slicing might be faster:

def is_sub(sub, lst):
    ln = len(sub)
    return any(lst[i: i + ln] == sub for i in range(len(sub) - ln + 1))

If the elements can have gaps then it is a lot simpler, similar to kfx's you can do it in a single pass or less:

def is_sub_with_gap(sub, lst):
    ln, j = len(sub), 0
    for ele in lst:
        if ele == sub[j]:
            j += 1
        if j == ln:
            return True
    return False

The fundamental difference being what you consider a subset:

In [6]: a,b = [1,2,3,4], [1,2,4]

In [7]: is_sub(b,a)
Out[7]: False

In [8]: is_sub_with_gap(b,a)
Out[8]: True
Padraic Cunningham
  • 176,452
  • 29
  • 245
  • 321
  • Both follow-ups (`any` and slicing) will work but both are unnecessarily O(n^2) – jez Jan 04 '16 at 22:53
  • @jez, if the lists are small and the function is called a lot slicing is faster that is why I said *Depending on your use case slicing might be faster*. The first follow up any is exactly equivalent to the first function. – Padraic Cunningham Jan 04 '16 at 23:02
  • But even slicing involves looping over every element of B **for every** element of A, or nearly every element—in contrast to kfx's solution for example, which does the job with a single pass through A. [Ok I should have said O(n*m) rather than n^2] – jez Jan 04 '16 at 23:15
  • @jez, fkx's answer is based on a completely different premise to what the OP had in their question, there were no gaps in the OP's examples which is why I answered using the logic I did and in turn fkx's answer would not have worked in those cases as it would have returned False positives – Padraic Cunningham Jan 04 '16 at 23:16
  • is_sub with slicing - len should be of **lst** not sub - range(len(*sub*) - ln + 1) – AKS Mar 07 '21 at 11:17
4

Use a simple loop to iterate through both lists in order. At the end, check if all elements of the potential subset list have been seen.

i,j = 0,0
while i < len(A) and j < len(B):
    if A[i] == B[j]:
        j += 1
    i += 1

has_as_subset = (j == len(B))
kfx
  • 8,136
  • 3
  • 28
  • 52
1

Here's a version that allows gaps (i.e. is_sublist([1,2,4],[1,2,3,4]) returns True) in common with most but not all of the other posted solutions; it also handles sublists that start anywhere in the parent list (again, in common with most but not all of the other answers); also, it returns False if the sub-list candidate contains repetition that the master list does not (i.e. [1,2,2] is not a sub-list of [1,2,3]): again, this property is shared with most but not all of the other answers. (As far as I can tell the only other solution to hit all of these targets is that of kfx.)

def is_sublist( sublst, lst ):
    for element in sublst:
        try: ind = lst.index( element )
        except ValueError: return False
        lst = lst[ ind+1: ]
    return True
jez
  • 14,867
  • 5
  • 37
  • 64
0
a_i = 0
for b in B:
    try:
        if b == A[a_i]:
            a_i += 1
    except IndexError:
        return False
return True

The basic idea is that you shouldn't exhaust A before you exhaust B.

Tyler Crompton
  • 12,284
  • 14
  • 65
  • 94
  • Note that this returns `True` even if B contains repeated items that A does not. For example, if `A=[1,2,3]` and `B=[1,2,2]`. This may or may not be what the OP wants (unclear from question) – jez Jan 04 '16 at 20:46
  • True. It's unclear if OP means a true subset or not. – Tyler Crompton Jan 04 '16 at 23:13