Check for presence of a sliced list in Python

Question

I want to write a function that determines if a sublist exists in a larger list.

list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]

#Should return true
sublistExists(list1, [1,1,1])

#Should return false
sublistExists(list2, [1,1,1])

Is there a Python function that can do this?

Ah - I see the gotcha here. You are not looking for something being a subset of the other set - but that it must match in order a slice of the list. — Danny Staple, Nov 26 '13 at 14:50
See also answer using KMP (Knuth-Morris-Pratt) algorithm: [python - Best way to determine if a sequence is in another sequence? - Stack Overflow](https://stackoverflow.com/questions/425604/best-way-to-determine-if-a-sequence-is-in-another-sequence) — user202729, Dec 05 '21 at 18:08

score 47 · Answer 1 · edited Mar 05 '23 at 14:42

Let's get a bit functional, shall we? :)

def contains_sublist(lst, sublst):
    n = len(sublst)
    return any((sublst == lst[i:i+n]) for i in range(len(lst)-n+1))

Note that any() stops at the first match of sublst within lst, or returns False if there is no match, after O(m*n) operations in the worst case.
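As a quick check, here is the function above run against the lists from the question, plus two edge cases (an empty sublist, and a sublist longer than the list). This is only a usage sketch; the edge-case inputs are illustrative and not part of the original question.

```python
def contains_sublist(lst, sublst):
    n = len(sublst)
    # any() short-circuits: it stops slicing as soon as one window matches
    return any(sublst == lst[i:i+n] for i in range(len(lst) - n + 1))

list1 = [1, 0, 1, 1, 1, 0, 0]
list2 = [1, 0, 1, 0, 1, 0, 1]

print(contains_sublist(list1, [1, 1, 1]))  # True
print(contains_sublist(list2, [1, 1, 1]))  # False
print(contains_sublist(list1, []))         # True: the empty slice matches at i = 0
print(contains_sublist([1], [1, 1]))       # False: the range is empty
```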

edited Mar 05 '23 at 14:42

Błażej Michalik

answered Jul 23 '10 at 02:10

Nas Banov

Mark Byers · Accepted Answer · 2010-07-23T07:41:48.883

22

If you are sure that your inputs will only contain the single digits 0 and 1 then you can convert to strings:

def sublistExists(list1, list2):
    return ''.join(map(str, list2)) in ''.join(map(str, list1))

This creates two strings so it is not the most efficient solution but since it takes advantage of the optimized string searching algorithm in Python it's probably good enough for most purposes.

If efficiency is very important you can look at the Boyer-Moore string searching algorithm, adapted to work on lists.

A naive search has O(n*m) worst case but can be suitable if you cannot use the converting to string trick and you don't need to worry about performance.
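If the items are not single digits, the string trick can still be made to work by joining with a separator and also wrapping both ends in it, so that items only match on whole-item boundaries. The sketch below is one possible way to do that; the name `sublist_exists_str` and the `sep` parameter are made up for illustration, and it assumes that no item's `str()` form contains the separator.

```python
def sublist_exists_str(haystack, needle, sep=","):
    # Assumption: str(item) never contains `sep`, and items with the same
    # str() form (e.g. 1 and "1") may be treated as equal.
    if not needle:
        return True  # treat the empty list as a sublist of anything

    def encode(items):
        # Wrap every item in separators: [1, 0] -> ",1,0,"
        return sep + sep.join(map(str, items)) + sep

    return encode(needle) in encode(haystack)

print(sublist_exists_str([10], [1, 0]))         # False
print(sublist_exists_str([11, 0], [1, 0]))      # False
print(sublist_exists_str([1, 10, 0], [10, 0]))  # True
```

Without the leading separator, `[1, 0]` would be found inside `[11, 0]`, because "1,0" is a substring of "11,0".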

edited Jul 23 '10 at 07:41

answered Jul 22 '10 at 21:22

Mark Byers


`--` : the code is seriously broken, try `sublistExists([10], [1,0])` == True?! – Nas Banov Jul 23 '10 at 01:46
@Nas Banov: That's why Mark wrote in his first sentence "If you are sure that your inputs will only contain single characters '0' and '1'..." – Tim Pietzcker Jul 23 '10 at 06:03
@Tim: But the inputs don't contain "single characters '0' and '1'", mind you! The example shown contains only the numbers `0` and `1` (or "digits" if you will). :) Besides, his code is slightly more broad - it will handle correct any list of 1-chars or any list of 1-digit numbers (but not both). And it's fairly easy to fix by introducing separator to `str.join` – Nas Banov Jul 23 '10 at 07:33
I agree with you about Boyer-Moore. I've posted an answer with an implementation. – Mar 02 '17 at 23:14
@Nas Banov Just to expand on/reiterate your comment, if you replace "" with "," it works. So you do need to determine a separator based on the data, but you don't necessarily have to restrict inputs to single characters. – Chris Coffee Nov 05 '22 at 02:21

score 4 · Answer 3 · answered Jul 23 '10 at 01:14

No function that I know of, but it is short to write:

def sublistExists(list, sublist):
    for i in range(len(list)-len(sublist)+1):
        if sublist == list[i:i+len(sublist)]:
            return True #return position (i) if you wish
    return False #or -1

As Mark noted, this is not the most efficient search (it's O(n*m)). This problem can be approached in much the same way as string searching.
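To illustrate the string-searching connection, here is a minimal sketch of the Knuth-Morris-Pratt algorithm (mentioned in a comment on the question) adapted to lists. The function name `kmp_find` is made up for this example; it returns the index of the first match or -1, and runs in O(n + m) comparisons because it never steps backwards in the haystack.

```python
def kmp_find(haystack, needle):
    """Return the index of the first occurrence of needle in haystack, or -1."""
    if not needle:
        return 0

    # failure[i] = length of the longest proper prefix of needle[:i+1]
    # that is also a suffix of needle[:i+1].
    failure = [0] * len(needle)
    k = 0
    for i in range(1, len(needle)):
        while k and needle[i] != needle[k]:
            k = failure[k - 1]
        if needle[i] == needle[k]:
            k += 1
        failure[i] = k

    # Scan the haystack once; k is the number of needle items matched so far.
    k = 0
    for i, item in enumerate(haystack):
        while k and item != needle[k]:
            k = failure[k - 1]  # fall back to the next shorter partial match
        if item == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return -1

print(kmp_find([1, 0, 1, 1, 1, 0, 0], [1, 1, 1]))  # 2
print(kmp_find([1, 0, 1, 0, 1, 0, 1], [1, 1, 1]))  # -1
```

A boolean version is then just `kmp_find(lst, sub) != -1`.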

answered Jul 23 '10 at 01:14

sas4740

You should probably avoid using the keyword `list` as a variable name. – Luci Sep 26 '17 at 13:31

score 4 · Answer 4 · answered Mar 26 '19 at 16:50

My favourite simple solution is the following (however, it's brute force, so I don't recommend it on huge data):

>>> l1 = ['z','a','b','c']
>>> l2 = ['a','b']
>>> any(l1[i:i+len(l2)] == l2 for i in range(len(l1)))
True

The code above creates every slice of l1 that starts at a valid index and has (at most) the length of l2, and compares them with l2 one by one.

Detailed explanation

Read this explanation only if you don't understand how it works (and you want to know); otherwise there is no need to read it.

Firstly, this is how you can iterate over indexes of l1 items:

>>> [i for i in range(len(l1))]
[0, 1, 2, 3]

So, because i represents the index of an item in l1, you can use it to show the actual item instead of the index number:

>>> [l1[i] for i in range(len(l1))]
['z', 'a', 'b', 'c']

Then create slices (something like a subselection of items from the list) from l1 with the length of l2 (2 here):

>>> [l1[i:i+len(l2)] for i in range(len(l1))]
[['z', 'a'], ['a', 'b'], ['b', 'c'], ['c']] #last one is shorter, because there is no next item.

Now you can compare each slice with l2 and see that the second one matches:

>>> [l1[i:i+len(l2)] == l2 for i in range(len(l1))]
[False, True, False, False] #notice that the second one is that matching one

Finally, with the function named any, you can check whether at least one of the booleans is True:

>>> any(l1[i:i+len(l2)] == l2 for i in range(len(l1)))
True
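The walkthrough shows that the last slices come out shorter than l2 and can never match. A possible variation, sketched below, stops the range early so that only full-length slices are built, and returns the position of the match instead of a boolean. The name `find_sublist` and the -1 "not found" convention are assumptions made for this example.

```python
def find_sublist(l1, l2):
    n = len(l2)
    # Only start positions where a slice of length n still fits inside l1
    starts = range(len(l1) - n + 1)
    # next() returns the first matching index, or the default -1 if none match
    return next((i for i in starts if l1[i:i+n] == l2), -1)

l1 = ['z', 'a', 'b', 'c']
print(find_sublist(l1, ['a', 'b']))  # 1
print(find_sublist(l1, ['b', 'c']))  # 2
print(find_sublist(l1, ['c', 'd']))  # -1
```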

score 3 · Answer 5 · edited Jan 29 '23 at 20:21

The efficient way to do this is to use the Boyer-Moore algorithm, as Mark Byers suggests. I have done it already here: Boyer-Moore search of a list for a sub-list in Python, but will paste the code here. It's based on the Wikipedia article.

The search() function returns the index of the sub-list being searched for, or -1 on failure.

def search(haystack, needle):
    """
    Search list `haystack` for sublist `needle`.
    """
    if len(needle) == 0:
        return 0
    char_table = make_char_table(needle)
    offset_table = make_offset_table(needle)
    i = len(needle) - 1
    while i < len(haystack):
        j = len(needle) - 1
        while needle[j] == haystack[i]:
            if j == 0:
                return i
            i -= 1
            j -= 1
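        # Items that never occur in the needle shift by the full needle length;
        # without this default, .get() returns None and max() fails on Python 3.
        i += max(offset_table[len(needle) - 1 - j], char_table.get(haystack[i], len(needle)))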
    return -1

    
def make_char_table(needle):
    """
    Makes the jump table based on the mismatched character information.
    """
    table = {}
    for i in range(len(needle) - 1):
        table[needle[i]] = len(needle) - 1 - i
    return table
    
def make_offset_table(needle):
    """
    Makes the jump table based on the scan offset in which mismatch occurs.
    """
    table = []
    last_prefix_position = len(needle)
    for i in reversed(range(len(needle))):
        if is_prefix(needle, i + 1):
            last_prefix_position = i + 1
        table.append(last_prefix_position - i + len(needle) - 1)
    for i in range(len(needle) - 1):
        slen = suffix_length(needle, i)
        table[slen] = len(needle) - 1 - i + slen
    return table
    
def is_prefix(needle, p):
    """
    Is needle[p:end] a prefix of needle?
    """
    j = 0
    for i in range(p, len(needle)):
        if needle[i] != needle[j]:
            return 0
        j += 1    
    return 1
    
def suffix_length(needle, p):
    """
    Returns the maximum length of the substring ending at p that is a suffix.
    """
    length = 0
    j = len(needle) - 1
    for i in reversed(range(p + 1)):
        if needle[i] == needle[j]:
            length += 1
        else:
            break
        j -= 1
    return length

Here is the example from the question:

def main():
    list1 = [1,0,1,1,1,0,0]
    list2 = [1,0,1,0,1,0,1]
    index = search(list1, [1, 1, 1])
    print(index)
    index = search(list2, [1, 1, 1])
    print(index)

if __name__ == '__main__':
    main()

Output:

2
-1

score 1 · Answer 6 · answered Jul 23 '10 at 04:25

Here is a way that will work for simple lists and is slightly less fragile than Mark's:

def sublistExists(haystack, needle):
    def munge(s):
        return ", "+format(str(s)[1:-1])+","
    return munge(needle) in munge(haystack)

answered Jul 23 '10 at 04:25

John La Rooy

@e1i45, have _you_ tried it? What happens when the items in s aren't strings? – John La Rooy Feb 21 '13 at 12:43
DELIMITER.join(str(x) for x in xs) might work. But maybe it is slower than format? – e1i45 Mar 11 '13 at 10:46

SuperNova · Answer 7 · 2016-04-18T13:01:43.670

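def sublistExists(x, y):
    # Candidate start positions: every index where x matches the first item of y
    occ = [i for i, a in enumerate(x) if a == y[0]]
    for b in occ:
        if x[b:b+len(y)] == y:
            print('YES-- SUBLIST at : ', b)
            return True
    # No candidate position matched (or there were no candidates at all)
    print('NO SUBLIST')
    return False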

list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]

#should return True
sublistExists(list1, [1,1,1])

#Should return False
sublistExists(list2, [1,1,1])

score 0 · Answer 8 · answered Aug 11 '16 at 01:31

Might as well throw in a recursive version of @NasBanov's solution

def foo(sub, lst):
    '''Checks if sub is in lst.

    Expects both arguments to be lists
    '''
    if len(lst) < len(sub):
        return False
    return sub == lst[:len(sub)] or foo(sub, lst[1:])

answered Aug 11 '16 at 01:31

wwii

Recursion... Can cause a stack overflow on long lists – Tigran Saluev Nov 29 '16 at 14:55
@TigranSaluev - stack overflow or maximum recursion depth or RecursionError? – wwii Nov 29 '16 at 18:37
RuntimeError: maximum recursion depth exceeded in cmp – Tigran Saluev Nov 30 '16 at 09:57
"Might as well"--hmm, why, exactly? This recursive approach seems to have no redeeming qualities compared to the iterative version. It seems longer, less efficient, more error-prone, and less understandable. (I have nothing against recursion in general.) – Joshua P. Swanson Mar 06 '17 at 00:50
@wwii: Alrighty :) I was wondering if you had a particular reason to do it recursively, but it seems it was just because it could be done. Given the recursion depth issue in particular, it does seem like a bad solution. – Joshua P. Swanson Mar 07 '17 at 05:37

score 0 · Answer 9 · answered Sep 15 '17 at 14:15

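def sublist(l1, l2):
    # True if l1 occurs as a contiguous run inside l2
    if len(l1) > len(l2):
        return False
    for i in range(len(l2) - len(l1) + 1):
        for j in range(len(l1)):
            if l2[i + j] != l1[j]:
                break  # mismatch at this start position, try the next one
        else:
            return True  # inner loop finished without a mismatch
    return False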

answered Sep 15 '17 at 14:15

Ashutosh K Singh

score -2 · Answer 10 · answered Mar 02 '21 at 04:26

I know this might not be quite relevant to the original question, but it might be a very elegant one-line solution for someone else if the order of items in both lists doesn't matter. The result below is True if all List1 elements are in List2 (regardless of order and of duplicates). If the order matters, don't use this solution.

List1 = [10, 20, 30]
List2 = [10, 20, 30, 40]
result = set(List1).intersection(set(List2)) == set(List1)
print(result)

Output

True
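The same membership check can be written more directly with the built-in set methods. The short sketch below also shows why it is not a sublist test: order and repeated items are ignored. The extra input `[30, 10, 10]` is only an illustration.

```python
List1 = [10, 20, 30]
List2 = [10, 20, 30, 40]

# issubset() accepts any iterable, so List2 does not need converting first
print(set(List1).issubset(List2))          # True
print(set(List1) <= set(List2))            # True, operator form

# Order and duplicates are ignored, so this is not a sublist check:
print(set([30, 10, 10]).issubset(List2))   # True
```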

score -4 · Answer 11 · answered Jul 23 '10 at 06:00

If I am understanding this correctly, you have a larger list, like:

list_A= ['john', 'jeff', 'dave', 'shane', 'tim']

Then there are other lists:

list_B= ['sean', 'bill', 'james']

list_C= ['cole', 'wayne', 'jake', 'moose']

Then I append lists B and C to list A:

list_A.append(list_B)

list_A.append(list_C)

So when I print list_A:

print (list_A)

I get the following output:

['john', 'jeff', 'dave', 'shane', 'tim', ['sean', 'bill', 'james'], ['cole', 'wayne', 'jake', 'moose']]

Now I want to check whether a sublist exists:

for value in list_A:
    # isinstance() tests the type directly, with no need to parse str(type(value))
    if isinstance(value, list):
        print("True")
    else:
        print("False")

This prints 'True' for every item of the larger list that is itself a list, and 'False' for every other item.
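If a single yes/no answer is wanted for "does the larger list contain any nested list", the per-item check can be collapsed with any(). This is a small sketch using a shortened version of the lists above; the variable name `has_nested_list` is made up for the example. Note that it tests for nested lists, which is a different thing from the contiguous-slice test asked about in the question.

```python
list_A = ['john', 'jeff', 'dave', 'shane', 'tim']
list_A.append(['sean', 'bill', 'james'])  # append() nests the whole list as one item

# True as soon as one item of list_A is itself a list
has_nested_list = any(isinstance(value, list) for value in list_A)
print(has_nested_list)  # True
```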
