3

I have a list and want to check if a given sequence exist in the list or not.
For example the given sequence is 'a','h','z' and the list is l = ['a','b','h','g','z']. Given that in the list z comes after h after a, I need the code returns True value.

def check_for_a_h_z(seq):
    return ('a','h','z') in zip(seq, seq[1:], seq[2:])

The code return true if only 'a','h','z' are coming exactly after each other.

Engineero
  • 12,340
  • 5
  • 53
  • 75
user10316146
  • 115
  • 1
  • 6
  • 1
    I'm a fan of [this answer](https://stackoverflow.com/a/52709319/2336654) From that question referenced as the dup target. It works for strings, list of characters, list of strings. – piRSquared May 06 '19 at 21:14

4 Answers4

3

Back of the envelope brute force attempt at a generic solution for two sequences:

from typing import Sequence

def ordered_contains(s1: Sequence, s2: Sequence) -> bool:
  '''
  >>> ordered_contains('ahz', 'abhgz')
  True
  >>> ordered_contains('', 'asdlkjf')
  True
  >>> ordered_contains('lol', 'lofi')
  False
  '''
  i = 0
  for v in s1:
    try:
      i = s2.index(v, i)
    except ValueError:
      return False
  return True

kojiro
  • 74,557
  • 19
  • 143
  • 201
  • please run an example – user10316146 May 06 '19 at 20:22
  • @user10316146 OK, I've included [doctests](https://docs.python.org/3.7/library/doctest.html) to show the example you asked for. – kojiro May 06 '19 at 20:25
  • neat - O(n) at most? – Patrick Artner May 06 '19 at 20:30
  • 1
    @PatrickArtner I think if n=len(s1) and m=len(s2), then it's Θ(n+m). Please correct me if I'm wrong, I'm sloppy with algorithm complexity. – kojiro May 06 '19 at 20:33
  • 1
    I suppose `if len(s1) > len(s2): return False` would be a trivial optimization, but I'm not putting it in for SO aesthetic reasons. :) – kojiro May 06 '19 at 20:38
  • if nothing of s1 in in s2 it will search all of m, if there is anything of s1 in s2 it will search incrementally m also up to its maximum (if not found) or less (if found) - due to storing the last found index it is never re-searching over visited stuff in s2 - so I think it can not be more then m at most – Patrick Artner May 06 '19 at 20:38
  • 1
    @PatrickArtner The worst case is when all but the very last element of s1 is in s2, but the last one is not, right? In which case we iterate over all of s1 and also walk all of s2. For each step through s1, we scan at least one element of s2, up to len(s2). That's why I still think it's worst case Θ(n+m). – kojiro May 06 '19 at 20:45
  • You've convinced me. I agree that the worst case runtime is Θ(len(s2)) – kojiro May 06 '19 at 21:12
2

Here's a way to do it with recursion:

def list_sequence_recursive(test, lst):
    if len(test) == 1:
        return test[0] in lst
    else:
        if test[0] in lst:
            idx = lst.index(test[0]) + 1  # don't send the matched element
            return list_sequence_recursive(test[1:], lst[idx:])
        else:
            return False

test_sequence_recursive(['a', 'h', 'z'],
                        ['a','b','h','g','z'])
# True

Note we use lst.index(test[0]) + 1 for the next iteration's starting index so that we only send elements after the one that was matched. If you leave off the + 1, you would erroneously match your input list with, say ['a', 'a', 'h'] even though you only have one 'a'.

For your example, this would:

  1. find 'a' and then call itself with the arguments ['h', 'z'] and ['b','h','g','z']
  2. find 'h' and then call itself with the arguments ['z'] and ['g', 'z']
  3. find 'z' and return True up the chain
Engineero
  • 12,340
  • 5
  • 53
  • 75
2

This will return True if things is somehwere in order in seq in O(n) worst case:

def check_for_things(things,seq):
    k = things[:]
    for c in seq:
        if c == k[0]:
            k.pop(0)
        if not k:
            break

    return not k  # if you popped all from k you are done

print( check_for_things(list("ahz"), list('abhgz')))
print( check_for_things(list("ahiz"), list('abhgz')))

Output:

True  
False 

It will also produce True for list("ahz"), list('abhhhhhhhhhhgz') - the superflous h are ignored.


A more optimized solution (suggested by @Aran-Frey) would be using deque - popping elements anywhere then the end of a list is costly because the whole remaining list data is shifted over by 1 - deque can popleft (and popright) in O(1):

def better_check_for_things(things,seq):
    k = deque(things)
    for c in seq:
        if c == k[0]:
            k.popleft()
        if not k:
            break

    return not k  # if you popped all from k you are done

print( better_check_for_things(list("ahz"), list('abhgz')))
print( better_check_for_things(list("ahiz"), list('abhgz')))
Patrick Artner
  • 50,409
  • 9
  • 43
  • 69
-1

A somewhat intuitive way to do it, but not very Pythonic:

l=['a','b','h','g','z']

def check_for_a_h_z(seq):
    if 'a' in seq:
        ind=seq.index('a')
        if 'h' in seq[ind:]:
            ind2=seq.index('h')
            if 'z' in seq[ind2:]:
                return True
            else:
                return False
        else:
            return False
    else:
        return False


check_for_a_h_z(l)
Juan C
  • 5,846
  • 2
  • 17
  • 51