6

I want to make a function which checks a string for occurrences of other strings within them.
However, the sub-strings which are being checked may be interrupted within the main string by other letters.

For instance:

a = 'abcde'
b = 'ace'
c = 'acb'

The function in question should return as b being in a, but not c.

I've tried set(a). intersection(set(b)) already, and my problem with that is that it returns c as being in a.

Prophet
  • 32,350
  • 22
  • 54
  • 79
Anon Not4Chan
  • 61
  • 1
  • 2
  • These type of strings are called [subsequences](http://en.wikipedia.org/wiki/Subsequence) of the longer string. – Lazer Sep 11 '10 at 12:59
  • This question is a special case of http://stackoverflow.com/questions/6877249/find-the-number-of-occurrences-of-a-subsequence-in-a-string The solutions there are much more efficient for solving this case as well. – Andrew Apr 06 '14 at 08:42

3 Answers3

11

You can turn your expected sequence into a regex:

import re

def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    pat = ".*".join(s1)
    if re.search(pat, s2):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    return bool(re.search(".*".join(s1), s2))

a = 'abcde' 
b = 'ace' 
c = 'acb'

assert sequence_in(b, a)
assert not sequence_in(c, a)

"ace" gets turned into the regex "a.*c.*e", which finds those three characters in sequence, with possible intervening characters.

Ned Batchelder
  • 364,293
  • 75
  • 561
  • 662
5

how about something like this...

def issubstr(substr, mystr, start_index=0):
    try:
        for letter in substr:
            start_index = mystr.index(letter, start_index) + 1
        return True
    except: return False

or...

def issubstr(substr, mystr, start_index=0):
    for letter in substr:
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0: return False
    return True
Lord British
  • 405
  • 3
  • 10
3
def issubstr(s1, s2):
    return "".join(x for x in s2 if x in  s1) == s1

>>> issubstr('ace', 'abcde')
True

>>> issubstr('acb', 'abcde')
False
killown
  • 4,497
  • 3
  • 25
  • 29