comparing two strings and their index seeing if all of string 2 is in string 1 with same order but different letter in between

Question

Assume all letters in the given words are lowercase. You may use string methods if any are useful.

>>> occurs_within('aaacaabxyt', 'cat')
True

>>> occurs_within('tac', 'cat')
False

>>> occurs_within('oboe', 'bob')
False

>>> occurs_within('ecxbtalt', 'exalt')
True

>>> occurs_within('ecxbtal', 'exalt')
False

if len(word1) > len(word2):
    for i in range(0,len(word1)):
        if word2[i] < word2[i+1]:
            return True
        else:
            return False
else:
    return False

I tried using a nested for loop inside an if function comparing the two strings and looping to find if the next index of string 2 that is with in string 1 is greater than that of the previous index, but it turns out my result are all True somehow, and I cant figure out what seem to be the problem.

Does this answer your question? [How to test if one string is a subsequence of another?](https://stackoverflow.com/questions/24017363/how-to-test-if-one-string-is-a-subsequence-of-another) — Kelly Bundy, Feb 14 '20 at 21:19

score 2 · Answer 1 · answered Feb 14 '20 at 21:25

2

Walrus practice :-)

def occurs_within(s, t):
    i = 0
    return all((i := s.find(c, i) + 1) for c in t)

answered Feb 14 '20 at 21:25

Kelly Bundy

23,480
7
29
65

Wouldn't it **always** return `True`? I mean, str.find returns -1 on failure… – Błotosmętek Feb 14 '20 at 21:31
@Błotosmętek No, it returns `False` for example for `occurs_within('ab', 'ba')`. Not sure how you (and the downvoter?) misunderstand the code. – Kelly Bundy Feb 14 '20 at 21:33
@HeapOverflow Not sure why this is downvoted. But this problem can be solved in `O(n)` time. – Ch3steR Feb 14 '20 at 21:47
@Ch3steR Not sure why you're saying it like that. My solution is O(n). And probably quite fast, as it lets the string search itself (i.e., fast C code). – Kelly Bundy Feb 14 '20 at 21:49
@HeapOverflow OK, I got it - `-1 + 1` is `False` :-) I guess I am not yet familiar enough with the walrus. – Błotosmętek Feb 14 '20 at 21:50
@Błotosmętek Hmm, I was kinda hoping you were the downvoter and would take it back after realizing the mistake. Not sure how my solution isn't useful. – Kelly Bundy Feb 14 '20 at 21:52
@HeapOverflow `s.find` iterates through `s` until `i` in every iteration right? – Ch3steR Feb 14 '20 at 21:52
@Ch3steR No, not *until* `i` but *from* `i`. – Kelly Bundy Feb 14 '20 at 21:53
@HeapOverflow nope, wasn't me - I don't downvote unless I am absolutely certain the answer is wrong. – Błotosmętek Feb 14 '20 at 21:53
@Błotosmętek Your answer and heap's answer uses the same logic. But he incorporated it with walrus assignment. – Ch3steR Feb 14 '20 at 21:56
@HeapOverflow In worst case say `string='aaabbcaat'` and `seq='uvw'`. `.find` searches the whole `string` in every iteration right? Once take a look at my answer. Not telling to upvote my answer. – Ch3steR Feb 14 '20 at 22:00
@Ch3steR Did a little benchmark, for the given examples our two solutions were about equally fast, but for `'a'*1000 + 'foo', 'foo'`, mine is about 100 times faster. – Kelly Bundy Feb 14 '20 at 22:00
@Ch3steR yes, I know. What misled me was adding 1 to the return value of find immediately, so that the resulting value is 0 if find failed, – Błotosmętek Feb 14 '20 at 22:01
@Ch3steR No, in that example it would only search `u`. Since that fails, `all` will stop right away. – Kelly Bundy Feb 14 '20 at 22:02
@HeapOverflow At this point I'm confused. Did the use of walrus assignment make your solution faster? – Ch3steR Feb 14 '20 at 22:03
@HeapOverflow In my solution, I iterate through even if there's a mismatch. Your solution immediately breaks when there's a mismatch. – Ch3steR Feb 14 '20 at 22:05
@HeapOverflow Better solutions deserve +1. – Ch3steR Feb 14 '20 at 22:06
@Ch3steR Yes, although in my `'a'*1000 + 'foo', 'foo'` example that doesn't happen. Mine is faster than yours on that one because `str.find` is much faster than searching yourself in Python one char at a time :-) – Kelly Bundy Feb 14 '20 at 22:08
@HeapOverflow Agreed. Hope downvoters retracts the downvote. Maybe the downvoter did not understand your code or misinterpreted it. – Ch3steR Feb 14 '20 at 22:13
@HeapOverflow How can we rewrite this code in python-3.7 or less? – Ch3steR Feb 14 '20 at 22:16
@Ch3steR Yeah, apparently it's easy to misunderstand :-P. At least the algorithm, and maybe also the usage of the walrus. I've written the exact same algorithm at least once before, so for me it's very familiar and this was literally just walrus practice. – Kelly Bundy Feb 14 '20 at 22:17
@Ch3steR Previously I wrote it with a for-loop. Like Błotosmętek's but with the immediate `+ 1`. I guess `itertools.accumulate would work, too. – Kelly Bundy Feb 14 '20 at 22:20
@Ch3steR Meh... I managed to write an `accumulate` solution, but it's ugly and it uses another 3.8 feature (the `initial` argument): `all(islice(accumulate(t, lambda i, c: s.find(c, i) + 1, initial=0), 1, None))`. – Kelly Bundy Feb 14 '20 at 22:31

score 1 · Answer 2 · answered Feb 14 '20 at 21:15

To find if a word word is - for a lack of better word - "embedded" in a string s you need to find the position of the first letter of the word in the string, then the second letter in the remainder of the string (i.e. past the position of the first letter), etc. If at any stage you fail to find the next letter, you return False. This code does it:

def occurs_within(s, word):
    pos = 0
    for letter in word:
        i = s.find(letter, pos)
        if i == -1:
            return False
        pos = i+1
    return True

jmunsch · Answer 3 · 2020-02-14T21:33:52.510

Let's "rubber duck":

# if word 1 is longer than word two
if len(word1) > len(word2):
   # lets `enumerate()` or look at each index of word 1 , 
   # and see if it is in word 2
    for i in range(0,len(word1)):
        # if the letter of word2 at index "i" is less than
        # the letter of word2 at "i + 1" then return true
        #############
        # notice you're trying to compare using a relational operator "<"
        # which would be 
        # "a" < "a" == False 
        # "b" < "a" == False
        # "a" < "b" == True
        # for example:
        #     'oboe', 'bob'
        #     (w1 letter, index, w2 letter)
        #     o, 0, b < o == True
        #     b, 1, o < b == False
        #     o, 2, b < None
        #     e, 3, None < None
        # this doesn't work because word2 isn't alphabetical.
        # and the len() of word2 in these examples are not longer than     
        # word1. For example ["t", "a", "c"][2] and ["c", "a", "t"][2+1] 
        if word2[i] < word2[i+1]:
            # end the function by returning true
            # the loop will not continue to check the rest of the word
            return True
        else:
            return False
# word 1 is not longer than word 2
else:
    return False

see:

Why are your comments comparing letters from w1 and w2 at the same index when the actual code compares letters of only w2 and at different indices? And why are you discussing more than one comparison when there's a `return` after the first already? — Kelly Bundy, Feb 14 '20 at 21:30
@HeapOverflow thanks updated. see: `for i in range(0,len(word1)):` w1 index is being used to index w2. — jmunsch, Feb 14 '20 at 21:34
Ah, I misread. Saw the `None`s now. What you call "w2 letter" is not really a w2 letter but a comparison, and it looked like a comparison of `word2[i]` and `word1[i]` to me. Doesn't help that the chosen example does make it look like that for the first two iterations :-) — Kelly Bundy, Feb 14 '20 at 22:44

malmiteria · Answer 4 · 2020-02-14T21:16:57.863

0

you can use regex matching for that.

import re

def occurs_within(word1, word2):
    #build regex, basically inserting '.*' around any character in word2. In a regex, '.*' means any character any amount of time (even 0)
    regex = '.*{}.*'.format('.*'.join(list(word2)))

    #matching
    return bool(re.match(regex, word1))

edited Feb 14 '20 at 21:16

answered Feb 14 '20 at 21:11

malmiteria

119
2
8

score 0 · Answer 5 · answered Feb 14 '20 at 21:17

You can iterate over the first word letter by letter, if you find that the current letter is the one in your second string keep it and go on.

def occurs_within(str1, str2):
    index = 0
    if len(str2) > len(str1):
        return False
    if str1 == str2:
        return True
    else:
        word = []
        index = 0
        for letter in str1:
            if index == len(str2):
                break
            if letter == str2[index]:
                index += 1
                word.append(letter)
        print(word)
        return ''.join(word) == str2

Ch3steR · Answer 6 · 2020-02-14T21:41:28.137

0

You can try this. Take O(n) time.

def func(a,b):
     a1=len(a)
     b1=len(b)
     i,j=0,0
     while(i<a1 and j<b1):
         if a[i]==b[j]:
             j=j+1
         i=i+1
     return j==b1

>>> func('aaacaabxyt', 'cat')
True
>>> func('tac', 'cat')
False
>>> func('oboe', 'bob')
False
>>> func('ecxbtalt', 'exalt')
True
>>> func('ecxbtal', 'exalt')
False

edited Feb 14 '20 at 21:41

answered Feb 14 '20 at 21:32

Ch3steR

20,090
4
28
58

score 0 · Answer 7 · answered Feb 14 '20 at 22:24

I have tested it with plenty examples of words and sub-words and this works for every one of them.

def occurs_within(word, sub_word):
    letter_at = -1
    for letter in sub_word:
        temp = letter_at
        letter_at = word.find(letter, temp + 1)
        if 0 < letter_at <= temp or letter_at == -1:
            return False
    return True

Basically the logic behind is that, in order for the function to returns True, you need to loop over sub_word and test, for each letter, if that letter exists in word and the index of the letter in that string (word) must be greater than the last letter of sub_word tested.

comparing two strings and their index seeing if all of string 2 is in string 1 with same order but different letter in between

7 Answers7