1

I came across this code , that matches the * wildcard against a string. * can be considered as 0 or more characters.

def samePattern(main,pattern):
    if pattern=="*" or main==pattern:
        return True
    if main=="":
        return False
    if main[0:1]==pattern[0:1]:
        return samePattern(main[1:],pattern[1:])

    if pattern[0:1]=="*":
        return samePattern(main[1:],pattern) or samePattern(main,pattern[1:])

    return False

While I think I understood the base cases, I am not understanding how the line

if pattern[0:1]=="*":
        return samePattern(main[1:],pattern) or samePattern(main,pattern[1:])

is working.

Could anybody explain how it is working ?

3 Answers3

1

if pattern[0:1]=="*":

Above statement means when character of pattern variable at 0 index is '*' than condition is true and after that it run below conditional statement

return samePattern(main[1:],pattern) or samePattern(main,pattern[1:])

This statement call "samePattern" function in recursively and pass parameters as(value of main variable starting from index of 1 to n-1 characters, pattern variable)

0

'*' matches 0 or more characters. The last if statement says: if pattern is of the form '*' + p, then return True if either:

  • everything after the first character of main matches pattern (so the '*' at the head of pattern will potentially match more characters from the head of main), or
  • p, the tail of pattern, matches main (we're done with the '*')
BrianO
  • 1,496
  • 9
  • 12
  • 2
    `[0]` fails on an empty string where the slice is just empty, maybe that's why the original author used it. – jonrsharpe Oct 02 '15 at 05:48
  • @jonrsharpe Yep, you're right. I deleted the hasty falsehood. *However*, it looks like one simple change allows replacing `[0:1]` with `[0]`: change the second `if` to `if main == '' or pattern == '': return False`. If that line is reached, the previous `if` failed, so `main` doesn't equal `pattern`; and presumably we want to return False if `main` is nonempty but `pattern` is. With that change, if the first two `if`s fail, then both `main` and `pattern` are nonempty. – BrianO Oct 02 '15 at 06:01
0

The important bit is where the wildcard is :

  • test*
  • te*ing
  • *ing

In the first case, with the wildcard at the end, testing and test*:

def samePattern(main,pattern):
    if pattern=="*" or main==pattern:                  #  3.
        return True                                    
    if main=="":                                       #  4.
        return False                                   
   if main[0:1]==pattern[0:1]:                         #  1.
       return samePattern(main[1:],pattern[1:])        #

    if pattern[0:1]=="*":
        return samePattern(main[1:],pattern) or samePattern(main,pattern[1:])
                            ^                       ^
                            | # 2.                  | # 5.

    return False

Section 1. walks along both text and pattern for test until the pattern gets to the wildcard and they aren't the same character anymore.

Section 2. triggers, and shifts only the text along one character and tries again.

Section 3. hits, the pattern is a single '*' with nothing else, and it returns True.

Section 2. comes back out with the True value, Python's short-circuit evaluation doesn't test the other side of the or because it doesn't need to, and the whole thing collapses to True successful match.


The second case, with the wildcard in the middle starts off the same way. testing and te*ing:

Section 1. walks both text and pattern, until the pattern gets to the wildcard.

Now the text and pattern are sting and *ing, which is the same as the third case - wildcard at the front.


The third case with the wildcard at the front, e.g. testing and *ing:

Section 2. triggers because the pattern starts with a *. (It skips section 3. because it's not only a star). Section 2 moves one text character and tries again with the same pattern.

The same happens over and over until Section 4. triggers and the text has run out. Now this returns False.

Now the first False has come back from the left part of the or test, the right part is tried for the first time. This shifts the pattern past the wildcard, *ing becomes ing.

But we're still deep in the recursion, at the last character of the text - g and ing, which returns False from Section 1.

Both sides of the or test at this depth returned False so the whole line does.

This comes back up a level, and the other side of that level's or test is tried. ng and ing. False from Section 1.

Both sides of the or test at this depth returned False so the whole line does.

This comes back up a level, and the other side of that level's or test is tried. ing and ing. True from Section 3 and main==pattern.

This returns True, the or returns True and the whole thing collapses in a pile of Truth.


All the parts work together, but roughly speaking Section 1. matches text with text, Sections 2 and 3 match ending wildcards, and Sections 4 and 5 match beginning wildcards, and they overlap which takes care of wildcards-in-the-middle.


Adding some printf debugging helped me see what was going on:

first

def samePattern(main,pattern):
    print "main", main, "pattern", pattern

[..]

then

    if pattern[0:1]=="*":
        x = samePattern(main[1:],pattern)
        print "x", x
        return x or samePattern(main,pattern[1:])
Community
  • 1
  • 1
TessellatingHeckler
  • 27,511
  • 4
  • 48
  • 87